CN117939003A - Abnormal number identification method and device, electronic equipment and storage medium - Google Patents

Abnormal number identification method and device, electronic equipment and storage medium

Info

Publication number
CN117939003A
Authority
CN
China
Prior art keywords
abnormal
target
training
recognition model
sample set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410154128.7A
Other languages
Chinese (zh)
Inventor
成雪腾
刘杰
常福慧
杨宁
张依诺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd
Priority to CN202410154128.7A
Publication of CN117939003A
Legal status: Pending

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The present disclosure provides an abnormal number identification method and apparatus, an electronic device, and a storage medium. The method includes: acquiring a number to be identified; acquiring call data of the number to be identified within a preset time period; and analyzing the call data based on a trained target recognition model to obtain a target anomaly recognition result for the number to be identified. The target recognition model is trained on a training sample set, with the training target being whether a number is an abnormal number and/or whether a number has been suspended and not reactivated. The training sample set is obtained from number samples that have accessed an abnormal broadband account and from call data samples of those number samples, where an abnormal broadband account is a broadband account that has accessed a target abnormal application. Because the training sample set is obtained from broadband-related information and training uses different training targets, recognition coverage and accuracy are improved and the likelihood of missed detection is reduced.

Description

Abnormal number identification method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the field of computer technology, and in particular, to a method and device for identifying an abnormal number, an electronic device and a storage medium.
Background
At present, a user may need to identify and confirm the identity of a communication counterpart in the process of using terminal equipment to communicate, so as to ensure safety.
Disclosure of Invention
The disclosure provides an abnormal number identification method, an abnormal number identification device, electronic equipment and a storage medium, so as to solve the problem of missed detection caused by low abnormal number detection coverage in the prior art.
In a first aspect, the present disclosure provides an abnormal number identification method, including:
Acquiring a number to be identified;
acquiring call data of the number to be identified in a preset time period;
and analyzing the call data based on a trained target recognition model to obtain a target anomaly recognition result for the number to be identified, wherein the target recognition model is trained on a training sample set with the training target being whether a number is an abnormal number and/or whether a number is a suspended and not reactivated number, the training sample set is obtained from number samples that have accessed an abnormal broadband account and from call data samples of the number samples, and the abnormal broadband account is a broadband account that has accessed a target abnormal application.
In a possible embodiment, the obtaining the number to be identified includes:
Acquiring first broadband behavior data;
screening out, according to the first broadband behavior data, a target broadband account that has accessed the target abnormal application;
and acquiring the numbers that have accessed the target broadband account, and determining the number to be identified from the acquired numbers.
In a possible embodiment, the target recognition model includes a first recognition model and a second recognition model, and the analyzing the call data based on the trained target recognition model to obtain the target anomaly recognition result for the number to be identified includes:
analyzing the call data based on the first recognition model to obtain a first anomaly recognition result for the number to be identified, wherein the first recognition model is trained with the training target being whether a number is an abnormal number;
analyzing the call data based on the second recognition model to obtain a second anomaly recognition result for the number to be identified, wherein the second recognition model is trained with the training target being whether a number is an abnormal number or a suspended and not reactivated number;
and determining the target anomaly recognition result for the number to be identified according to the first anomaly recognition result and the second anomaly recognition result.
In a possible embodiment, the determining the target anomaly recognition result for the number to be identified according to the first anomaly recognition result and the second anomaly recognition result includes:
determining that the number to be identified is an abnormal number when the first anomaly recognition result and the second anomaly recognition result both indicate that the number to be identified is an abnormal number; or
determining that the number to be identified is an abnormal number when either the first anomaly recognition result or the second anomaly recognition result indicates that the number to be identified is an abnormal number.
In a possible embodiment, the training manner of the target recognition model is:
acquiring a training sample set, wherein the training sample set includes a plurality of training samples, and each training sample includes a call data sample of a number sample and a label indicating whether the number sample is an abnormal number or a suspended and not reactivated number;
and performing feature extraction on the call data samples in the training sample set to obtain call feature information, and training the target recognition model according to the call feature information and the corresponding labels indicating whether each number sample is an abnormal number or a suspended and not reactivated number, to obtain the trained target recognition model.
In a possible embodiment, the method further comprises:
acquiring a training sample set, wherein the training sample set includes a plurality of training samples, and each training sample includes a call data sample of a number sample and a label indicating whether the number sample is an abnormal number or a suspended and not reactivated number;
performing feature extraction on the call data samples in the training sample set to obtain first call feature information, and training a first recognition model according to the first call feature information and the corresponding labels indicating whether each number sample is an abnormal number, to obtain a trained first recognition model;
and performing feature extraction on the call data samples in the training sample set to obtain second call feature information, and training a second recognition model according to the second call feature information and the corresponding labels indicating whether each number sample is an abnormal number or a suspended and not reactivated number, to obtain a trained second recognition model.
In a possible embodiment, the acquiring a training sample set includes:
Acquiring second broadband behavior data;
screening out, according to the second broadband behavior data, an abnormal broadband account that has accessed the target abnormal application;
and acquiring number samples that accessed the abnormal broadband account within a first time period, acquiring call data samples of the number samples within a second time period, and acquiring a label indicating whether each number sample is an abnormal number or a suspended and not reactivated number.
In a second aspect, the present disclosure provides an abnormal number recognition apparatus, including:
the first acquisition module is used for acquiring the number to be identified;
the second acquisition module is used for acquiring call data of the number to be identified in a preset time period;
the identification module is used for analyzing the call data based on a trained target recognition model to obtain a target anomaly recognition result for the number to be identified, wherein the target recognition model is trained on a training sample set with the training target being whether a number is an abnormal number and/or whether a number is a suspended and not reactivated number, the training sample set is obtained from number samples that have accessed an abnormal broadband account and from call data samples of the number samples, and the abnormal broadband account is a broadband account that has accessed a target abnormal application.
In a third aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores one or more computer programs executable by the at least one processor, one or more of the computer programs being executable by the at least one processor to enable the at least one processor to perform the anomaly number identification method described above.
In a fourth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the above-described abnormal number identification method.
In a fifth aspect, the present application provides a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when run in a processor of an electronic device, performs the above-described method of identifying an abnormal number.
In the embodiments of the disclosure, a number to be identified is acquired; call data of the number to be identified within a preset time period is acquired; and the call data is analyzed based on a trained target recognition model to obtain a target anomaly recognition result for the number to be identified. The target recognition model is trained on a training sample set with the training target being whether a number is an abnormal number and/or whether a number is a suspended and not reactivated number. The training sample set is obtained from number samples that have accessed an abnormal broadband account and from call data samples of the number samples, and the abnormal broadband account is a broadband account that has accessed a target abnormal application. Because the training sample set is obtained from broadband-related information and training uses different training targets, the recognition coverage and accuracy of the target recognition model are improved and the likelihood of missed detection is reduced.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
Fig. 1 is a flowchart of an abnormal number identification method provided in an embodiment of the present disclosure;
Fig. 2 is a flowchart of a training method of a target recognition model in an embodiment of the present disclosure;
Fig. 3 is a block diagram of an abnormal number recognition apparatus according to an embodiment of the present disclosure;
Fig. 4 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Terms such as "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In the technical solution of the present application, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information comply with the relevant laws and regulations and do not violate public order and good morals. The use of user data in this technical solution complies with the relevant national laws and regulations (for example, the personal information security specification among the information security technology standards). For example: access to personal information is controlled by the prescribed measures; the display of personal information is restricted as required; personal information is not used beyond the scope of direct or reasonable association; and when personal information is used, clear identity-pointing attributes are removed to avoid precisely locating a specific individual.
At present, a user may need to identify and confirm the identity of the other party when communicating through a terminal device, in order to ensure safety. For example, some people use a mobile phone to access a homemade application (APP) and use the homemade APP to disguise themselves behind a virtual number when calling ordinary users, so that the ordinary user's caller ID cannot display the real mobile phone number.
In view of this, the embodiments of the disclosure provide an abnormal number identification method. A training sample set is obtained from number samples that have accessed an abnormal broadband account and from call data samples of those number samples. A target recognition model is then trained on the training sample set, with the training target being whether a number is an abnormal number and/or whether a number is a suspended and not reactivated number. For a number to be identified, call data within a preset time period is acquired, and the trained target recognition model produces a target anomaly recognition result for that number. Because the training sample set is determined from broadband behavior data and the model is trained against both training targets, the target recognition model covers more abnormal-number scenarios, which raises recognition coverage, reduces the likelihood of missed detection, and improves recognition accuracy.
To facilitate understanding of the embodiments of the present disclosure, the abnormal number identification method disclosed herein is first described in detail. The execution body of the abnormal number identification method is generally an electronic device with a certain computing capability, for example a terminal device or a server. The terminal device may be a vehicle-mounted device, a User Equipment (UE), a mobile device, a Personal Digital Assistant (PDA), a handheld device, a computing device, a wearable device, or the like. The server may be an independent physical server, a server cluster composed of a plurality of physical servers, or a cloud server capable of cloud computing. In some possible implementations, the abnormal number identification method may be implemented by a processor invoking computer readable instructions stored in a memory.
The abnormal number identification method provided in the embodiment of the present disclosure will be described below by taking an execution subject as a server as an example.
Fig. 1 is a flowchart of an abnormal number identification method according to an embodiment of the present disclosure, as shown in fig. 1, where the method includes:
S101: acquiring a number to be identified.
In the embodiments of the present disclosure, the abnormal number identification method may be applied to a batch detection scenario, for example detecting each number in an acquired batch of numbers to be identified at regular intervals, or it may be run on demand at any time. The number to be identified may be any telephone number or mobile phone number that requires anomaly detection, which is not limited in the embodiments of the present disclosure.
In a possible implementation, step S101 of acquiring the number to be identified includes: acquiring first broadband behavior data; screening out, according to the first broadband behavior data, a target broadband account that has accessed the target abnormal application; and acquiring the numbers that have accessed the target broadband account, and determining the number to be identified from the acquired numbers.
In a scenario where a homemade APP is used to place calls under an abnormal identity, the homemade APP may be reached through a broadband access network. The homemade APPs can therefore be collected in advance to obtain a plurality of target abnormal applications. The first broadband behavior data is then used to screen out target broadband accounts that have accessed a target abnormal application; these can be regarded as suspected abnormal broadband accounts, and the numbers that have accessed them can be regarded as suspected abnormal numbers.
In this way, suspected abnormal numbers are first screened out based on the first broadband behavior data and then passed to the target recognition model for identification. This improves identification efficiency and allows suspected abnormal numbers to be screened and identified promptly, which improves safety.
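The screening step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `BroadbandRecord` fields, the function name `screen_candidate_numbers`, and the sample identifiers are all hypothetical, since the disclosure does not specify the format of the broadband behavior data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BroadbandRecord:
    """One row of broadband behavior data (hypothetical schema)."""
    account: str       # broadband account identifier
    phone_number: str  # number that connected through the account
    app_id: str        # application the traffic was attributed to


def screen_candidate_numbers(records, target_abnormal_apps):
    """Return the numbers that accessed any account which reached a target abnormal app."""
    # Step 1: accounts with at least one access to a target abnormal application.
    suspect_accounts = {r.account for r in records if r.app_id in target_abnormal_apps}
    # Step 2: every number that connected through one of those accounts.
    return {r.phone_number for r in records if r.account in suspect_accounts}


records = [
    BroadbandRecord("acct-1", "13800000001", "app-x"),
    BroadbandRecord("acct-1", "13800000002", "video"),
    BroadbandRecord("acct-2", "13800000003", "video"),
]
candidates = screen_candidate_numbers(records, {"app-x"})
```

Note that the second number is included even though its own traffic did not touch the abnormal application, because it shares the suspect account; whether that is the intended behavior depends on how "numbers that have accessed the target broadband account" is interpreted.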
S102: acquiring call data of the number to be identified within a preset time period.
The preset time period may be set according to experience and the actual situation. For example, if the preset time period is the last seven days, the call data of the number to be identified within the last seven days is acquired. The call data includes, for example, the number of calls, the calling party, the called party, the call duration, geographical information of the calling and called parties, and the call time.
Further, after the call data is acquired, it may be preprocessed before being input to the target recognition model, which improves the processing efficiency and accuracy of model recognition. Specifically, preprocessing includes operations such as data cleaning and feature extraction: for example, deleting records that are erroneous, empty, or of unknown meaning, and then analyzing the data to extract the call features required by the target recognition model. Such features include, for example, the proportion of calls in which the number to be identified is the calling party, the location distribution of the called parties it calls, the proportion of calls whose called party is not a friend, and the call duration, which are not limited in the embodiments of the present disclosure.
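A minimal sketch of this preprocessing follows, assuming each call record is a dictionary with `caller`, `callee`, `duration`, and `callee_region` keys and that a set of known contacts stands in for "friends". These field names and the function `extract_call_features` are assumptions made for illustration only.

```python
from collections import Counter


def extract_call_features(number, calls, contacts):
    """Clean raw call records and derive the example features for one number."""
    # Data cleaning: drop records with missing caller, callee, or duration.
    clean = [
        c for c in calls
        if c.get("caller") and c.get("callee") and c.get("duration") is not None
    ]
    outgoing = [c for c in clean if c["caller"] == number]
    total, n_out = len(clean), len(outgoing)
    return {
        "call_count": total,
        # Share of calls in which the number is the calling party.
        "outgoing_ratio": n_out / total if total else 0.0,
        # Share of outgoing calls whose called party is not a known contact.
        "non_contact_ratio": (
            sum(c["callee"] not in contacts for c in outgoing) / n_out if n_out else 0.0
        ),
        "mean_duration": sum(c["duration"] for c in clean) / total if total else 0.0,
        # Location distribution of called parties.
        "callee_regions": dict(Counter(c.get("callee_region", "unknown") for c in outgoing)),
    }


calls = [
    {"caller": "A", "callee": "B", "duration": 30, "callee_region": "Beijing"},
    {"caller": "A", "callee": "C", "duration": 10, "callee_region": "Shanghai"},
    {"caller": "D", "callee": "A", "duration": 20, "callee_region": "Beijing"},
    {"caller": "A", "callee": None, "duration": 5},  # removed by cleaning
]
features = extract_call_features("A", calls, contacts={"B"})
```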
S103: analyzing the call data based on a trained target recognition model to obtain a target anomaly recognition result for the number to be identified, wherein the target recognition model is trained on a training sample set with the training target being whether a number is an abnormal number and/or whether a number is a suspended and not reactivated number, the training sample set is obtained from number samples that have accessed an abnormal broadband account and from call data samples of the number samples, and the abnormal broadband account is a broadband account that has accessed a target abnormal application.
In the embodiments of the disclosure, the target recognition model may be trained in advance. In one possible implementation, only a single target recognition model is trained, and the target anomaly recognition result for the number to be identified is obtained from that model. The target anomaly recognition result is, for example, a category such as normal number or abnormal number.
In another possible implementation, to further improve recognition accuracy and coverage, different models are trained against different training targets. Specifically, the target recognition model includes a first recognition model and a second recognition model, and step S103 of analyzing the call data based on the trained target recognition model to obtain the target anomaly recognition result for the number to be identified includes:
1) Analyzing the call data based on the first recognition model to obtain a first anomaly recognition result for the number to be identified, wherein the first recognition model is trained with the training target being whether a number is an abnormal number.
The first recognition model may be trained on samples labeled as normal numbers and samples labeled as abnormal numbers; that is, the samples labeled as normal numbers form the positive training sample set and the samples labeled as abnormal numbers form the negative training sample set.
2) Analyzing the call data based on the second recognition model to obtain a second anomaly recognition result for the number to be identified, wherein the second recognition model is trained with the training target being whether a number is an abnormal number or a suspended and not reactivated number.
In the embodiments of the disclosure, in the abnormal identity recognition scenario, a suspended and not reactivated number can be regarded as an abnormal number caused by an abnormal identity. A relevant department or institution may suspend a number because of some anomaly, and that anomaly can have various causes, such as arrears, missing real-name authentication, or confirmation that the number belongs to an abnormal identity. When the suspension is not caused by an abnormal identity, the user usually confirms and resolves the issue so that the number is reactivated. When the suspension is caused by an abnormal identity, the user usually does not reactivate the number. During training, therefore, samples labeled as normal numbers form the positive training sample set, and samples labeled as abnormal numbers together with samples labeled as suspended and not reactivated numbers form the negative training sample set, yielding the trained second recognition model.
The trained second recognition model can therefore recognize not only whether a number is an abnormal number but also whether it is a suspended and not reactivated number, so that more abnormal numbers are screened out and a wider range of scenarios is covered.
3) Determining the target anomaly recognition result for the number to be identified according to the first anomaly recognition result and the second anomaly recognition result.
Several possible implementations are provided for determining the target anomaly recognition result from the first anomaly recognition result and the second anomaly recognition result:
In one possible implementation, the number to be identified is determined to be an abnormal number when the first anomaly recognition result and the second anomaly recognition result both indicate that it is an abnormal number.
In another possible implementation, the number to be identified is determined to be an abnormal number when either the first anomaly recognition result or the second anomaly recognition result indicates that it is an abnormal number.
The target anomaly recognition result may also be obtained in other ways, which are not limited in the embodiments of the disclosure. For example, weights may be set for the outputs of the first recognition model and the second recognition model. If the first recognition model outputs probability values for the normal and abnormal categories, and the second recognition model outputs probability values for the normal, abnormal, and suspended and not reactivated categories, a combined probability is computed for each category using the weights of the two models, and the number category with the highest combined probability is taken as the target anomaly recognition result.
That is, the first recognition model and the second recognition model are trained against different training targets. In actual use, the anomaly recognition results of the two models can be combined for the final decision, which further improves accuracy, or the result of either model alone can be used, which preserves recognition accuracy while further raising coverage so that as many abnormal numbers as possible are recognized.
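The three ways of combining the two models' results can be sketched as below. The function names, category labels, and default weights are illustrative assumptions; the disclosure does not fix how the weights are chosen.

```python
def fuse_and(first_abnormal, second_abnormal):
    """Abnormal only if both models say so (favors precision)."""
    return first_abnormal and second_abnormal


def fuse_or(first_abnormal, second_abnormal):
    """Abnormal if either model says so (favors coverage)."""
    return first_abnormal or second_abnormal


def fuse_weighted(first_probs, second_probs, w_first=0.5, w_second=0.5):
    """Weighted sum of per-category probabilities; returns (best category, all scores)."""
    categories = sorted(set(first_probs) | set(second_probs))
    combined = {
        c: w_first * first_probs.get(c, 0.0) + w_second * second_probs.get(c, 0.0)
        for c in categories
    }
    return max(categories, key=combined.get), combined


# The first model scores two categories, the second model scores three.
first = {"normal": 0.4, "abnormal": 0.6}
second = {"normal": 0.3, "abnormal": 0.2, "suspended_not_reactivated": 0.5}
label, scores = fuse_weighted(first, second)
```

A category that only one model scores, such as the suspended category here, receives zero from the other model, so equal weights bias against it. Whether to renormalize in that case is a design choice the source leaves open.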
Further, to improve safety, a possible implementation is also provided in which, when the number to be identified is determined to be an abnormal number, an alarm is raised or the number is suspended. The handling policy for abnormal numbers is not limited; for example, the called parties that the abnormal number has called may be screened out and sent a reminder.
In the embodiments of the disclosure, the training sample set is obtained from number samples that have accessed an abnormal broadband account and from their call data samples, and the target recognition model is trained on that set with the training target being whether a number is an abnormal number and/or whether a number is a suspended and not reactivated number. The call data of the number to be identified is then analyzed with the target recognition model to obtain the target anomaly recognition result. Determining the training sample set from broadband behavior data widens the scenarios covered by anomaly recognition, and training against both the abnormal-number target and the suspended-and-not-reactivated target further improves the coverage and accuracy of abnormal number recognition.
Based on the foregoing embodiments, a possible implementation manner is provided for the training manner of the target recognition model in the embodiments of the present disclosure, and specifically, a training process of the target recognition model in the embodiments of the present disclosure is described below.
In the embodiment of the present disclosure, a possible implementation manner is provided for a training manner of a target recognition model, including:
S1: Acquire a training sample set. The training sample set includes a plurality of training samples, and each training sample includes a call data sample of a number sample together with a label indicating whether the number sample is an abnormal number or a number that has been shut down and not reactivated.
In the embodiment of the present disclosure, the training sample set may be divided into a positive training sample set and a negative training sample set. Each positive training sample includes a call data sample of a number sample labeled as normal, and each negative training sample includes a call data sample of a number sample labeled as an abnormal number or as a shut-down, not-reactivated number. If the training sample set contains relatively few negative training samples, their weight may be increased, or data augmentation with various transformations may be applied to expand them. This improves the balance between positive and negative training samples and, in turn, the generalization ability of the target recognition model.
S2: For the training sample set, perform feature extraction on the call data samples to obtain call feature information, and train the target recognition model on the call feature information and the corresponding labels (whether the number is an abnormal number or a shut-down, not-reactivated number) to obtain the trained target recognition model.
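The weighting of scarce negative training samples mentioned under S1 can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the inverse-frequency formula and the name `balanced_sample_weights` are assumptions, and the encoding (0 for normal, 1 for abnormal or shut-down and not reactivated) is chosen only for the example.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Give each sample a weight so that every class carries the same total weight."""
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    # Inverse-frequency weighting: rarer classes receive larger per-sample weights.
    return [total / (n_classes * counts[y]) for y in labels]


# Hypothetical labels: six normal samples and two abnormal samples.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_sample_weights(labels)

weight_normal = sum(w for w, y in zip(weights, labels) if y == 0)
weight_abnormal = sum(w for w, y in zip(weights, labels) if y == 1)
```

With these weights the two classes contribute equally to a weighted training loss, which is one common way to realize "increasing the weight" of the minority class.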
The target recognition model may be trained with a decision tree algorithm, although the embodiment of the present disclosure is not limited to this. The basic principle of a decision tree algorithm is to induce a set of classification rules from the training sample set, producing readable rules and a decision tree, and then to apply those decision rules to new data.
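To make the decision tree principle concrete, the sketch below induces threshold rules from labeled feature vectors and applies them to new data. It is a generic Gini-impurity tree written for illustration only; the disclosure does not specify the splitting criterion, and the two features (calls per day, mean call duration in seconds) and all sample values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Find the (feature, threshold) pair with the lowest weighted Gini impurity."""
    best = None  # (score, feature_index, threshold)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively build a tree; leaves are class labels, inner nodes are dicts."""
    majority = int(sum(labels) * 2 >= len(labels))
    if depth == max_depth or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the learned rules from the root down to a leaf label."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree


# Hypothetical features: [calls per day, mean call duration in seconds].
rows = [[5, 120], [8, 90], [3, 200], [120, 15], [200, 10], [150, 20]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = abnormal number
tree = build_tree(rows, labels)
```

The resulting tree is a nested dictionary, so the learned rule (here a single threshold on calls per day) stays readable, which matches the "readable rules" property noted above.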
In this way, in the embodiment of the present disclosure, the target recognition model is trained with the training target being whether a number is an abnormal number or a shut-down, not-reactivated number, which improves the accuracy and coverage of the target recognition model and reduces the likelihood of missed detections.
Further, when the target recognition model includes a first recognition model and a second recognition model, a corresponding possible training process is also provided, which specifically includes:
1) Acquire a training sample set, where the training sample set includes a plurality of training samples, and each training sample includes a call data sample of a number sample and a label indicating whether the number sample is an abnormal number or a shut-down, not-reactivated number.
2) Based on the training sample set, perform feature extraction on the call data samples to obtain first call feature information, and train the first recognition model on the first call feature information and the corresponding labels of whether the number is an abnormal number, to obtain a trained first recognition model.
3) Based on the training sample set, perform feature extraction on the call data samples to obtain second call feature information, and train the second recognition model on the second call feature information and the corresponding labels of whether the number is an abnormal number or a shut-down, not-reactivated number, to obtain a trained second recognition model.
The first recognition model and the second recognition model may have the same network structure or different network structures, which is not limited in the embodiment of the present disclosure.
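The difference between the two training targets in steps 2) and 3) comes down to how the label of each training sample is derived. A minimal sketch, assuming each sample carries two hypothetical boolean fields, `is_abnormal` and `shut_down_not_reactivated`:

```python
def first_model_label(sample):
    """Training target of the first model: is the number an abnormal number?"""
    return int(sample["is_abnormal"])


def second_model_label(sample):
    """Training target of the second model: abnormal OR shut down and not reactivated."""
    return int(sample["is_abnormal"] or sample["shut_down_not_reactivated"])


# Hypothetical samples covering three of the possible label combinations.
samples = [
    {"number": "N1", "is_abnormal": True, "shut_down_not_reactivated": False},
    {"number": "N2", "is_abnormal": False, "shut_down_not_reactivated": True},
    {"number": "N3", "is_abnormal": False, "shut_down_not_reactivated": False},
]
first_labels = [first_model_label(s) for s in samples]
second_labels = [second_model_label(s) for s in samples]
```

The second target is a superset of the first, so the second model sees more samples of the class to be detected, which is consistent with the stated aim of reducing missed detections.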
In the embodiment of the present disclosure, a possible implementation is also provided for obtaining the training sample set, which includes: acquiring second broadband behavior data; screening out, according to the second broadband behavior data, abnormal broadband accounts that access the target abnormal application; and obtaining number samples that accessed the abnormal broadband accounts in a first time period, obtaining call data samples of the number samples in a second time period, and obtaining labels indicating whether each number sample is an abnormal number or a shut-down, not-reactivated number.
For example, the second broadband behavior data is acquired at a fixed interval, such as every day. From that data, the abnormal broadband accounts that accessed the target abnormal application on the current day are screened out. Number samples that accessed those abnormal broadband accounts within a first time period, such as 2 months, are then screened out, and for each number sample the call data sample within a second time period, such as 7 days, is obtained together with the label indicating whether it is an abnormal number or a shut-down, not-reactivated number.
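The two time windows in this example can be sketched as simple date filters. The data layouts, the function names, and the use of 60 days to stand in for "2 months" are assumptions made for illustration.

```python
from datetime import date, timedelta


def numbers_accessing(access_log, abnormal_accounts, end, days):
    """Numbers that accessed any abnormal broadband account within `days` before `end`."""
    start = end - timedelta(days=days)
    return {
        number
        for number, account, day in access_log
        if account in abnormal_accounts and start <= day <= end
    }


def calls_in_window(call_records, numbers, end, days):
    """Call records of the given numbers within `days` before `end`, grouped by caller."""
    start = end - timedelta(days=days)
    grouped = {number: [] for number in numbers}
    for record in call_records:
        if record["caller"] in grouped and start <= record["day"] <= end:
            grouped[record["caller"]].append(record)
    return grouped


today = date(2024, 2, 1)
# Hypothetical access log entries: (number, broadband account, access day).
access_log = [
    ("N1", "acct-A", date(2024, 1, 10)),
    ("N2", "acct-B", date(2024, 1, 20)),
    ("N3", "acct-A", date(2023, 10, 1)),
]
call_records = [
    {"caller": "N1", "day": date(2024, 1, 30)},
    {"caller": "N1", "day": date(2024, 1, 2)},
]

number_samples = numbers_accessing(access_log, {"acct-A"}, today, days=60)  # first time period
call_samples = calls_in_window(call_records, number_samples, today, days=7)  # second time period
```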
The number sample may be obtained based on a history identification record, or may be obtained from a related authority, which is not limited in the embodiment of the present disclosure.
Further, in the embodiment of the present disclosure, the call data samples may also be preprocessed, for example by data cleaning and feature extraction, to extract the call feature information required for training the model and thereby improve the accuracy of model training.
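A minimal sketch of such preprocessing is shown below. The disclosure does not enumerate the call features at this point, so the four features, the 10-second "short call" threshold, and the record fields `callee` and `duration` are all hypothetical choices for illustration.

```python
def extract_features(records):
    """Clean raw call records, then summarize them as a small feature dictionary."""
    # Data cleaning: drop records with a missing callee or an invalid duration.
    clean = [
        r for r in records
        if r.get("callee")
        and isinstance(r.get("duration"), (int, float))
        and r["duration"] >= 0
    ]
    n = len(clean)
    if n == 0:
        return {"call_count": 0, "distinct_callees": 0,
                "mean_duration": 0.0, "short_call_ratio": 0.0}
    durations = [r["duration"] for r in clean]
    return {
        "call_count": n,
        "distinct_callees": len({r["callee"] for r in clean}),
        "mean_duration": sum(durations) / n,
        # Share of calls shorter than 10 seconds (threshold is an assumption).
        "short_call_ratio": sum(d < 10 for d in durations) / n,
    }


# Hypothetical raw records, three of which are invalid and get cleaned out.
raw = [
    {"callee": "A", "duration": 5},
    {"callee": "B", "duration": 15},
    {"callee": "A", "duration": None},
    {"callee": "", "duration": 30},
    {"callee": "C", "duration": -1},
]
features = extract_features(raw)
```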
In addition, in the embodiment of the present disclosure, the acquired call data samples of the number samples and the labels of the number samples can be further divided into a training sample set and a test sample set. For example, when number samples that accessed the abnormal broadband accounts over 2 months are acquired, the number samples from the first month may be used as the training sample set and those from the second month as the test sample set. After the model is trained on the training sample set, it is tested on the test sample set, so that the model's ability to recognize and predict future numbers can be verified.
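The chronological split described here can be sketched as follows, assuming each sample records the day on which the number accessed the abnormal broadband account; the field name `access_day` and the cutoff date are hypothetical.

```python
from datetime import date


def split_by_access_day(samples, cutoff):
    """Earlier samples become the training set; samples on or after `cutoff` become the test set."""
    train = [s for s in samples if s["access_day"] < cutoff]
    test = [s for s in samples if s["access_day"] >= cutoff]
    return train, test


# Hypothetical samples spanning two months.
samples = [
    {"number": "N1", "access_day": date(2023, 12, 5)},
    {"number": "N2", "access_day": date(2023, 12, 28)},
    {"number": "N3", "access_day": date(2024, 1, 3)},
    {"number": "N4", "access_day": date(2024, 1, 25)},
]
train_set, test_set = split_by_access_day(samples, cutoff=date(2024, 1, 1))
```

Splitting by time rather than at random keeps later data out of training, so the test score reflects how the model would behave on numbers it has not yet seen.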
In this way, in the embodiment of the present disclosure, abnormal broadband accounts that use the target abnormal application can be screened out based on the second broadband behavior data, and suspected abnormal number samples can then be screened out from them. Abnormal numbers that use self-built apps can thus be screened out effectively, and training the model on the call data samples of the screened number samples improves accuracy and coverage.
The training process is described below for a specific application scenario, taking as an example the case in which the target recognition model includes a first recognition model and a second recognition model. Fig. 2 is a flowchart of a training method for the target recognition model in an embodiment of the present disclosure. The method includes:
S201: Acquire second broadband behavior data and screen out abnormal broadband accounts that access the target abnormal application.
S202: Obtain number samples that accessed the abnormal broadband accounts in the first time period, obtain call data samples of the number samples in the second time period, and obtain labels indicating whether each number sample is an abnormal number or a shut-down, not-reactivated number.
S203: Preprocess the data.
S204: Divide the data into a training sample set and a test sample set.
S205: Based on the training sample set, train the first recognition model with "is an abnormal number" as the training target.
S206: Based on the training sample set, train the second recognition model with "is an abnormal number or a shut-down, not-reactivated number" as the training target.
For example, the first recognition model and the second recognition model are both decision tree models, and the first decision tree rule of the first recognition model and the second decision tree rule of the second recognition model can be obtained through training.
In this way, training on the composite training target can improve the accuracy of the second recognition model and can also reduce the missed-detection rate.
S207: Obtain an abnormal recognition result for the training sample set according to the first recognition model and the second recognition model.
S208: Determine whether the recognition accuracy of the abnormal recognition result reaches the target accuracy; if so, execute step S209; otherwise, return to step S205.
In the embodiment of the present disclosure, the training samples in the training sample set are recognized by the first recognition model and the second recognition model to generate the abnormal recognition result of the training sample set, and the accuracy is evaluated against the labels of the training samples. If the target accuracy is reached, the trained models are considered to meet the requirement and the subsequent test operation can be performed; otherwise, the first recognition model and the second recognition model need to be trained again.
S209: the first recognition model and the second recognition model are tested based on the test sample set.
S210: Evaluate and analyze the test result.
For example, the abnormal recognition result for each test sample in the test result is compared with the corresponding label to evaluate the test accuracy, and thereby the recognition performance of the first recognition model and the second recognition model. Further, when the training sample set and the test sample set are divided, number samples that accessed the abnormal broadband account earlier may be used as the training sample set and those that accessed it later as the test sample set. Testing on the test sample set then evaluates how well the model recognizes future data, and so indicates its accuracy in subsequent practical use.
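The accuracy check in S208 and the evaluation in S210 can be sketched as below. The metric definitions are the standard ones; the target accuracy of 0.9 and the sample predictions are hypothetical values chosen for the example, since the disclosure does not state them.

```python
def accuracy(predictions, labels):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


def miss_rate(predictions, labels):
    """Fraction of truly abnormal samples (label 1) that were predicted as normal."""
    on_abnormal = [p for p, t in zip(predictions, labels) if t == 1]
    if not on_abnormal:
        return 0.0
    return sum(p == 0 for p in on_abnormal) / len(on_abnormal)


TARGET_ACCURACY = 0.9  # assumed threshold, not given in the source

predictions = [1, 0, 1, 0, 0]
labels = [1, 0, 1, 1, 0]

acc = accuracy(predictions, labels)
missed = miss_rate(predictions, labels)
needs_retraining = acc < TARGET_ACCURACY  # corresponds to returning to S205
```

Reporting the miss rate alongside accuracy is useful here because the stated goal is to reduce missed detections, which overall accuracy can hide when abnormal numbers are rare.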
In the embodiment of the present disclosure, abnormal broadband accounts are screened out based on the second broadband behavior data, and number samples that historically accessed those accounts are treated as suspected abnormal numbers, so suspected abnormal numbers can be screened out quickly and efficiency is improved. Call data samples of the number samples within a certain time period are then obtained, and the model is trained twice with different training targets, which improves the accuracy and coverage of model recognition. In addition, the abnormal number identification method in the embodiment of the present disclosure may be applied to fraud number prevention scenarios; the specific application scenario is not limited.
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from their principles and logic; for reasons of space, these combinations are not described in detail in the present disclosure. It will be appreciated by those skilled in the art that, in the above-described methods of the embodiments, the particular order of execution of the steps should be determined by their function and possible inherent logic.
In addition, the present disclosure further provides an abnormal number identification device, an electronic device, and a computer-readable storage medium, each of which may be used to implement any of the abnormal number identification methods provided in the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding descriptions in the method section; they are not repeated here.
Fig. 3 is a block diagram of an abnormal number recognition apparatus according to an embodiment of the present disclosure. Referring to fig. 3, an embodiment of the present disclosure provides an abnormal number recognition apparatus, including:
A first obtaining module 31, configured to obtain a number to be identified;
A second obtaining module 32, configured to obtain call data of the number to be identified within a preset time period;
An identifying module 33, configured to analyze the call data based on a trained target recognition model to obtain a target abnormal recognition result of the number to be identified, where the target recognition model is obtained by training on a training sample set with the training target being whether a number is an abnormal number and/or whether it is a shut-down, not-reactivated number; the training sample set is obtained from number samples that accessed an abnormal broadband account and the call data samples of those number samples; and the abnormal broadband account is a broadband account that accesses a target abnormal application.
In a possible embodiment, when the number to be identified is obtained, the first obtaining module 31 is configured to:
acquire first broadband behavior data;
screen out, according to the first broadband behavior data, a target broadband account that accesses the target abnormal application; and
acquire the numbers that access the target broadband account, and determine the number to be identified from the acquired numbers.
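The three operations of the first obtaining module can be sketched as two small lookups. The record layout of the broadband behavior data, the mapping from accounts to numbers, and the application name `fraud-dialer` are hypothetical placeholders.

```python
def target_accounts(behavior_records, target_apps):
    """Broadband accounts whose behavior data shows access to a target abnormal application."""
    return {r["account"] for r in behavior_records if r["app"] in target_apps}


def numbers_to_identify(account_to_numbers, accounts):
    """Numbers that accessed any of the given broadband accounts, de-duplicated and sorted."""
    return sorted({n for a in accounts for n in account_to_numbers.get(a, [])})


# Hypothetical first broadband behavior data.
behavior = [
    {"account": "acct-A", "app": "fraud-dialer"},
    {"account": "acct-B", "app": "video-player"},
    {"account": "acct-C", "app": "fraud-dialer"},
]
# Hypothetical mapping from broadband account to the numbers that accessed it.
account_to_numbers = {"acct-A": ["N1", "N2"], "acct-B": ["N3"], "acct-C": ["N2", "N4"]}

accounts = target_accounts(behavior, {"fraud-dialer"})
candidates = numbers_to_identify(account_to_numbers, accounts)
```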
In a possible embodiment, the target recognition model includes a first recognition model and a second recognition model, and when the call data is analyzed based on the trained target recognition model to obtain the target abnormal recognition result of the number to be recognized, the recognition module 33 is configured to:
analyze the call data based on the first recognition model to obtain a first abnormal recognition result of the number to be identified, where the first recognition model is obtained by training with "is an abnormal number" as the training target;
analyze the call data based on the second recognition model to obtain a second abnormal recognition result of the number to be identified, where the second recognition model is obtained by training with "is an abnormal number or a shut-down, not-reactivated number" as the training target; and
determine a target abnormal recognition result of the number to be identified according to the first abnormal recognition result and the second abnormal recognition result.
In a possible embodiment, when determining the target anomaly identification result of the number to be identified according to the first anomaly identification result and the second anomaly identification result, the identification module 33 is configured to:
determine that the number to be identified is an abnormal number when the first abnormal recognition result and the second abnormal recognition result both indicate that the number to be identified is an abnormal number; or,
determine that the number to be identified is an abnormal number when either the first abnormal recognition result or the second abnormal recognition result indicates that the number to be identified is an abnormal number.
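The two alternative combination rules above amount to a logical AND and a logical OR over the two recognition results. A minimal sketch, in which the `mode` parameter and its values `"both"` and `"either"` are names invented for the example:

```python
def combine_results(first_is_abnormal, second_is_abnormal, mode="either"):
    """Combine the two models' results into the target abnormal recognition result."""
    if mode == "both":
        # Stricter rule: both models must flag the number.
        return first_is_abnormal and second_is_abnormal
    if mode == "either":
        # Broader rule: one flagging model is enough.
        return first_is_abnormal or second_is_abnormal
    raise ValueError(f"unknown mode: {mode!r}")


strict = combine_results(True, False, mode="both")
broad = combine_results(True, False, mode="either")
```

Requiring both results favors fewer false alarms, while accepting either favors fewer missed detections; the text leaves the choice between the two open.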
In a possible embodiment, the apparatus further includes a training module 34, which is configured to train the target recognition model as follows:
acquire a training sample set, where the training sample set includes a plurality of training samples, and each training sample includes a call data sample of a number sample and a label indicating whether the number sample is an abnormal number or a shut-down, not-reactivated number; and
for the training sample set, perform feature extraction on the call data samples to obtain call feature information, and train the target recognition model on the call feature information and the corresponding labels of whether the number is an abnormal number or a shut-down, not-reactivated number, to obtain the trained target recognition model.
In one possible embodiment, training module 34 is further configured to:
acquire a training sample set, where the training sample set includes a plurality of training samples, and each training sample includes a call data sample of a number sample and a label indicating whether the number sample is an abnormal number or a shut-down, not-reactivated number;
based on the training sample set, perform feature extraction on the call data samples to obtain first call feature information, and train the first recognition model on the first call feature information and the corresponding labels of whether the number is an abnormal number, to obtain a trained first recognition model; and
based on the training sample set, perform feature extraction on the call data samples to obtain second call feature information, and train the second recognition model on the second call feature information and the corresponding labels of whether the number is an abnormal number or a shut-down, not-reactivated number, to obtain a trained second recognition model.
In a possible embodiment, the training module 34 is configured to, when acquiring the training sample set:
acquire second broadband behavior data;
screen out, according to the second broadband behavior data, abnormal broadband accounts that access the target abnormal application; and
obtain number samples that accessed the abnormal broadband accounts in the first time period, obtain call data samples of the number samples in the second time period, and obtain labels indicating whether each number sample is an abnormal number or a shut-down, not-reactivated number.
The modules in the abnormal number recognition apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
Fig. 4 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Referring to fig. 4, an embodiment of the present disclosure provides an electronic device including: at least one processor 701; at least one memory 702, and one or more I/O interfaces 703 connected between the processor 701 and the memory 702; wherein the memory 702 stores one or more computer programs executable by the at least one processor 701, the one or more computer programs being executable by the at least one processor 701 to enable the at least one processor 701 to perform the anomaly number recognition method described above.
The modules in the electronic device described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
The disclosed embodiments also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the above-described abnormal number identification method. The computer readable storage medium may be a volatile or nonvolatile computer readable storage medium.
The disclosed embodiments also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when executed in a processor of an electronic device, performs the above-described abnormal number identification method.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable program instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), static random access memory (SRAM), flash memory or other memory technology, portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
The computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), with state information of computer readable program instructions, which can execute the computer readable program instructions.
The computer program product described herein may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (10)

1. An abnormal number recognition method, comprising:
acquiring a number to be identified;
acquiring call data of the number to be identified in a preset time period; and
analyzing the call data based on a trained target recognition model to obtain a target abnormal recognition result of the number to be identified, wherein the target recognition model is obtained by training on a training sample set with the training target being whether a number is an abnormal number and/or whether it is a shut-down, not-reactivated number, the training sample set is obtained from number samples that accessed an abnormal broadband account and call data samples of the number samples, and the abnormal broadband account is a broadband account that accesses a target abnormal application.
2. The method of claim 1, wherein the obtaining the number to be identified comprises:
Acquiring first broadband behavior data;
screening out a target broadband account number for accessing the target abnormal application according to the first broadband behavior data;
And acquiring the number accessed to the target broadband account, and determining the number to be identified according to the acquired number accessed to the target broadband account.
3. The method according to claim 1, wherein the target recognition model includes a first recognition model and a second recognition model, and the analyzing the call data based on the trained target recognition model to obtain the target abnormality recognition result of the number to be recognized includes:
analyzing the call data based on the first recognition model to obtain a first abnormal recognition result of the number to be identified, wherein the first recognition model is obtained by training with "is an abnormal number" as the training target;
analyzing the call data based on the second recognition model to obtain a second abnormal recognition result of the number to be identified, wherein the second recognition model is obtained by training with "is an abnormal number or a shut-down, not-reactivated number" as the training target; and
determining a target abnormal recognition result of the number to be identified according to the first abnormal recognition result and the second abnormal recognition result.
4. A method according to claim 3, wherein said determining the target anomaly identification result of the number to be identified based on the first anomaly identification result and the second anomaly identification result comprises:
Determining that the number to be identified is an abnormal number when both the first abnormal recognition result and the second abnormal recognition result indicate that the number to be identified is an abnormal number; or,
determining that the number to be identified is an abnormal number when either the first abnormal recognition result or the second abnormal recognition result indicates that the number to be identified is an abnormal number.
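The two alternatives in claim 4 amount to a logical AND or a logical OR over the two model outputs. A minimal sketch, assuming each model's result has already been reduced to a boolean; the function name `combine_results` and the `mode` parameter are hypothetical.

```python
def combine_results(first_abnormal: bool, second_abnormal: bool, mode: str = "or") -> bool:
    """Fuse the first and second abnormal recognition results into the target result.

    mode="and": abnormal only if both models flag the number.
    mode="or":  abnormal if either model flags the number.
    """
    if mode == "and":
        return first_abnormal and second_abnormal
    if mode == "or":
        return first_abnormal or second_abnormal
    raise ValueError(f"unknown mode: {mode!r}")


strict = combine_results(True, False, mode="and")   # both models must agree
lenient = combine_results(True, False, mode="or")   # one model is enough
```

The AND rule favors precision, while the OR rule favors coverage, which is the direction the abstract emphasizes when it mentions reducing missed detections.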
5. The method according to any one of claims 1-4, wherein the target recognition model is trained as follows:
Obtaining a training sample set, wherein the training sample set comprises a plurality of training samples, and each training sample comprises a call data sample of a number sample and a label indicating whether the number sample is an abnormal number or a number that is shut down and not reactivated;
And performing feature extraction on the call data samples in the training sample set to obtain call feature information, and training the target recognition model according to the call feature information and the corresponding labels indicating whether each number sample is an abnormal number or a number that is shut down and not reactivated, so as to obtain the trained target recognition model.
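The training flow of claim 5 (extract call features, then fit a model against the labels) can be sketched as below. The claims do not name a model type or a feature set, so everything here is an assumption for illustration: the three features, the call-record keys `duration_s` and `outgoing`, the use of a small logistic regression trained by stochastic gradient descent, and the toy samples.

```python
import math


def extract_features(calls):
    """Turn a call data sample into call feature information (hypothetical features)."""
    n = len(calls)
    if n == 0:
        return [0.0, 0.0, 0.0]
    mean_duration_min = sum(c["duration_s"] for c in calls) / n / 60.0
    outgoing_ratio = sum(1 for c in calls if c["outgoing"]) / n
    # Call count is scaled so that all features have a similar magnitude.
    return [n / 100.0, mean_duration_min, outgoing_ratio]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=500, lr=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(model, x):
    """Return True when the model scores the feature vector as abnormal."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x))) >= 0.5


def make_calls(count, duration_s, all_outgoing):
    """Build a toy call data sample."""
    return [{"duration_s": duration_s, "outgoing": all_outgoing or i % 2 == 0}
            for i in range(count)]


# Toy training set: label 1 = abnormal or shut down and not reactivated, 0 = normal.
training_set = [
    (make_calls(40, 10, True), 1),
    (make_calls(60, 15, True), 1),
    (make_calls(6, 180, False), 0),
    (make_calls(10, 120, False), 0),
]
X = [extract_features(calls) for calls, _ in training_set]
y = [label for _, label in training_set]
model = train(X, y)
predictions = [predict(model, x) for x in X]
```

A production system would more likely use an established library and a held-out validation set; the hand-written trainer is used only to keep the sketch free of dependencies.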
6. A method according to claim 3, characterized in that the method further comprises:
Obtaining a training sample set, wherein the training sample set comprises a plurality of training samples, and each training sample comprises a call data sample of a number sample and a label indicating whether the number sample is an abnormal number or a number that is shut down and not reactivated;
Performing feature extraction on the call data samples in the training sample set to obtain first call feature information, and training a first recognition model according to the first call feature information and the corresponding labels indicating whether each number sample is an abnormal number, so as to obtain a trained first recognition model;
And performing feature extraction on the call data samples in the training sample set to obtain second call feature information, and training a second recognition model according to the second call feature information and the corresponding labels indicating whether each number sample is an abnormal number or a number that is shut down and not reactivated, so as to obtain a trained second recognition model.
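In claim 6 the two recognition models share one training sample set and differ in their training targets. The sketch below shows only that label derivation, under the assumption that each sample records the two facts separately; the `NumberSample` fields and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberSample:
    """Label information for one number sample (hypothetical schema)."""
    number: str
    is_abnormal: bool                   # confirmed abnormal number
    is_shut_down_not_reactivated: bool  # shut down and never reactivated


def first_model_label(sample: NumberSample) -> int:
    """Training target of the first recognition model: abnormal number only."""
    return int(sample.is_abnormal)


def second_model_label(sample: NumberSample) -> int:
    """Training target of the second model: abnormal, or shut down and not reactivated."""
    return int(sample.is_abnormal or sample.is_shut_down_not_reactivated)


samples = [
    NumberSample("13000000001", True, False),
    NumberSample("13000000002", False, True),
    NumberSample("13000000003", False, False),
]
first_labels = [first_model_label(s) for s in samples]
second_labels = [second_model_label(s) for s in samples]
```

The second target marks a superset of the positives of the first, so the second model sees more positive examples drawn from the same call data.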
7. The method of claim 5, wherein the acquiring a training sample set comprises:
Acquiring second broadband behavior data;
screening out an abnormal broadband account that accesses the target abnormal application according to the second broadband behavior data;
And acquiring number samples that access the abnormal broadband account in a first time period, acquiring call data samples of the number samples in a second time period, and acquiring labels indicating whether each number sample is an abnormal number or a number that is shut down and not reactivated.
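The sample-set construction in claim 7 uses two time windows: one for account access and one for call data. A minimal sketch under stated assumptions: the tuple layouts of `access_log` and `call_log`, the `labels` mapping, the inclusive date windows, and the choice to skip numbers without a label are all hypothetical.

```python
from datetime import date


def in_period(day, period):
    """Return True if `day` falls inside the inclusive (start, end) period."""
    start, end = period
    return start <= day <= end


def build_training_samples(access_log, call_log, labels, abnormal_accounts,
                           first_period, second_period):
    """Assemble (number, calls, label) training samples.

    access_log: iterable of (account, number, day) tuples
    call_log:   iterable of (number, day, duration_s) tuples
    labels:     dict mapping number -> 0/1 label
    """
    # Number samples: numbers that accessed an abnormal account in the first period.
    numbers = {number for account, number, day in access_log
               if account in abnormal_accounts and in_period(day, first_period)}
    samples = []
    for number in sorted(numbers):
        if number not in labels:
            continue  # assumption: numbers without a known label are skipped
        # Call data sample: this number's calls inside the second period.
        calls = [c for c in call_log if c[0] == number and in_period(c[1], second_period)]
        samples.append({"number": number, "calls": calls, "label": labels[number]})
    return samples


access_log = [
    ("acct-1", "13000000001", date(2024, 1, 5)),
    ("acct-1", "13000000002", date(2023, 12, 1)),  # outside the first period
    ("acct-2", "13000000003", date(2024, 1, 6)),   # not an abnormal account
]
call_log = [
    ("13000000001", date(2024, 1, 10), 12),
    ("13000000001", date(2024, 3, 1), 30),         # outside the second period
]
training_samples = build_training_samples(
    access_log, call_log, labels={"13000000001": 1},
    abnormal_accounts={"acct-1"},
    first_period=(date(2024, 1, 1), date(2024, 1, 31)),
    second_period=(date(2024, 1, 1), date(2024, 2, 1)),
)
```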
8. An abnormal number recognition apparatus, comprising:
the first acquisition module is used for acquiring the number to be identified;
the second acquisition module is used for acquiring call data of the number to be identified in a preset time period;
The identification module is used for analyzing the call data based on a trained target recognition model to obtain a target abnormal recognition result of the number to be identified, wherein the target recognition model is obtained by training on a training sample set with whether a number is an abnormal number and/or whether the number is shut down and not reactivated as the training target, the training sample set is obtained according to number samples that access an abnormal broadband account and call data samples of the number samples, and the abnormal broadband account represents a broadband account that accesses a target abnormal application.
9. An electronic device, comprising:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the abnormal number identification method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the abnormal number identification method according to any one of claims 1 to 7.
CN202410154128.7A 2024-02-02 2024-02-02 Abnormal number identification method and device, electronic equipment and storage medium Pending CN117939003A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410154128.7A CN117939003A (en) 2024-02-02 2024-02-02 Abnormal number identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410154128.7A CN117939003A (en) 2024-02-02 2024-02-02 Abnormal number identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117939003A (en) 2024-04-26

Family

ID=90770133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410154128.7A Pending CN117939003A (en) 2024-02-02 2024-02-02 Abnormal number identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117939003A (en)

Similar Documents

Publication Publication Date Title
CN109241711B (en) User behavior identification method and device based on prediction model
CN107566358B (en) Risk early warning prompting method, device, medium and equipment
CN110110093A (en) A kind of recognition methods, device, electronic equipment and the storage medium of knowledge based map
US20180255097A1 (en) Method and device for application information risk management
US10861025B2 (en) Systems and methods of photo-based fraud protection
CN108876188B (en) Inter-connected service provider risk assessment method and device
CN107886414B (en) Order combination method and equipment and computer storage medium
CN114840853B (en) Digital business analysis method based on big data and cloud server
CN110827033A (en) Information processing method and device and electronic equipment
CN110362999A (en) Abnormal method and device is used for detecting account
CN112631888A (en) Fault prediction method and device of distributed system, storage medium and electronic equipment
CN112819611A (en) Fraud identification method, device, electronic equipment and computer-readable storage medium
CN111931189A (en) API interface transfer risk detection method and device and API service system
CN112307464A (en) Fraud identification method and device and electronic equipment
CN109815697A (en) Wrong report behavior processing method and processing device
CN116707965A (en) Threat detection method and device, storage medium and electronic equipment
CN111105064B (en) Method and device for determining suspicion information of fraud event
CN114445088A (en) Method and device for judging fraudulent conduct, electronic equipment and storage medium
CN113918949A (en) Recognition method of fraud APP based on multi-mode fusion
CN110727576B (en) Web page testing method, device, equipment and storage medium
CN112613974A (en) Risk early warning method, device, equipment and readable storage medium
CN117939003A (en) Abnormal number identification method and device, electronic equipment and storage medium
CN116610503A (en) Component detection method and device
US20200322331A1 (en) Methods and systems of authenticating of personal communications
CN114944950B (en) Real name authentication method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination