CN110401779A - Method, apparatus and computer-readable storage medium for identifying telephone numbers - Google Patents

Method, apparatus and computer-readable storage medium for identifying telephone numbers

Info

Publication number
CN110401779A
Authority
CN
China
Prior art keywords
user
attribute
telephone number
behavioral data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810372550.4A
Other languages
Chinese (zh)
Other versions
CN110401779B (en)
Inventor
贺小红
庄仁峰
胡文辉
叶天宽
黄鹤羽
何亚玲
卓彩霞
黄浩
曹阳
潘锦彬
陈德志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Internet Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Internet Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Internet Co Ltd
Priority to CN201810372550.4A
Publication of CN110401779A
Application granted
Publication of CN110401779B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/66 Substation equipment, e.g. for use by subscribers with means for preventing unauthorised or fraudulent calling
    • H04M1/663 Preventing unauthorised calls to a telephone set
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Arrangements for supervision, monitoring or testing
    • H04M3/2281 Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/12 Detection or prevention of fraud

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Technology Law (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)

Abstract

An embodiment of the invention discloses a method for identifying telephone numbers. The method comprises: obtaining user call behavior data and the attribute information of all telephone numbers in the user call behavior data; performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in it to obtain a harassing call model; performing data analysis on the user call behavior data and the attribute information of all telephone numbers in it to obtain an attribute selection model; and performing attribute recognition, according to the harassing call model and the attribute selection model, on each telephone number that initiates a new call request to obtain its attribute. The method applies machine learning model training and data analysis separately to several kinds of data to obtain a more reliable harassing call model and attribute selection model, and then uses both models to recognize the attribute of each telephone number, which further improves the recognition accuracy for each telephone number.

Description

Method, apparatus and computer-readable storage medium for identifying telephone numbers
Technical field
The present invention relates to the field of mobile communication technology, and in particular to a method, an apparatus and a computer-readable storage medium for identifying telephone numbers.
Background
Many websites now require users to provide a mobile phone number when registering, as do many purchases made away from home, so the likelihood that a user's mobile phone number is leaked to criminals has greatly increased, and almost every user has received harassing calls for advertising or fraud. To help users identify harassing calls in advance, some existing cloud computing platforms apply big data processing and machine learning model training to the call behavior data (for example, call records) of all telephone numbers, obtain a machine learning model whose classification parameters are harassing call features, and use this model to identify any given telephone number. In addition, most mobile phones now provide a marking function: once any user marks a calling number as a harassing call on a mobile phone, a harassing-call label is displayed on the called party's phone screen whenever that number later initiates a call request to any mobile phone number, alerting the called user.
The prior art either extracts harassing call features from communication behavior data and builds a machine learning model from those features to identify harassing calls, or identifies harassing calls from the marks that mobile phone users place on telephone numbers. Both approaches rely on a single source of data, which makes it difficult to identify harassing calls accurately. For example, when harassing calls are identified only by a machine learning model built from harassing call features, recognition accuracy depends entirely on the accuracy of that model, and high-frequency non-harassing numbers, such as food delivery, express delivery and taxi numbers, are often misidentified as high-frequency harassing numbers. When harassing calls are identified only from user marks on telephone numbers, some users mark numbers maliciously, so recognition accuracy also needs further improvement.
Summary of the invention
The main object of the present invention is to provide a method, an apparatus and a computer-readable storage medium for identifying telephone numbers, intended to solve the problem in existing telephone number identification methods that the basis for identification is not sufficiently reliable, which reduces telephone number recognition accuracy.
The technical solution of the present invention is realized as follows:
An embodiment of the present invention provides a method for identifying telephone numbers, the method comprising:
obtaining user call behavior data and the attribute information of all telephone numbers in the user call behavior data;
performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data to obtain a harassing call model;
performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data to obtain an attribute selection model, the attribute selection model indicating the criteria for determining the attribute of a telephone number;
performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests to obtain the attribute of each telephone number that initiates a new call request.
In the above solution, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests to obtain the attribute of each telephone number that initiates a new call request comprises:
performing type prediction, according to the harassing call model, on the telephone numbers that initiate new call requests to obtain the predicted type of each such telephone number, wherein the predicted type is harassing call or non-harassing call;
performing attribute recognition, according to the attribute selection model, on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain the attribute of each telephone number whose predicted type is harassing call.
In the above solution, performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data to obtain an attribute selection model comprises:
selecting classification parameters from the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, and establishing the attribute selection model according to the classification parameters.
In the above solution, performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data to obtain a harassing call model comprises:
classifying and organizing the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data, wherein the call behavior features include at least one of the following: historical number of outgoing calls, historical outgoing call duration, historical number of incoming calls, historical incoming call duration, number of answered calls, and number of unanswered calls;
classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data, wherein the attribute feature is: harassing call, express or food delivery phone, enterprise phone, rejected phone, priority-answer phone, intermediate-number phone, or frequent-contact phone;
performing model training, using a machine learning algorithm, on the call behavior features and attribute features of each telephone number in the user call behavior data to obtain the harassing call model.
In the above solution, classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data comprises:
obtaining, according to the attribute information of all telephone numbers in the user call behavior data, N candidate attributes for each telephone number in the user call behavior data, wherein N is an integer greater than or equal to 1;
screening the N candidate attributes of each telephone number in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data.
In the above solution, the attribute information of all telephone numbers in the user call behavior data includes at least one of the following:
user mark information, enterprise authentication information, enterprise yellow pages information, phone blacklist information, and phone whitelist information.
In the above solution, the attribute selection model includes at least one of the following: an intermediate-number model and a frequent-contact phone model.
In the above solution, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests comprises:
when a preset prediction condition is met, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, wherein the preset prediction condition includes at least one of the following:
the number of telephone numbers initiating new call requests is greater than or equal to a preset quantity threshold;
the time interval between the current time and the time at which the telephone number attributes were last updated is greater than or equal to a preset time threshold.
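The two preset prediction conditions above can be sketched as a simple trigger check. This is a minimal illustration only: the function name, parameter names and default thresholds are assumptions chosen for the example, since the source does not specify concrete values.

```python
from datetime import datetime, timedelta


def should_run_recognition(pending_numbers, last_update, now,
                           count_threshold=1000,
                           interval_threshold=timedelta(hours=1)):
    """Return True when at least one preset prediction condition holds.

    count_threshold and interval_threshold are hypothetical defaults;
    the source only says that both thresholds are preset.
    """
    # Condition 1: enough telephone numbers have initiated new call requests.
    enough_numbers = len(pending_numbers) >= count_threshold
    # Condition 2: enough time has passed since attributes were last updated.
    enough_time = (now - last_update) >= interval_threshold
    return enough_numbers or enough_time


now = datetime(2018, 4, 24, 12, 0)
recent = now - timedelta(minutes=5)
print(should_run_recognition(["13800000000"], recent, now))
```

Batching recognition behind such a check avoids running both models for every single call request, at the cost of slightly stale attributes between runs.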
An embodiment of the present invention also provides an apparatus for identifying telephone numbers, the apparatus comprising a memory and a processor, wherein:
the memory is configured to store a computer program;
the processor is configured to execute the following steps when running the computer program:
obtaining user call behavior data and the attribute information of all telephone numbers in the user call behavior data;
performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data to obtain a harassing call model;
performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data to obtain an attribute selection model, the attribute selection model indicating the criteria for determining the attribute of a telephone number;
performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests to obtain the attribute of each telephone number that initiates a new call request.
In the above solution, the processor is specifically configured to execute the following steps when running the computer program:
performing type prediction, according to the harassing call model, on the telephone numbers that initiate new call requests to obtain the predicted type of each such telephone number, wherein the predicted type is harassing call or non-harassing call;
performing attribute recognition, according to the attribute selection model, on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain the attribute of each telephone number whose predicted type is harassing call.
In the above solution, the processor is specifically configured to execute the following step when running the computer program:
selecting classification parameters from the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, and establishing the attribute selection model according to the classification parameters.
In the above solution, the processor is specifically configured to execute the following steps when running the computer program:
classifying and organizing the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data, wherein the call behavior features include at least one of the following: historical number of outgoing calls, historical outgoing call duration, historical number of incoming calls, historical incoming call duration, number of answered calls, and number of unanswered calls;
classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data, wherein the attribute feature is: harassing call, express or food delivery phone, enterprise phone, rejected phone, priority-answer phone, intermediate-number phone, or frequent-contact phone;
performing model training, using a machine learning algorithm, on the call behavior features and attribute features of each telephone number in the user call behavior data to obtain the harassing call model.
In the above solution, the processor is specifically configured to execute the following steps when running the computer program:
obtaining, according to the attribute information of all telephone numbers in the user call behavior data, N candidate attributes for each telephone number in the user call behavior data, wherein N is an integer greater than or equal to 1;
screening the N candidate attributes of each telephone number in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data.
In the above solution, the attribute information of all telephone numbers in the user call behavior data includes at least one of the following:
user mark information, enterprise authentication information, enterprise yellow pages information, phone blacklist information, and phone whitelist information.
In the above solution, the attribute selection model includes at least one of the following: an intermediate-number model and a frequent-contact phone model.
In the above solution, the processor is specifically configured to execute the following steps when running the computer program:
when a preset prediction condition is met, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, wherein the preset prediction condition includes at least one of the following:
the number of telephone numbers initiating new call requests is greater than or equal to a preset quantity threshold;
the time interval between the current time and the time at which the telephone number attributes were last updated is greater than or equal to a preset time threshold.
An embodiment of the present invention also provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program;
when the computer program is executed by at least one processor, it causes the at least one processor to execute the steps of any one of the above methods for identifying telephone numbers.
An embodiment of the present invention provides a method for identifying telephone numbers: user call behavior data and the attribute information of all telephone numbers in the user call behavior data are obtained; model training is performed, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in it to obtain a harassing call model; data analysis is performed on the user call behavior data and the attribute information of all telephone numbers in it to obtain an attribute selection model, which indicates the criteria for determining the attribute of a telephone number; and attribute recognition is performed, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests to obtain the attribute of each such telephone number. In this way, the embodiment applies machine learning model training and data analysis separately to several kinds of data to obtain a more reliable harassing call model and attribute selection model, and then uses both models to recognize the attribute of each telephone number, which further improves the recognition accuracy for each telephone number.
Brief description of the drawings
Fig. 1 is a first flowchart of a method for identifying telephone numbers provided by an embodiment of the present invention;
Fig. 2 is a second flowchart of a method for identifying telephone numbers provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of an apparatus for identifying telephone numbers provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of an intelligent anti-harassment system provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention.
Embodiment one
An embodiment of the present invention provides a method for identifying telephone numbers. As shown in Fig. 1, the method comprises:
Step S101: obtain user call behavior data and the attribute information of all telephone numbers in the user call behavior data.
In actual implementation, a user may initiate a call request from a mobile phone, a landline, a VoIP phone or a pseudo base station. For every telephone number that initiates a call request, the corresponding user call behavior data can be obtained from the operator's servers, for example the user call behavior data of all telephone numbers that initiated call requests within a preset time period. The user call behavior data of each telephone number that initiates a call request may include at least one of the following: all called numbers that the telephone number has called, all calling numbers that have called the telephone number, the duration and time of each call with each telephone number, the telephone numbers of all recipients of short messages sent by the telephone number, the telephone numbers of all senders of short messages to the telephone number, the number of answered calls, the number of unanswered calls, and so on.
Then, for all telephone numbers in the user call behavior data, their attribute information is obtained by using a web crawler, by searching with a search engine, or by querying relevant databases. The attribute information may include at least one of the following: user mark information, enterprise authentication information, enterprise yellow pages information, phone blacklist information, and phone whitelist information. For example, with the web crawler approach, all telephone numbers in the user call behavior data are fed into a crawler program, which then searches search engines such as www.baidu.com and www.so.com for attribute information related to the telephone numbers to be identified, and can also collect additional user mark information from the World Wide Web.
Optionally, the obtained user call behavior data and the attribute information of all telephone numbers in it are stored in a distributed manner on a big data platform; for example, a big data platform built on the Hadoop distributed system can be used for high-speed computation and storage of massive data. Further, the distributed processing technology of the big data platform can also be used to execute steps S102 to S104.
Further, after the user call behavior data and the attribute information of all telephone numbers in it are obtained, the method may also include: classifying and organizing the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data, wherein the call behavior features may include at least one of the following: historical called number list, historical number of outgoing calls, historical outgoing call duration, historical number of incoming calls, historical incoming call duration, number of answered calls, and number of unanswered calls. For example, from one day of user call behavior data within the preset time period, the call behavior features of each telephone number can be obtained, including at least one of the following: daily cumulative called number list, daily cumulative number of outgoing calls, daily cumulative average outgoing call duration, daily cumulative peer number percentage, daily cumulative short-call percentage, daily cumulative number of incoming calls, daily cumulative average incoming call duration, daily cumulative number of peer numbers for incoming calls, daily cumulative number of roaming location changes, and so on.
The attribute information of all telephone numbers in the user call behavior data is classified and organized to obtain the attribute feature of each telephone number in the user call behavior data, wherein the attribute feature may be: harassing call, non-harassing call, express or food delivery phone, enterprise phone, rejected phone, priority-answer phone, intermediate-number phone, or frequent-contact phone.
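The classification and organization of call records into per-number call behavior features described for step S101 might look like the following sketch. The record layout (`CallRecord`) and the feature key names are assumptions made for illustration, and the answered and unanswered counts are attributed here to the calling number, which is one possible reading of the source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Hypothetical layout of one call record from the operator's data."""
    caller: str
    callee: str
    duration_s: int   # 0 when the call was not answered
    answered: bool


def new_features():
    return {"caller_count": 0, "caller_duration_s": 0,
            "called_count": 0, "called_duration_s": 0,
            "answered_count": 0, "unanswered_count": 0}


def aggregate_features(records):
    """Accumulate the six historical call behavior features per number."""
    features = defaultdict(new_features)
    for r in records:
        outgoing = features[r.caller]
        outgoing["caller_count"] += 1
        outgoing["caller_duration_s"] += r.duration_s
        # Assumption: answered/unanswered counts describe outgoing calls.
        if r.answered:
            outgoing["answered_count"] += 1
        else:
            outgoing["unanswered_count"] += 1
        incoming = features[r.callee]
        incoming["called_count"] += 1
        incoming["called_duration_s"] += r.duration_s
    return dict(features)


records = [CallRecord("A", "B", 60, True),
           CallRecord("A", "C", 0, False),
           CallRecord("B", "A", 30, True)]
features = aggregate_features(records)
print(features["A"])
```

On a real data volume the same aggregation would more plausibly run as a distributed job on the big data platform mentioned in the text; the single-process loop only shows the shape of the computation.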
Step S102: perform model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data to obtain a harassing call model.
In actual implementation, the harassing call model is established with all call behavior features as classification parameters. According to the attribute feature of each telephone number in the user call behavior data, each telephone number is labeled as a harassing call or a non-harassing call; these labels, together with the call behavior features of each telephone number as input, are used to train a model with a supervised learning algorithm, which yields the harassing call model.
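As a rough illustration of the supervised training in step S102, the sketch below fits a small logistic regression in plain Python. The source only says that a supervised learning algorithm is used, so logistic regression is a stand-in chosen for brevity, and the two features (scaled outgoing call volume and answered ratio) and the toy data are assumptions.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(samples, labels, lr=0.1, epochs=2000):
    """Fit weights and bias with stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(model, x):
    weights, bias = model
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return "harassing" if p >= 0.5 else "non-harassing"


# Toy data: [scaled outgoing call volume, answered ratio]; label 1 = harassing.
samples = [[0.9, 0.1], [0.8, 0.2], [0.95, 0.15],
           [0.05, 0.9], [0.1, 0.8], [0.02, 0.95]]
labels = [1, 1, 1, 0, 0, 0]
model = train_logistic(samples, labels)
print(predict(model, [0.85, 0.1]))
```

The labels come from the attribute features (harassing versus non-harassing), and the inputs come from the call behavior features, mirroring the division of roles described in the text.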
Step S103: perform data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data to obtain an attribute selection model, which indicates the criteria for determining the attribute of a telephone number.
In actual implementation, to improve the recognition accuracy for telephone numbers, in addition to obtaining the harassing call model through a machine learning algorithm, classification parameters can be selected from the user call behavior data and the attribute information of all telephone numbers in it, based on the typical characteristics that distinguish all telephone numbers having a given attribute feature from telephone numbers having other attribute features. An attribute selection model corresponding to that attribute feature is established from the classification parameters, which in turn yields attribute selection models corresponding to several different attribute features. The attribute selection model may include at least one of the following: an intermediate-number model, a frequent-contact phone model, and so on. Each attribute selection model can uniquely identify all telephone numbers having a given attribute feature, and the recognition results of the harassing call model are further screened by the attribute selection models to obtain more accurate recognition results.
For example, intermediate numbers are based on the principle of flexibly binding virtual secondary numbers. When an O2O (Online To Offline) trade order is generated, the O2O platform randomly assigns an intermediate number to the two parties as a temporary telephone number for their calls. The intermediate number is bound to the trade order while it is in use and is unbound and recycled after the transaction. This ensures that only the intermediate number is displayed to either party during a call, which effectively protects the real telephone numbers of both parties. For example, when telephone number A calls telephone number B, the call goes out through intermediate number C, and only intermediate number C is displayed on the terminals of telephone numbers A and B. The resulting call behavior data contains both a call from telephone number A to intermediate number C and a call from intermediate number C to telephone number B. In other words, an intermediate number has the typical characteristics that its number of outgoing calls equals its number of incoming calls and its outgoing call duration equals its incoming call duration. An intermediate-number model can therefore be established whose classification parameters are four call behavior features: number of outgoing calls, number of incoming calls, outgoing call duration and incoming call duration.
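The intermediate-number characteristic described above (outgoing count equal to incoming count, outgoing duration equal to incoming duration) can be written as a simple rule. The feature key names, the duration tolerance and the requirement that the number has any traffic at all are assumptions added for this sketch.

```python
def is_intermediate_number(features, duration_tolerance_s=0):
    """Rule-based check for the intermediate-number characteristic.

    features holds four call behavior features for one telephone number.
    duration_tolerance_s is a hypothetical allowance for rounding in
    call records; the source describes exact equality.
    """
    # Assumption: a number with no calls at all is not treated as a match.
    has_traffic = features["caller_count"] > 0
    same_count = features["caller_count"] == features["called_count"]
    same_duration = abs(features["caller_duration_s"]
                        - features["called_duration_s"]) <= duration_tolerance_s
    return has_traffic and same_count and same_duration


relay = {"caller_count": 40, "called_count": 40,
         "caller_duration_s": 3600, "called_duration_s": 3600}
dialer = {"caller_count": 500, "called_count": 2,
          "caller_duration_s": 900, "called_duration_s": 45}
print(is_intermediate_number(relay), is_intermediate_number(dialer))
```

A high-volume relay number could easily look like a harassing caller to a model that sees only outgoing call volume, which is why a rule of this kind is useful as a second check.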
Frequent contacts are determined by analyzing the call behavior data of the telephone numbers of user 1 and user 2. When the telephone number of user 3 appears in the call behavior data of both user 1 and user 2, user 3 is a common contact of user 1 and user 2. The more common contacts user 1 and user 2 share, the higher their intimacy, and the more likely they are to be frequent contacts of each other. In addition, the common contacts of user 1 and user 2 can also be obtained from their address-book friends, enterprise circle (enterprise directory), family circle (home network) and so on. For example, the called number lists of any two telephone numbers A and B in the user call behavior data are obtained. When the called number lists of A and B share i identical called numbers, the intimacy value of telephone numbers A and B equals i, where i is an integer greater than or equal to 1. When the intimacy value of telephone numbers A and B is greater than a preset intimacy threshold, A and B are the telephone numbers of two frequent contacts. A frequent-contact phone model can therefore be established whose classification parameters are the called number list and the intimacy value.
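The intimacy value described above, the number of identical called numbers shared by two called number lists, reduces to a set intersection. The function names and the example threshold are assumptions for illustration; the source only states that the threshold is preset.

```python
def intimacy(called_list_a, called_list_b):
    """Number of identical called numbers shared by two telephone numbers."""
    return len(set(called_list_a) & set(called_list_b))


def are_frequent_contacts(called_list_a, called_list_b, threshold=2):
    """True when the intimacy value is greater than the preset threshold.

    threshold=2 is a hypothetical value chosen for the example.
    """
    return intimacy(called_list_a, called_list_b) > threshold


called_a = ["1001", "1002", "1003", "1004"]
called_b = ["1002", "1003", "1004", "1005"]
print(intimacy(called_a, called_b))               # 3
print(are_frequent_contacts(called_a, called_b))  # True
```

Converting the lists to sets means repeated calls to the same number count once, which matches the notion of "identical called numbers" rather than call volume.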
It should be noted that the embodiment of the present invention does not limit the execution order of step S102 and step S103. For example, step S102 can be executed before step S103 or after step S103, or the two steps can be executed at the same time.
Step S104: perform attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests to obtain the attribute of each telephone number that initiates a new call request.
In actual implementation, according to harassing call model, type prediction is carried out to the telephone number for initiating new call request, Obtain initiating the type of prediction of the telephone number of new call request, wherein type of prediction is harassing call or non-harassing call;Phase Ying Di is the telephone number of non-harassing call to type of prediction in the telephone number for initiating new call request, determines that its attribute is Non- harassing call;It is the telephone number of harassing call to type of prediction in the telephone number for initiating new call request, according to attribute Screening model determines its attribute;
It is the phone of harassing call to type of prediction in the telephone number for initiating new call request according to attribute selection model Number carries out Attribute Recognition, obtains the attribute for the telephone number that the type of prediction is harassing call;Illustratively, attribute selection mould Type may include centre model and frequent contact model, when meeting in preset determined property condition at least one, really The fixed type of prediction is that the attribute of the telephone number of harassing call is non-harassing call, and otherwise, which is harassing call Telephone number attribute be harassing call, wherein preset determined property condition include: with centre model to the prediction class Type is that the telephone number of harassing call carries out Attribute Recognition, which is that the telephone number of harassing call is centre phone Number;Attribute Recognition, the type of prediction are carried out to the telephone number that the type of prediction is harassing call with frequent contact model Telephone number for harassing call is frequent contact telephone number.
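The two-stage decision of step S104 can be sketched as follows. This is an illustrative sketch only: the three callables stand in for the harassing call model, the centre phone model, and the frequent contact model, and their names and signatures are assumptions made for the example.

```python
def identify_attribute(number, predict_type, is_centre_number, is_frequent_contact):
    """Return 'harassing' or 'non-harassing' for a number initiating a new call.

    predict_type, is_centre_number and is_frequent_contact are hypothetical
    callables standing in for the trained and analyzed models.
    """
    # Stage 1: the harassing call model predicts the type.
    if predict_type(number) != "harassing":
        return "non-harassing"
    # Stage 2: the attribute selection models may overturn a harassing prediction.
    if is_centre_number(number) or is_frequent_contact(number):
        return "non-harassing"
    return "harassing"


# Toy stand-ins for the models (assumed data, for demonstration only).
predicted_harassing = {"13800000001", "13800000002"}
centre_numbers = {"13800000002"}

result = {
    n: identify_attribute(
        n,
        predict_type=lambda x: "harassing" if x in predicted_harassing else "non-harassing",
        is_centre_number=lambda x: x in centre_numbers,
        is_frequent_contact=lambda x: False,
    )
    for n in ["13800000001", "13800000002", "13800000003"]
}
print(result)
```

The second number is predicted as harassing by the first stage but is released by the centre phone check, which is the screening effect the embodiment describes.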
It should be noted that steps S101 to S104 may be implemented on a big data platform that uses distributed processing and distributed storage.
It can be seen that, in the embodiment of the present invention, the user call behavioral data and the attribute information of all telephone numbers in the user call behavioral data are obtained; model training is performed on them using a machine learning algorithm to obtain the harassing call model; data analysis is performed on them to obtain the attribute selection model; and, according to the harassing call model and the attribute selection model, attribute recognition is performed on the telephone numbers initiating new call requests to obtain the attribute of each such telephone number. The attribute selection model is obtained by data analysis of the typical features of certain telephone numbers, and each attribute selection model can accurately identify all telephone numbers having a particular attribute feature. Therefore, using the attribute selection model to further judge and screen the recognition results of the harassing call model improves the recognition accuracy for each telephone number initiating a new call request.
Embodiment two
To better illustrate the purpose of the present invention, a further example is given on the basis of the above embodiment.
The embodiment of the present invention provides a method for identifying telephone numbers. As shown in Fig. 2, the method includes:
Step S201: classify and organize the user call behavioral data and the attribute information of all telephone numbers in the user call behavioral data, obtaining the call behavioral features and the attribute feature of each telephone number in the user call behavioral data.
In actual implementation, the user call behavioral data may be classified and organized to obtain the call behavioral features of each telephone number in the user call behavioral data; the attribute information of all telephone numbers in the user call behavioral data is classified and organized to obtain the attribute feature of each telephone number in the user call behavioral data.
Further, classifying and organizing the attribute information of all telephone numbers in the user call behavioral data to obtain the attribute feature of each telephone number includes the following steps:
S2011: according to the attribute information of all telephone numbers in the user call behavioral data, obtain N pending attributes for each telephone number in the user call behavioral data, where N is an integer greater than or equal to 1, and each pending attribute may be: harassing call, courier or food-delivery call, enterprise phone, rejected call, priority call, centre phone, or frequent contact phone.
Illustratively, the attribute information may include user mark information, enterprise authentication information, enterprise yellow page information, phone blacklist information, phone whitelist information, centre number information, and frequent contact information.
In actual implementation, the user mark information may be the comments or marks that users make on any telephone number in any APP (Application); from the user mark information, the pending attributes harassing call, fraud call, ad promotion call, or courier or food-delivery call are obtained for each marked telephone. The enterprise authentication information and enterprise yellow page information may be enterprise information, including enterprise telephone numbers, that each enterprise adds in an enterprise authentication management system or an enterprise yellow page management system; from them, the pending attribute enterprise phone is obtained for each enterprise telephone. The phone blacklist information may be the telephone numbers that a user has set as not allowed to call the user's own number; from the phone blacklist, the pending attribute rejected call is obtained for each blacklisted telephone. The phone whitelist information may be the telephone numbers that a user has set as allowed to call the user's own number; from the phone whitelist, the pending attribute priority call is obtained for each whitelisted telephone. A centre number (intermediate number) is a temporary telephone number set for the two parties to a call and used as their caller identification; the pending attribute centre phone is obtained for each intermediate telephone from an APP that has the centre-number function, or for any intermediate telephone by analyzing the user call behavioral data. Frequent contacts means that when one telephone and another telephone have many identical numbers in their contacted-telephone lists, the two telephones are deemed frequent contact phones; the pending attribute frequent contact phone is obtained for any two such telephones by analyzing the user call behavioral data.
S2012: screen the N pending attributes of each telephone number in the user call behavioral data, obtaining the attribute feature of each telephone number in the user call behavioral data.
Illustratively, the N pending attributes of each telephone number in the user call behavioral data may be screened according to a preset rule. For example, the items of attribute information are sorted by confidence from high to low; the order from high to low may be: user mark information, enterprise authentication information, enterprise yellow page information, phone blacklist information, phone whitelist information, centre number, frequent contact. Then, according to this confidence order, the pending attribute corresponding to the attribute information with the highest confidence is selected from the N pending attributes of each telephone number in the user call behavioral data.
The attribute feature may be divided into harassing call and non-harassing call. The pending attributes classed as harassing call are: harassing call, fraud call, ad promotion call, and rejected call. The pending attributes classed as non-harassing call are: enterprise phone, priority call, centre phone, and frequent contact phone. The attribute feature of each telephone number in the user call behavioral data can then be obtained from this division and from the pending attribute corresponding to the highest-confidence attribute information of that telephone number.
Further, when each pending attribute is divided into harassing call or non-harassing call, the number of occurrences of each pending attribute may also be counted. Only when the number of occurrences of a pending attribute classed as harassing call is not less than a preset count threshold is that pending attribute determined to be harassing call; otherwise, the pending attribute is reclassified as non-harassing call.
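The screening in S2012 can be sketched as a priority lookup followed by an occurrence check. The sketch below is one possible reading, not the definitive procedure: the source identifiers, the data layout, counting occurrences per telephone number, and treating every pending attribute outside the harassing group as non-harassing are all assumptions made for illustration.

```python
from collections import Counter

# Confidence order of attribute information, highest first (from the text above).
CONFIDENCE_ORDER = [
    "user_mark", "enterprise_auth", "enterprise_yellow_pages",
    "blacklist", "whitelist", "centre_number", "frequent_contact",
]
# Pending attributes that are classed as harassing call.
HARASSING = {"harassing call", "fraud call", "ad promotion call", "rejected call"}


def attribute_feature(pending, min_count):
    """Map the pending attributes of one number to its attribute feature.

    pending is a list of (information_source, pending_attribute) pairs;
    min_count is the preset count threshold (both hypothetical names).
    """
    counts = Counter(attr for _, attr in pending)
    # Pick the pending attribute backed by the highest-confidence source.
    _, best = min(pending, key=lambda p: CONFIDENCE_ORDER.index(p[0]))
    # A harassing attribute only stands if it occurs often enough.
    if best in HARASSING and counts[best] >= min_count:
        return "harassing"
    return "non-harassing"


pending = [
    ("blacklist", "rejected call"),
    ("user_mark", "harassing call"),
    ("user_mark", "harassing call"),
]
print(attribute_feature(pending, min_count=2))
print(attribute_feature(pending, min_count=3))
```

With a threshold of 2 the user marks are enough to keep the harassing label; with a threshold of 3 the number is reclassified as non-harassing.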
Step S202: perform data cleaning on the call behavioral features of all telephone numbers in the user call behavioral data, obtaining the cleaned call behavioral features of all telephone numbers in the user call behavioral data.
In actual implementation, arbitrarily classifying the limited user call behavioral data can yield many call behavioral features, but not all of them are useful for training a machine learning model. Data cleaning therefore needs to be performed on the call behavioral features obtained. Data cleaning means finding and correcting identifiable errors in a data file, including checking data consistency, handling invalid values, and handling missing values.
Illustratively, data cleaning of the call behavioral features of all telephone numbers in the user call behavioral data may include the following steps:
S2021: perform invalid column deletion on the call behavioral features of all telephone numbers in the user call behavioral data. Invalid column deletion mainly targets two kinds of data. The first is any type of data whose share of the whole historical data is very small; for example, in data with tens of thousands of rows, columns that contain fewer than 1000 rows of data are deleted. The second is data in the historical data that is unrelated to the call behavioral features.
Illustratively, invalid column deletion on the call behavioral features of all telephone numbers in the user call behavioral data may include at least one of the following:
When the number of features contained in the call behavioral features of any telephone number in the user call behavioral data is less than a preset feature count threshold, that telephone number and its call behavioral features are deleted, where the preset feature count threshold may be set according to the total number of distinct features in the call behavioral features of all telephone numbers;
When the number of telephone numbers whose call behavioral features contain a given feature is less than a preset number count threshold, that feature is deleted from the call behavioral features of all telephone numbers;
The "Yes" and "No" values in the call behavioral features of all telephone numbers in the user call behavioral data are deleted, where "Yes" and "No" are inferred from the user call behavioral data.
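The first two deletion rules above can be sketched on a dictionary of per-number features. This is a simplified illustration; the data layout, the function name, the feature names, and the order in which the two rules are applied are assumptions, and the third rule is not covered.

```python
def drop_invalid(features, min_features, min_numbers):
    """Apply two invalid-column rules to {number: {feature_name: value}}.

    min_features: preset feature count threshold (rule 1).
    min_numbers: preset number count threshold (rule 2).
    """
    # Rule 1: delete numbers that carry too few features.
    kept = {n: dict(f) for n, f in features.items() if len(f) >= min_features}
    # Rule 2: delete features that appear for too few numbers.
    counts = {}
    for f in kept.values():
        for name in f:
            counts[name] = counts.get(name, 0) + 1
    rare = {name for name, c in counts.items() if c < min_numbers}
    for f in kept.values():
        for name in rare:
            f.pop(name, None)
    return kept


features = {
    "A": {"calls": 5, "duration": 30, "answered": 4},
    "B": {"calls": 2, "duration": 10},
    "C": {"calls": 1},
}
cleaned = drop_invalid(features, min_features=2, min_numbers=2)
print(cleaned)
```

Number C is dropped for having a single feature, and the feature `answered` is dropped because only one remaining number carries it.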
S2022: perform null value handling on the call behavioral features of all telephone numbers in the user call behavioral data after invalid column deletion. Because the call behavioral features of a telephone number do not necessarily include every feature, some features of a given telephone number have no value; those features may therefore be assigned the value 0, indicating that no corresponding call behavior exists.
S2023: normalize the call behavioral features of all telephone numbers after null value handling. When the value range of a feature is too large, it strongly influences the classification result. Therefore, the mean of each feature may be determined from the call behavioral features of all telephone numbers after null value handling; when the mean of any feature is greater than a preset feature threshold, the values of that feature are normalized so that all of them fall within a suitable numerical range, for example by using L2-norm normalization.
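Steps S2022 and S2023 can be sketched together as zero-filling followed by conditional L2 normalization. The sketch assumes that normalization is applied per feature across all telephone numbers, which is one reading of the text; the function names, the data layout, and the threshold value are likewise assumptions.

```python
import math


def fill_missing(features, names):
    """S2022: give every number every feature, using 0 where no value exists."""
    return {n: {k: f.get(k, 0) for k in names} for n, f in features.items()}


def l2_normalize_large(features, names, mean_threshold):
    """S2023: L2-normalize only the features whose mean exceeds the threshold."""
    out = {n: dict(f) for n, f in features.items()}
    for k in names:
        values = [f[k] for f in features.values()]
        if sum(values) / len(values) > mean_threshold:
            norm = math.sqrt(sum(v * v for v in values))
            if norm > 0:
                for n in out:
                    out[n][k] = out[n][k] / norm
    return out


names = ["calls", "duration"]
raw = {"A": {"duration": 300}, "B": {"calls": 4, "duration": 400}}
filled = fill_missing(raw, names)
normalized = l2_normalize_large(filled, names, mean_threshold=100)
print(normalized)
```

In this toy data the mean of `duration` is 350, above the threshold, so its values are scaled by the L2 norm 500; `calls` has a mean of 2 and is left unchanged.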
Step S203: perform feature extraction on the cleaned call behavioral features of all telephone numbers in the user call behavioral data, obtaining the selected call behavioral features of all telephone numbers in the user call behavioral data.
In actual implementation, in order to learn the structure and essence of the harassing call identification problem from the cleaned call behavioral features, feature extraction is performed on them to pick out the features that better explain the harassing call model. Feature extraction generally relies on two criteria. The first is whether a feature diverges: if a feature does not diverge, that is, the variance of its values over all telephone numbers is approximately 0, then all telephone numbers are essentially the same on that feature, and the feature is of no use for distinguishing telephone numbers. The second is the correlation between a feature and the target: features that are highly correlated with harassing calls should be selected first.
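The divergence criterion above can be sketched as a simple variance filter. This is a minimal illustration; the function name, the data layout, and the tolerance `eps` used for "approximately 0" are assumptions.

```python
from statistics import pvariance


def divergent_features(features, names, eps=1e-9):
    """Keep only the features whose variance over all numbers is not close to 0."""
    return [
        k for k in names
        if pvariance([f[k] for f in features.values()]) > eps
    ]


features = {
    "A": {"country_code": 86, "calls_per_day": 5},
    "B": {"country_code": 86, "calls_per_day": 9},
}
print(divergent_features(features, ["country_code", "calls_per_day"]))
```

The hypothetical feature `country_code` is identical for every number, so its variance is 0 and it is discarded, while `calls_per_day` is kept.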
Feature extraction methods include feature selection and dimensionality reduction, both of which aim to reduce the number of features in the feature data set as far as possible. Feature selection chooses a subset from the original feature data set and screens some of the features without changing the original feature space. It falls into three main classes: (1) the Filter method, which scores each feature by divergence or correlation and selects features by setting a score threshold or a threshold on the number of features to be selected; the main methods are the chi-squared test, information gain, and correlation coefficient scores; (2) the Wrapper method, which generates different combinations of features, scores the predictive performance of each combination according to an objective function, compares it with the other combinations, and selects or excludes several features at a time; the main method is the recursive feature elimination algorithm; (3) the Embedded method, which first trains a preset machine learning algorithm and model, obtains the weight coefficient of each feature, and selects features in descending order of weight coefficient; the main method is regularization.
Dimensionality reduction combines different features according to the relationships between them to obtain new features, which changes the original feature space, and then selects some of the new features. The main methods are Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Singular Value Decomposition (SVD), and Sammon's mapping.
Illustratively, the following methods may be applied in turn to perform feature extraction on the cleaned call behavioral features of all telephone numbers: (1) the chi-squared test, which measures the degree of deviation between the actual observed values of the statistical sample and the theoretically inferred values; this degree of deviation determines the size of the chi-squared value; the larger the chi-squared value, the less the corresponding sample helps data classification, and the sample is deleted; the smaller the chi-squared value, the more the corresponding sample helps data classification, and the sample is retained; (2) recursive feature elimination, which trains a base model on the feature set, obtains the importance of each feature (for example, its weight coefficient), eliminates the least important features, and then runs the next round of training on the new feature set until the required number of features is reached; (3) feature selection based on a tree model, which uses GBDT (Gradient Boosting Decision Tree) as the base model to train on the feature set and then selects features according to the training result; (4) linear discriminant analysis, which seeks a linear transformation that maximizes the ratio of the covariance matrix between data of different classes in the sample data to the covariance matrix within data of the same class.
Step S204: perform model training, using a machine learning algorithm, on the selected call behavioral features and the attribute features of all telephone numbers in the user call behavioral data, obtaining the trained harassing call model.
In actual implementation, the main task may be chosen as identifying whether a telephone number is a harassing call; correspondingly, a harassing call model is obtained using a machine learning algorithm. The specific process is as follows: using a 2:8 split ratio, the selected call behavioral features of all telephone numbers in the user call behavioral data are divided into training data and test data; all features in the selected call behavioral features are used as the classification parameters of the harassing call model; the harassing call model is trained with the training data, and the accuracy of the trained harassing call model is then verified with the test data. The harassing call model may use a random forest classifier.
Further, when the trained harassing call model does not meet a preset accuracy threshold, model adjustment is performed on the trained harassing call model to obtain an optimal harassing call model. For example, k-fold cross-validation is used to make full use of the selected call behavioral features of all telephone numbers in the user call behavioral data to test the trained harassing call model, obtaining the optimal harassing call model.
Specifically, the k-fold cross-validation method includes the following steps. The classification parameters, or the weights of the classification parameters, in the trained harassing call model are changed, obtaining m candidate harassing call models whose classification parameters or classification parameter weights differ from one another, where m is an integer greater than or equal to 1. The selected call behavioral features of all telephone numbers in the user call behavioral data are taken as data set S, and S is divided into k disjoint subsets, where k is an integer. The following process is executed for each candidate harassing call model: each time, one of the k subsets is taken, without repetition, as the test set, and the other k-1 subsets are used as the training set to train the model; the recognition accuracy of the candidate harassing call model on the test set is then calculated; the k recognition accuracies are averaged, and the average is taken as the true recognition accuracy of that candidate harassing call model. The candidate with the highest true recognition accuracy among the m candidate harassing call models is selected as the optimal harassing call model.
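The k-fold procedure above can be sketched without any particular machine learning library. In the sketch below, a simple threshold classifier stands in for the candidate harassing call models (the text suggests a random forest), and the candidate weights, sample data, and function names are assumptions made only to keep the example self-contained.

```python
def k_fold_accuracy(samples, labels, k, train, predict):
    """Average recognition accuracy of one candidate model over k disjoint folds."""
    n = len(samples)
    folds = [list(range(i, n, k)) for i in range(k)]  # k disjoint index subsets
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        model = train([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


def make_candidate(weight):
    """Hypothetical candidate: threshold = weight * midpoint of the class means."""
    def train(xs, ys):
        zeros = [x for x, y in zip(xs, ys) if y == 0]
        ones = [x for x, y in zip(xs, ys) if y == 1]
        midpoint = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return weight * midpoint

    def predict(threshold, x):
        return int(x > threshold)

    return train, predict


# Toy data: daily outgoing calls per number; label 1 means harassing call.
samples = [5, 60, 8, 70, 20, 90]
labels = [0, 1, 0, 1, 0, 1]
candidates = {"w=1.0": make_candidate(1.0), "w=0.2": make_candidate(0.2)}
scores = {
    name: k_fold_accuracy(samples, labels, 3, train, predict)
    for name, (train, predict) in candidates.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

Each candidate is trained k times on k-1 folds and scored on the remaining fold; the candidate with the highest mean accuracy is kept, which mirrors the selection of the optimal harassing call model among the m candidates.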
Step S205: perform data analysis on the user call behavioral data and the attribute information of all telephone numbers in the user call behavioral data, obtaining the attribute selection model, where the attribute selection model is used to indicate the criterion for determining the attribute of a telephone number.
The implementation of this step is the same as that of step S103 and is not repeated here.
It should be noted that the embodiment of the present invention does not limit the execution order of steps S202 to S204 relative to step S205; for example, steps S202 to S204 may be executed before step S205, after step S205, or at the same time as step S205.
Step S206: according to the trained harassing call model and the attribute selection model, perform attribute recognition on the telephone numbers that initiate new call requests, obtaining the attribute of each telephone number that initiates a new call request.
In actual implementation, when a preset prediction condition is met, attribute recognition is performed on the telephone numbers initiating new call requests according to the trained harassing call model and the attribute selection model. The preset prediction condition includes at least one of the following: the number of telephone numbers initiating new call requests is greater than or equal to a preset quantity threshold; the time interval between the current time and the time at which the attributes of the telephone numbers were last updated is greater than or equal to a preset time threshold.
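The preset prediction condition can be sketched as a simple batch trigger. The function name, the use of seconds as the time unit, and the concrete threshold values are assumptions for illustration.

```python
def should_predict(pending_count, now, last_update, count_threshold, interval_threshold):
    """Return True when at least one preset prediction condition is met.

    pending_count: telephone numbers that have initiated new call requests.
    now, last_update: timestamps in seconds (assumed unit).
    """
    enough_numbers = pending_count >= count_threshold
    long_enough = (now - last_update) >= interval_threshold
    return enough_numbers or long_enough


# 120 numbers waiting, 10 minutes since the last attribute update.
print(should_predict(120, now=6000, last_update=5400,
                     count_threshold=100, interval_threshold=3600))
```

Either condition alone is sufficient, so a large enough batch triggers recognition even if the last update was recent.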
It should be noted that steps S201 to S206 may be implemented on a big data platform that uses distributed processing and distributed storage.
It can be seen that, in the embodiment of the present invention, the user call behavioral data and the attribute information of all telephone numbers in the user call behavioral data are classified and organized to obtain the call behavioral features and the attribute feature of each telephone number; data cleaning and feature extraction are performed in turn on the call behavioral features of all telephone numbers to obtain the selected call behavioral features; model training is performed on the selected call behavioral features and the attribute features of all telephone numbers to obtain the trained harassing call model; data analysis is performed on the user call behavioral data and the attribute information of all telephone numbers to obtain the attribute selection model; and, according to the trained harassing call model and the attribute selection model, attribute recognition is performed on the telephone numbers initiating new call requests to obtain the attribute of each such telephone number. In this process, multiple kinds of data are classified and organized, cleaned, and subjected to feature extraction, and the resulting selected call behavioral features are used for model training, yielding a harassing call model with higher recognition accuracy. The attribute selection model is then obtained by data analysis of the typical features of certain telephone numbers. Identifying harassing calls with both the harassing call model and the attribute selection model improves the recognition accuracy for telephone numbers.
Embodiment three
To better illustrate the purpose of the present invention, a further example is given on the basis of the foregoing method embodiments.
The embodiment of the present invention provides a device for identifying telephone numbers. As shown in Fig. 3, the device 300 for identifying telephone numbers includes a memory 301 and a processor 302, wherein
the memory 301 is used to store a computer program;
the processor 302 is configured to execute the following steps when running the computer program stored in the memory 301:
obtain the user call behavioral data and the attribute information of all telephone numbers in the user call behavioral data;
perform model training, using a machine learning algorithm, on the user call behavioral data and the attribute information of all telephone numbers in the user call behavioral data, obtaining the harassing call model;
perform data analysis on the user call behavioral data and the attribute information of all telephone numbers in the user call behavioral data, obtaining the attribute selection model, where the attribute selection model is used to indicate the criterion for determining the attribute of a telephone number;
according to the harassing call model and the attribute selection model, perform attribute recognition on the telephone numbers that initiate new call requests, obtaining the recognized attribute of each telephone number that initiates a new call request.
In the above scheme, the processor 302 is specifically configured to execute the following steps when running the computer program stored in the memory 301:
According to the harassing call model, perform type prediction on the telephone numbers that initiate new call requests, obtaining the predicted type of each such telephone number, where the predicted type is harassing call or non-harassing call. Correspondingly, for a telephone number whose predicted type is non-harassing call, its attribute is determined to be non-harassing call; for a telephone number whose predicted type is harassing call, its attribute is determined according to the attribute selection model.
According to the attribute selection model, perform attribute recognition on those telephone numbers, among the numbers initiating new call requests, whose predicted type is harassing call, obtaining the attribute of each such telephone number. Illustratively, the attribute selection model may include a centre phone model and a frequent contact model. When at least one of the preset attribute determination conditions is met, the attribute of the telephone number whose predicted type is harassing call is determined to be non-harassing call; otherwise, its attribute is harassing call. The preset attribute determination conditions include: attribute recognition of the telephone number with the centre phone model shows that it is a centre phone number; attribute recognition of the telephone number with the frequent contact model shows that it is a frequent contact telephone number.
In the above scheme, the processor 302 is specifically configured to execute the following steps when running the computer program stored in the memory 301: select classification parameters from the user call behavioral data and the attribute information of all telephone numbers in the user call behavioral data, and establish the attribute selection model according to the classification parameters, where the attribute selection model may include at least one of the following: a centre phone model, a frequent contact phone model, and so on; each attribute selection model can uniquely identify all telephone numbers having a particular attribute feature.
In the above scheme, the processor 302 is specifically configured to execute the following steps when running the computer program stored in the memory 301: classify and organize the user call behavioral data, obtaining the call behavioral features of each telephone number in the user call behavioral data, where the call behavioral features may include at least one of the following: historical called-number list, number of historical outgoing calls, duration of historical outgoing calls, number of historical incoming calls, duration of historical incoming calls, number of answered calls, number of unanswered calls;
classify and organize the attribute information of all telephone numbers in the user call behavioral data, obtaining the attribute feature of each telephone number in the user call behavioral data, where the attribute feature may be: harassing call, non-harassing call, courier or food-delivery call, enterprise phone, rejected call, priority call, centre phone, or frequent contact phone;
perform model training, using a machine learning algorithm, on the call behavioral features and the attribute feature of each telephone number in the user call behavioral data, obtaining the harassing call model.
In the above scheme, the attribute information of all telephone numbers in the user call behavioral data includes at least one of the following: user mark information, enterprise authentication information, enterprise yellow page information, phone blacklist information, phone whitelist information;
the attribute selection model includes at least one of the following: a centre phone model, a frequent contact phone model.
In the above scheme, the processor 302 is specifically configured to execute the following steps when running the computer program stored in the memory 301:
when a preset prediction condition is met, perform attribute recognition on the telephone numbers that initiate new call requests according to the harassing call model and the attribute selection model, where the preset prediction condition includes at least one of the following:
the number of telephone numbers initiating new call requests is greater than or equal to a preset quantity threshold;
the time interval between the current time and the time at which the predicted-type attributes of the telephone numbers were last updated is greater than or equal to a preset time threshold.
Illustratively, the device for identifying telephone numbers may be a big data platform using a distributed architecture. The device for identifying telephone numbers, the mobile core network, and the business platform may together form an intelligent anti-harassment system, whose structure is shown schematically in Figure 4. The mobile core network may include a mobile switching center MSC (Mobile Switching Center), a business monitoring platform SCP (Business Monitoring Platform), and a home location register HLR (Home Location Register). The mobile switching center is used to receive the call request of a telephone number and to send notification signalling to the business monitoring platform; the business monitoring platform is used to send, upon receiving the notification signalling, a harassing call identification request carrying the telephone number to the device for identifying telephone numbers;
The device for identifying telephone numbers is used to identify the telephone number upon receiving the identification request carrying the telephone number, and to return the recognition result to the business monitoring platform.
Further, the business monitoring platform is specifically used to proceed as follows. When the recognition result is that the telephone number is a harassing call and the called terminal has enabled the interception service, it notifies the mobile switching center to block the call from the telephone number and sends the interception status to the business platform, so that the business platform sends a short message notifying the called terminal of the interception result. When the recognition result is that the telephone number is a harassing call and the called terminal has enabled the reminder service, it notifies the mobile switching center to let the call through, and the called terminal may be notified by flash SMS that the telephone number is a harassing call. When the recognition result is that the telephone number is a non-harassing call, it notifies the mobile switching center to let the call through.
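The handling rules of the business monitoring platform can be sketched as a small decision function. This is only an illustration: the names `Action` and `decide` are hypothetical, and two cases the text does not cover are filled in by assumption (interception takes precedence when both services are enabled, and a harassing call is let through when neither service is enabled).

```python
from enum import Enum


class Action(Enum):
    BLOCK_AND_SMS = "block the call and report the interception by short message"
    PASS_WITH_FLASH_SMS = "let the call through and warn by flash SMS"
    PASS = "let the call through"


def decide(is_harassing: bool, interception_enabled: bool,
           reminder_enabled: bool) -> Action:
    """Map a recognition result and the called terminal's services to an action."""
    if is_harassing and interception_enabled:
        # Assumption: interception wins if both services are enabled.
        return Action.BLOCK_AND_SMS
    if is_harassing and reminder_enabled:
        return Action.PASS_WITH_FLASH_SMS
    # Non-harassing calls are let through; so are harassing calls when the
    # called terminal has neither service enabled (assumption, not in the text).
    return Action.PASS
```

For example, `decide(True, False, True)` yields `Action.PASS_WITH_FLASH_SMS`, matching the reminder-service case described above.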
Example IV
Based on the same technical concept as the foregoing embodiments, this embodiment of the present invention provides a computer-readable storage medium that can be applied in a device. The technical solutions of the foregoing embodiments, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a computer-readable storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method of this embodiment. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
Specifically, the computer program instructions corresponding to the method for identifying a telephone number in this embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive. When the computer program instructions in the storage medium that correspond to the method for identifying a telephone number are read or executed by an electronic device, at least one processor is caused to execute the steps of the method for identifying a telephone number described in any of the foregoing embodiments of the present invention.
It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the advantages or disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the specific embodiments described above, which are merely illustrative rather than restrictive. Inspired by the present invention, those of ordinary skill in the art may devise many other forms without departing from the purpose of the present invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims (17)

1. A method for identifying a telephone number, characterized in that the method comprises:
obtaining user call behavior data and attribute information of all telephone numbers in the user call behavior data;
performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain a harassing call model;
performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain an attribute selection model, wherein the attribute selection model is used to indicate the criteria for determining the attribute of a telephone number;
performing attribute recognition on the telephone numbers initiating new call requests according to the harassing call model and the attribute selection model, to obtain the attribute of each telephone number initiating a new call request.
2. The method according to claim 1, characterized in that performing attribute recognition on the telephone numbers initiating new call requests according to the harassing call model and the attribute selection model, to obtain the attribute of each telephone number initiating a new call request, comprises:
performing type prediction on the telephone numbers initiating new call requests according to the harassing call model, to obtain the predicted type of each telephone number initiating a new call request, wherein the predicted type is harassing call or non-harassing call;
performing attribute recognition, according to the attribute selection model, on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain the attribute of each telephone number whose predicted type is harassing call.
3. The method according to claim 1, characterized in that performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain an attribute selection model, comprises:
selecting sorting parameters from the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, and establishing the attribute selection model according to the sorting parameters.
4. The method according to claim 1, characterized in that performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain a harassing call model, comprises:
classifying and organizing the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data, wherein the call behavior features include at least one of the following: historical calling count, historical calling call duration, historical called count, historical called call duration, answered count, unanswered count;
classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data, wherein the attribute feature is: harassing call, express or food delivery phone, enterprise phone, rejected phone, priority-answer phone, centre phone, or frequent contact phone;
performing model training, using a machine learning algorithm, on the call behavior features and attribute feature of each telephone number in the user call behavior data, to obtain the harassing call model.
5. The method according to claim 4, characterized in that classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data comprises:
obtaining N pending attributes of each telephone number in the user call behavior data according to the attribute information of all telephone numbers in the user call behavior data, wherein N is an integer greater than or equal to 1;
screening the N pending attributes of each telephone number in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data.
6. The method according to claim 1, characterized in that the attribute information of all telephone numbers in the user call behavior data includes at least one of the following:
user marking information, enterprise authentication information, enterprise yellow pages information, phone blacklist information, phone whitelist information.
7. The method according to claim 1, characterized in that the attribute selection model includes at least one of the following: a centre phone model, a frequent contact phone model.
8. The method according to claim 1, characterized in that performing attribute recognition on the telephone numbers initiating new call requests according to the harassing call model and the attribute selection model comprises:
when a preset prediction condition is met, performing attribute recognition on the telephone numbers initiating new call requests according to the harassing call model and the attribute selection model, wherein the preset prediction condition includes at least one of the following:
the quantity of telephone numbers initiating new call requests is greater than or equal to a preset quantity threshold;
the time interval between the current time and the time at which the attribute of the telephone numbers was last updated is greater than or equal to a preset time threshold.
9. A device for identifying a telephone number, characterized in that the device comprises a memory and a processor, wherein:
the memory is used to store a computer program;
the processor is used to execute the following steps when running the computer program:
obtaining user call behavior data and attribute information of all telephone numbers in the user call behavior data;
performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain a harassing call model;
performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain an attribute selection model, wherein the attribute selection model is used to indicate the criteria for determining the attribute of a telephone number;
performing attribute recognition on the telephone numbers initiating new call requests according to the harassing call model and the attribute selection model, to obtain the attribute of each telephone number initiating a new call request.
10. The device according to claim 9, characterized in that the processor is specifically used to execute the following steps when running the computer program:
performing type prediction on the telephone numbers initiating new call requests according to the harassing call model, to obtain the predicted type of each telephone number initiating a new call request, wherein the predicted type is harassing call or non-harassing call;
performing attribute recognition, according to the attribute selection model, on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain the attribute of each telephone number whose predicted type is harassing call.
11. The device according to claim 9, characterized in that the processor is specifically used to execute the following step when running the computer program:
selecting sorting parameters from the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, and establishing the attribute selection model according to the sorting parameters.
12. The device according to claim 9, characterized in that the processor is specifically used to execute the following steps when running the computer program:
classifying and organizing the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data, wherein the call behavior features include at least one of the following: historical calling count, historical calling call duration, historical called count, historical called call duration, answered count, unanswered count;
classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data, wherein the attribute feature is: harassing call, express or food delivery phone, enterprise phone, rejected phone, priority-answer phone, centre phone, or frequent contact phone;
performing model training, using a machine learning algorithm, on the call behavior features and attribute feature of each telephone number in the user call behavior data, to obtain the harassing call model.
13. The device according to claim 12, characterized in that the processor is specifically used to execute the following steps when running the computer program:
obtaining N pending attributes of each telephone number in the user call behavior data according to the attribute information of all telephone numbers in the user call behavior data, wherein N is an integer greater than or equal to 1;
screening the N pending attributes of each telephone number in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data.
14. The device according to claim 9, characterized in that the attribute information of all telephone numbers in the user call behavior data includes at least one of the following:
user marking information, enterprise authentication information, enterprise yellow pages information, phone blacklist information, phone whitelist information.
15. The device according to claim 9, characterized in that the attribute selection model includes at least one of the following: a centre phone model, a frequent contact phone model.
16. The device according to claim 9, characterized in that the processor is specifically used to execute the following steps when running the computer program:
when a preset prediction condition is met, performing attribute recognition on the telephone numbers initiating new call requests according to the harassing call model and the attribute selection model, wherein the preset prediction condition includes at least one of the following:
the quantity of telephone numbers initiating new call requests is greater than or equal to a preset quantity threshold;
the time interval between the current time and the time at which the attribute of the telephone numbers was last updated is greater than or equal to a preset time threshold.
17. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program,
and when the computer program is executed by at least one processor, the at least one processor is caused to perform the steps of the method according to any one of claims 1 to 8.
CN201810372550.4A 2018-04-24 2018-04-24 Method and device for identifying telephone number and computer readable storage medium Active CN110401779B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810372550.4A CN110401779B (en) 2018-04-24 2018-04-24 Method and device for identifying telephone number and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810372550.4A CN110401779B (en) 2018-04-24 2018-04-24 Method and device for identifying telephone number and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110401779A true CN110401779A (en) 2019-11-01
CN110401779B CN110401779B (en) 2022-02-01

Family

ID=68320143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810372550.4A Active CN110401779B (en) 2018-04-24 2018-04-24 Method and device for identifying telephone number and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110401779B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104023109A (en) * 2014-06-27 2014-09-03 深圳市中兴移动通信有限公司 Incoming call prompt method and device as well as incoming call classifying method and device
CN106686261A (en) * 2017-01-19 2017-05-17 腾讯科技(深圳)有限公司 Information processing method and system
CN107273531A (en) * 2017-06-28 2017-10-20 百度在线网络技术(北京)有限公司 Telephone number classifying identification method, device, equipment and storage medium
CN107306306A (en) * 2016-04-25 2017-10-31 腾讯科技(深圳)有限公司 Communicating number processing method and processing device
CN107331385A (en) * 2017-07-07 2017-11-07 重庆邮电大学 A kind of identification of harassing call and hold-up interception method
CN107517463A (en) * 2016-06-15 2017-12-26 中国移动通信集团浙江有限公司 A kind of recognition methods of telephone number and device
CN107835496A (en) * 2017-11-24 2018-03-23 北京奇虎科技有限公司 A kind of recognition methods of refuse messages, device and server

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113037699B (en) * 2019-12-25 2022-11-29 中国电信股份有限公司 Communication interception method, device and computer readable storage medium
CN113037699A (en) * 2019-12-25 2021-06-25 中国电信股份有限公司 Communication interception method, apparatus and computer readable storage medium
CN111144336A (en) * 2019-12-30 2020-05-12 贵州近邻宝科技有限公司 Automatic identification method for mobile phone number and invoice number of addressee facing to express bill
CN113315874A (en) * 2020-02-26 2021-08-27 卡巴斯基实验室股份制公司 System and method for call classification
CN113315874B (en) * 2020-02-26 2024-03-19 卡巴斯基实验室股份制公司 System and method for call classification
CN113452845B (en) * 2020-03-26 2024-03-19 中国移动通信集团福建有限公司 Method for identifying abnormal telephone number and electronic equipment
CN113452845A (en) * 2020-03-26 2021-09-28 中国移动通信集团福建有限公司 Method and electronic equipment for identifying abnormal telephone number
CN111432078B (en) * 2020-03-27 2021-09-10 中国—东盟信息港股份有限公司 System for judging code number abnormity
CN111432078A (en) * 2020-03-27 2020-07-17 中国—东盟信息港股份有限公司 System for judging code number abnormity
CN111465021A (en) * 2020-04-01 2020-07-28 北京中亦安图科技股份有限公司 Graph-based crank call identification model construction method
CN111465021B (en) * 2020-04-01 2023-06-09 北京中亦安图科技股份有限公司 Graph-based crank call identification model construction method
CN113935758A (en) * 2020-07-14 2022-01-14 中国移动通信集团广东有限公司 Training method and device of random forest model for predicting handling probability of broadband service
CN114189585A (en) * 2020-09-14 2022-03-15 中国移动通信集团重庆有限公司 Crank call abnormity detection method and device and computing equipment
CN112261654B (en) * 2020-09-23 2021-08-03 中国地质大学(武汉) Method and system for generating mobile phone number white list in telecommunication anti-fraud process
CN112261654A (en) * 2020-09-23 2021-01-22 中国地质大学(武汉) Method and system for generating mobile phone number white list in telecommunication anti-fraud process
CN112417311A (en) * 2020-10-29 2021-02-26 上海淇玥信息技术有限公司 Method and device for executing service based on influence factor and electronic equipment
CN113301210B (en) * 2021-04-16 2023-05-23 珠海高凌信息科技股份有限公司 Method and device for preventing harassment call based on neural network and electronic equipment
CN113301210A (en) * 2021-04-16 2021-08-24 珠海高凌信息科技股份有限公司 Method and device for preventing harassing call based on neural network and electronic equipment
CN113286035B (en) * 2021-05-14 2022-12-30 国家计算机网络与信息安全管理中心 Abnormal call detection method, device, equipment and medium
CN113286035A (en) * 2021-05-14 2021-08-20 国家计算机网络与信息安全管理中心 Abnormal call detection method, device, equipment and medium
CN113905134A (en) * 2021-10-21 2022-01-07 中国联合网络通信集团有限公司 Address list blacklist management method, system, equipment and medium based on block chain
CN113992798A (en) * 2021-10-26 2022-01-28 中国联合网络通信集团有限公司 Telephone identification method, device, equipment and readable storage medium
CN114125155A (en) * 2021-11-15 2022-03-01 天津市国瑞数码安全系统股份有限公司 Crank call detection method and system based on big data analysis

Also Published As

Publication number Publication date
CN110401779B (en) 2022-02-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant