CN110401779A - Method, apparatus and computer-readable storage medium for identifying a telephone number
- Publication number: CN110401779A
- Application number: CN201810372550.4A
- Authority
- CN
- China
- Prior art keywords
- user
- attribute
- telephone number
- behavioral data
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/66—Substation equipment, e.g. for use by subscribers with means for preventing unauthorised or fraudulent calling
- H04M1/663—Preventing unauthorised calls to a telephone set
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/22—Arrangements for supervision, monitoring or testing
- H04M3/2281—Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W12/00—Security arrangements; Authentication; Protecting privacy or anonymity
- H04W12/12—Detection or prevention of fraud
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Signal Processing (AREA)
- Technology Law (AREA)
- Computer Networks & Wireless Communication (AREA)
- Telephonic Communication Services (AREA)
Abstract
An embodiment of the invention discloses a method for identifying a telephone number. The method comprises: obtaining user call behavior data and attribute information of all telephone numbers in the user call behavior data; performing model training on the user call behavior data and the attribute information using a machine learning algorithm to obtain a harassing call model; performing data analysis on the user call behavior data and the attribute information to obtain an attribute selection model; and, according to the harassing call model and the attribute selection model, performing attribute recognition on each telephone number that initiates a new call request to obtain its attribute. The method applies machine learning model training and data analysis separately to several kinds of data to obtain a more reliable harassing call model and attribute selection model, and then uses both models to recognize the attribute of each telephone number, which improves the recognition accuracy for each telephone number.
Description
Technical field
The present invention relates to the field of mobile communication technology, and in particular to a method, an apparatus and a computer-readable storage medium for identifying a telephone number.
Background
Many websites and offline purchases now require users to provide a mobile phone number, which greatly increases the likelihood that users' mobile numbers are leaked to criminals; almost every user has received harassing calls for advertising or fraud. To help users identify harassing calls in advance, some existing cloud computing platforms apply big data processing and machine learning model training to the call behavior data (for example, call records) of all telephone numbers, obtain a machine learning model that uses harassing call features as classification parameters, and identify any telephone number with that model. In addition, most mobile phones now provide a marking function: once any user marks a calling number as a harassing call on a phone, the harassing call label is displayed on the called party's phone whenever that number again initiates a call request to any mobile number, thereby alerting the called user.
The prior art either extracts harassing call features from communication behavior data and builds a machine learning model from those features to identify harassing calls, or identifies harassing calls from the marks that mobile phone users attach to telephone numbers. Both approaches rely on a single source of data, so it is difficult to identify harassing calls accurately. For example, when harassing calls are identified only by a machine learning model built from harassing call features, the accuracy depends entirely on that model, and high-frequency non-harassing numbers such as food-delivery, courier and taxi numbers are often confused with high-frequency harassing numbers. When harassing calls are identified only from user marks, some users mark numbers maliciously, so the recognition accuracy also needs to be improved.
Summary of the invention
The main purpose of the present invention is to provide a method, an apparatus and a computer-readable storage medium for identifying a telephone number, intended to solve the problem that, in existing telephone number identification methods, the basis for identification is not reliable enough, which reduces the accuracy of telephone number identification.
The technical solution of the present invention is realized as follows:
An embodiment of the present invention provides a method for identifying a telephone number, the method comprising:
obtaining user call behavior data and attribute information of all telephone numbers in the user call behavior data;
performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain a harassing call model;
performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain an attribute selection model, where the attribute selection model indicates the criteria for determining the attribute of a telephone number;
performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, to obtain the attribute of each telephone number that initiates a new call request.
In the above scheme, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, to obtain the attribute of each such telephone number, comprises:
performing type prediction, according to the harassing call model, on the telephone numbers that initiate new call requests, to obtain the predicted type of each such telephone number, where the predicted type is harassing call or non-harassing call;
performing attribute recognition, according to the attribute selection model, on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain the attribute of each telephone number whose predicted type is harassing call.
In the above scheme, performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain the attribute selection model, comprises:
selecting classification parameters from the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, and establishing the attribute selection model according to the classification parameters.
In the above scheme, performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain the harassing call model, comprises:
classifying and organizing the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data, where the call behavior features include at least one of the following: historical outgoing call count, historical outgoing call duration, historical incoming call count, historical incoming call duration, answered count, unanswered count;
classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data, where the attribute feature is one of: harassing call, courier or food-delivery phone, enterprise phone, rejected phone, priority-answer phone, intermediate-number phone, or frequent-contact phone;
performing model training, using a machine learning algorithm, on the call behavior features and attribute feature of each telephone number in the user call behavior data, to obtain the harassing call model.
In the above scheme, classifying and organizing the attribute information of all telephone numbers in the user call behavior data, to obtain the attribute feature of each telephone number in the user call behavior data, comprises:
obtaining, according to the attribute information of all telephone numbers in the user call behavior data, N pending attributes of each telephone number in the user call behavior data, where N is an integer greater than or equal to 1;
screening the N pending attributes of each telephone number in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data.
In the above scheme, the attribute information of all telephone numbers in the user call behavior data includes at least one of the following:
user mark information, enterprise authentication information, enterprise yellow pages information, phone blacklist information, phone whitelist information.
In the above scheme, the attribute selection model includes at least one of the following: an intermediate-number phone model, a frequent-contact phone model.
In the above scheme, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, comprises:
when a preset prediction condition is met, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, where the preset prediction condition includes at least one of the following:
the number of telephone numbers that initiate new call requests is greater than or equal to a preset quantity threshold;
the time interval between the current time and the time at which the telephone number attributes were last updated is greater than or equal to a preset time threshold.
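The preset prediction condition above can be sketched as a simple trigger check. This is only an illustrative sketch: the function name, the parameter names and the choice to treat the two conditions as alternatives (either one triggers recognition) are assumptions, since the text only states that the condition includes at least one of them.

```python
def should_run_recognition(pending_count, now_s, last_update_s,
                           count_threshold, interval_threshold_s):
    """Return True when attribute recognition should be triggered.

    All names are hypothetical. Times are in seconds on any common clock.
    """
    # Condition 1: enough numbers initiating new call requests have accumulated.
    enough_numbers = pending_count >= count_threshold
    # Condition 2: enough time has passed since the attributes were last updated.
    stale = (now_s - last_update_s) >= interval_threshold_s
    return enough_numbers or stale
```

A deployment that requires both conditions would replace `or` with `and`; the text does not settle which combination is intended.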
An embodiment of the present invention also provides an apparatus for identifying a telephone number, the apparatus comprising a memory and a processor, wherein:
the memory is configured to store a computer program;
the processor is configured to execute the following steps when running the computer program:
obtaining user call behavior data and attribute information of all telephone numbers in the user call behavior data;
performing model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain a harassing call model;
performing data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain an attribute selection model, where the attribute selection model indicates the criteria for determining the attribute of a telephone number;
performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, to obtain the attribute of each telephone number that initiates a new call request.
In the above scheme, the processor is specifically configured to execute the following steps when running the computer program:
performing type prediction, according to the harassing call model, on the telephone numbers that initiate new call requests, to obtain the predicted type of each such telephone number, where the predicted type is harassing call or non-harassing call;
performing attribute recognition, according to the attribute selection model, on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain the attribute of each telephone number whose predicted type is harassing call.
In the above scheme, the processor is specifically configured to execute the following step when running the computer program:
selecting classification parameters from the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, and establishing the attribute selection model according to the classification parameters.
In the above scheme, the processor is specifically configured to execute the following steps when running the computer program:
classifying and organizing the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data, where the call behavior features include at least one of the following: historical outgoing call count, historical outgoing call duration, historical incoming call count, historical incoming call duration, answered count, unanswered count;
classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data, where the attribute feature is one of: harassing call, courier or food-delivery phone, enterprise phone, rejected phone, priority-answer phone, intermediate-number phone, or frequent-contact phone;
performing model training, using a machine learning algorithm, on the call behavior features and attribute feature of each telephone number in the user call behavior data, to obtain the harassing call model.
In the above scheme, the processor is specifically configured to execute the following steps when running the computer program:
obtaining, according to the attribute information of all telephone numbers in the user call behavior data, N pending attributes of each telephone number in the user call behavior data, where N is an integer greater than or equal to 1;
screening the N pending attributes of each telephone number in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data.
In the above scheme, the attribute information of all telephone numbers in the user call behavior data includes at least one of the following:
user mark information, enterprise authentication information, enterprise yellow pages information, phone blacklist information, phone whitelist information.
In the above scheme, the attribute selection model includes at least one of the following: an intermediate-number phone model, a frequent-contact phone model.
In the above scheme, the processor is specifically configured to execute the following step when running the computer program:
when a preset prediction condition is met, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, where the preset prediction condition includes at least one of the following:
the number of telephone numbers that initiate new call requests is greater than or equal to a preset quantity threshold;
the time interval between the current time and the time at which the telephone number attributes were last updated is greater than or equal to a preset time threshold.
An embodiment of the present invention also provides a computer-readable storage medium that stores a computer program; when the computer program is executed by at least one processor, it causes the at least one processor to execute the steps of any one of the above methods for identifying a telephone number.
An embodiment of the present invention provides a method for identifying a telephone number: obtaining user call behavior data and attribute information of all telephone numbers in the user call behavior data; performing model training on the user call behavior data and the attribute information using a machine learning algorithm to obtain a harassing call model; performing data analysis on the user call behavior data and the attribute information to obtain an attribute selection model, which indicates the criteria for determining the attribute of a telephone number; and performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, to obtain the attribute of each such telephone number. In this way, the embodiment applies machine learning model training and data analysis separately to several kinds of data, obtains a more reliable harassing call model and attribute selection model, and then uses both models to recognize the attribute of each telephone number, which improves the recognition accuracy for each telephone number.
Brief description of the drawings
Fig. 1 is a first flowchart of a method for identifying a telephone number provided by an embodiment of the present invention;
Fig. 2 is a second flowchart of a method for identifying a telephone number provided by an embodiment of the present invention;
Fig. 3 is a structural schematic diagram of an apparatus for identifying a telephone number provided by an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of an intelligent anti-harassment system provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention.
Embodiment one
An embodiment of the present invention provides a method for identifying a telephone number. As shown in Fig. 1, the method comprises:
Step S101: obtain user call behavior data and attribute information of all telephone numbers in the user call behavior data.
In practice, a user can initiate a call request from a mobile phone, a fixed line, an Internet phone or a fake base station. For all telephone numbers that initiate call requests, the corresponding user call behavior data can be obtained from the operator's servers. This includes obtaining the user call behavior data of all telephone numbers that initiated call requests within a preset time period, or obtaining the user call behavior data, within a preset time period, of all telephone numbers that initiate call requests. The user call behavior data of each telephone number that initiates a call request may include at least one of the following: all called numbers that the telephone number has called, all calling numbers that have called the telephone number, the call duration and call time for each number it has talked with, all telephone numbers that received short messages from the telephone number, all telephone numbers that sent short messages to the telephone number, the answered count, the unanswered count, and so on.
Next, for all telephone numbers in the user call behavior data, the attribute information is obtained by using a web crawler, by searching with a search engine, or by querying relevant databases. The attribute information may include at least one of the following: user mark information, enterprise authentication information, enterprise yellow pages information, phone blacklist information, phone whitelist information. For example, with the web crawler approach, all telephone numbers in the user call behavior data are fed into the crawler program, which then searches search engines such as www.baidu.com and www.so.com for attribute information related to the telephone numbers to be identified, and can also collect further user mark information from the World Wide Web.
Optionally, the obtained user call behavior data and the attribute information of all telephone numbers in the user call behavior data are stored in a distributed manner on a big data platform. For example, a big data platform built on the Hadoop distributed system can be used for high-speed computation and storage of massive data. Further, the distributed processing capability of the big data platform can also be used to execute steps S102 to S104.
Further, after the user call behavior data and the attribute information of all telephone numbers in the user call behavior data are obtained, the method may also include classifying and organizing the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data. The call behavior features may include at least one of the following: historical called number list, historical outgoing call count, historical outgoing call duration, historical incoming call count, historical incoming call duration, answered count, unanswered count. For example, from one day of user call behavior data within the preset time period, the call behavior features of each telephone number can be obtained, including at least one of the following: daily cumulative called number list, daily cumulative outgoing call count, daily cumulative average outgoing call duration, daily cumulative peer-number percentage, daily cumulative short-call percentage, daily cumulative incoming call count, daily cumulative average incoming call duration, daily cumulative count of peer numbers for incoming calls, daily cumulative count of roaming location changes, and so on.
The attribute information of all telephone numbers in the user call behavior data is likewise classified and organized to obtain the attribute feature of each telephone number in the user call behavior data. The attribute feature may be: harassing call, non-harassing call, courier or food-delivery phone, enterprise phone, rejected phone, priority-answer phone, intermediate-number phone, or frequent-contact phone.
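The aggregation of raw call records into per-number call behavior features can be sketched as follows. This is a minimal sketch, not the patent's implementation: the `CallRecord` fields, the feature key names and the decision to count answered and unanswered calls against the calling number are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CallRecord:
    caller: str      # number that initiated the call (hypothetical field)
    callee: str      # number that was called (hypothetical field)
    duration_s: int  # call duration in seconds, 0 if unanswered
    answered: bool   # whether the callee picked up


def _empty_features():
    return {
        "called_numbers": set(),
        "outgoing_count": 0,
        "outgoing_duration_s": 0,
        "incoming_count": 0,
        "incoming_duration_s": 0,
        "answered_count": 0,
        "unanswered_count": 0,
    }


def extract_call_features(records):
    """Aggregate per-number call behavior features from raw call records."""
    features = defaultdict(_empty_features)
    for r in records:
        out = features[r.caller]
        out["called_numbers"].add(r.callee)
        out["outgoing_count"] += 1
        out["outgoing_duration_s"] += r.duration_s
        # Assumption: answered/unanswered counts describe the caller's outgoing calls.
        if r.answered:
            out["answered_count"] += 1
        else:
            out["unanswered_count"] += 1
        inc = features[r.callee]
        inc["incoming_count"] += 1
        inc["incoming_duration_s"] += r.duration_s
    return dict(features)
```

Restricting `records` to a single day would give daily cumulative variants of the same features.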
Step S102: perform model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain a harassing call model.
In practice, the harassing call model is built with all call behavior features as classification parameters. According to the attribute feature of each telephone number in the user call behavior data, each telephone number is labeled as a harassing call or a non-harassing call. With the call behavior features of each telephone number as input, model training is performed using a supervised learning algorithm from among the machine learning algorithms, yielding the harassing call model.
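The supervised training step can be illustrated with a very small classifier. The source does not name a specific algorithm, so the nearest-centroid classifier below is a stand-in chosen only because it is short and dependency-free; the feature layout and the toy training values are invented for illustration and are not data from the patent.

```python
def train_nearest_centroid(samples, labels):
    """Return one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical features: [outgoing count, average outgoing duration (s), unanswered ratio]
train_x = [[200, 8, 0.7], [150, 12, 0.6], [5, 120, 0.1], [10, 90, 0.2]]
train_y = ["harassing", "harassing", "non-harassing", "non-harassing"]
harassing_call_model = train_nearest_centroid(train_x, train_y)
```

A call such as `predict(harassing_call_model, [180, 9, 0.8])` then yields the predicted type for a new number. The features here are unscaled, so the call count dominates the distance; a real system would normalize features and would likely use a stronger supervised learner.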
Step S103: perform data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain an attribute selection model, where the attribute selection model indicates the criteria for determining the attribute of a telephone number.
In practice, to improve the recognition accuracy for telephone numbers, in addition to obtaining the harassing call model through a machine learning algorithm, the characteristic features that distinguish all telephone numbers with a given attribute feature from the telephone numbers with other attribute features can also be used. Classification parameters are selected from the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, and an attribute selection model corresponding to that attribute feature is established according to the classification parameters. In this way, attribute selection models corresponding to several different attribute features are obtained. The attribute selection model may include at least one of the following: an intermediate-number phone model, a frequent-contact phone model, and so on. Each attribute selection model can uniquely identify all telephone numbers with a given attribute feature. The attribute selection model is used to further screen the recognition result of the harassing call model, which yields a more accurate recognition result.
For example, an intermediate number is based on the flexible binding of virtual numbers. After an O2O (Online To Offline) transaction order is generated, the O2O platform randomly assigns an intermediate number to both parties as a temporary number for their calls. The intermediate number is bound to the transaction order, and it is unbound and recycled after the transaction ends. This ensures that only the intermediate number is displayed to both parties during a call, which effectively protects the real telephone numbers of both parties. For example, when telephone number A calls telephone number B, the call goes out through intermediate number C, and only intermediate number C is displayed on the terminals of telephone numbers A and B. In the resulting call behavior data, the record of telephone number A calling intermediate number C and the record of intermediate number C calling telephone number B exist at the same time. In other words, an intermediate number has the characteristic features that its outgoing call count equals its incoming call count and its outgoing call duration equals its incoming call duration. An intermediate-number phone model can therefore be established whose classification parameters are the four call behavior features: outgoing call count, incoming call count, outgoing call duration and incoming call duration.
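The intermediate-number rule described above can be sketched as a check on the four classification parameters. The dictionary keys, the tolerance parameters and the guard against numbers with no outgoing calls are assumptions added for illustration; the source only states the two equalities.

```python
def is_intermediate_number(features, count_tol=0, duration_tol_s=0):
    """Check the characteristic features of an intermediate number.

    `features` is a dict with hypothetical keys: outgoing_count,
    incoming_count, outgoing_duration_s, incoming_duration_s.
    """
    # Assumption: a number with no outgoing calls trivially satisfies the
    # equalities when idle, so it is not treated as an intermediate number.
    if features["outgoing_count"] == 0:
        return False
    counts_match = abs(features["outgoing_count"]
                       - features["incoming_count"]) <= count_tol
    durations_match = abs(features["outgoing_duration_s"]
                          - features["incoming_duration_s"]) <= duration_tol_s
    return counts_match and durations_match
```

The tolerances default to exact equality, as in the text; nonzero values would allow for small discrepancies in operator records.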
For frequent contacts, the call behavior data of the telephone numbers of user 1 and user 2 are analyzed. When the telephone number of user 3 appears in the call behavior data of both user 1 and user 2, user 3 is a common contact of user 1 and user 2. The more common contacts user 1 and user 2 have, the higher their intimacy, and the more likely it is that user 1 and user 2 are frequent contacts of each other. In addition, the common contacts of user 1 and user 2 can also be obtained from their address-book friends, enterprise circle (enterprise directory), family circle (family network), and so on. For example, the called number lists of any two telephone numbers A and B in the user call behavior data are obtained. When the called number lists of A and B contain i identical called numbers, the intimacy value of telephone numbers A and B equals i, where i is an integer greater than or equal to 1. When the intimacy value of telephone numbers A and B is greater than a preset intimacy threshold, A and B are telephone numbers of two frequent contacts. A frequent-contact phone model can therefore be established whose classification parameters are the called number list and the intimacy value.
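The intimacy value can be computed as the size of the intersection of the two called number lists. The sketch below follows that description; the function names and the example numbers in the usage note are hypothetical.

```python
def intimacy_value(called_a, called_b):
    """Number of identical called numbers shared by two called number lists."""
    return len(set(called_a) & set(called_b))


def are_frequent_contacts(called_a, called_b, intimacy_threshold):
    """True when the intimacy value is greater than the preset threshold."""
    return intimacy_value(called_a, called_b) > intimacy_threshold
```

For instance, with called lists `["111", "222", "333"]` and `["222", "333", "444"]`, the two shared entries give an intimacy value of 2, so the pair counts as frequent contacts for a threshold of 1 but not for a threshold of 2.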
It should be noted that the embodiment of the present invention does not limit the execution order of step S102 and step S103. For example, step S102 can be executed before step S103 or after step S103, or the two can be executed at the same time.
Step S104: perform attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers that initiate new call requests, to obtain the attribute of each telephone number that initiates a new call request.
In practice, type prediction is performed, according to the harassing call model, on the telephone numbers that initiate new call requests, to obtain their predicted types, where the predicted type is harassing call or non-harassing call. Accordingly, for a telephone number whose predicted type is non-harassing call, its attribute is determined to be non-harassing call; for a telephone number whose predicted type is harassing call, its attribute is determined according to the attribute selection model.
According to the attribute selection model, attribute recognition is performed on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain their attributes. For example, the attribute selection model may include an intermediate-number model and a frequent-contact model. When at least one of the preset attribute determination conditions is met, the attribute of the telephone number whose predicted type is harassing call is determined to be non-harassing call; otherwise, its attribute is harassing call. The preset attribute determination conditions include: attribute recognition with the intermediate-number model identifies the telephone number as an intermediate number; attribute recognition with the frequent-contact model identifies the telephone number as a frequent-contact number.
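The two-stage decision in step S104 can be sketched as follows. Both models are represented as plain callables that take a telephone number and return a boolean; that interface, the function name and the label strings are assumptions made for illustration.

```python
def recognize_attribute(number, harassing_call_model, attribute_selection_models):
    """Two-stage attribute recognition for a number initiating a new call request.

    harassing_call_model: callable, True if the predicted type is harassing call.
    attribute_selection_models: iterable of callables (for example an
    intermediate-number model and a frequent-contact model), each returning
    True when the number matches that attribute.
    """
    # Stage 1: type prediction with the harassing call model.
    if not harassing_call_model(number):
        return "non-harassing call"
    # Stage 2: screen the suspected number with the attribute selection models.
    if any(model(number) for model in attribute_selection_models):
        return "non-harassing call"
    return "harassing call"
```

Because the screening models only run on numbers already predicted as harassing calls, they act purely as a filter that removes false positives from the first stage.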
It should be noted that steps S101 to S104 may be implemented on a big data platform that uses distributed processing and distributed storage.
It can be seen that, in this embodiment of the present invention, user call behavior data and the attribute information of all telephone numbers in the user call behavior data are obtained; model training is performed on the user call behavior data and the attribute information of all telephone numbers in it, using a machine learning algorithm, to obtain a harassing call model; data analysis is performed on the user call behavior data and the attribute information of all telephone numbers in it, to obtain an attribute selection model; and, according to the harassing call model and the attribute selection model, attribute recognition is performed on the telephone numbers that initiate new call requests, to obtain the attribute of each such telephone number. The attribute selection model is obtained by data analysis of the characteristic features of certain telephone numbers, and each attribute selection model can accurately identify all telephone numbers that have a particular attribute feature. Therefore, using the attribute selection model to further screen the recognition result of the harassing call model improves the recognition accuracy for each telephone number that initiates a new call request.
Embodiment two
To better illustrate the purpose of the present invention, a further example is given on the basis of the foregoing embodiment.
An embodiment of the present invention provides a method for identifying telephone numbers. As shown in Fig. 2, the method includes:
Step S201: classify and organize the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain the call behavior features and the attribute feature of each telephone number in the user call behavior data.
In actual implementation, the user call behavior data may be classified and organized to obtain the call behavior features of each telephone number in the user call behavior data; the attribute information of all telephone numbers in the user call behavior data is classified and organized to obtain the attribute feature of each telephone number in the user call behavior data.
Further, classifying and organizing the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number includes the following steps:
S2011: according to the attribute information of all telephone numbers in the user call behavior data, obtain N pending attributes of each telephone number in the user call behavior data, where N is an integer greater than or equal to 1. Each pending attribute may be: harassing call, courier or food-delivery call, enterprise call, rejected call, priority-answer call, intermediate number call, or frequent contact call.
Illustratively, the attribute information may include user label information, enterprise certification information, enterprise yellow page information, phone blacklist information, phone whitelist information, intermediate number information and frequent contact information.
In actual implementation, the user label information may be the labels that users attach to any telephone number in any APP (Application); from the user label information, the pending attribute of each labeled telephone number is obtained as harassing call, fraud call, advertising promotion call, or courier or food-delivery call. The enterprise certification information and enterprise yellow page information may be the enterprise information, including enterprise telephone numbers, that each enterprise adds in an enterprise certification management system or an enterprise yellow page management system; from this information, the pending attribute "enterprise call" is obtained for each enterprise telephone number. The phone blacklist information may be the telephone numbers that a user has set as not allowed to call the user's own number; from the phone blacklist, the pending attribute "rejected call" is obtained for each blacklisted number. The phone whitelist information may be the telephone numbers that a user has set as allowed to call the user's own number; from the phone whitelist, the pending attribute "priority-answer call" is obtained for each whitelisted number. An intermediate number is a temporary telephone number that is set for the two parties to a call and used as the caller identification of both parties; the pending attribute "intermediate number call" is obtained for each intermediate number from an APP that has an intermediate number function, or for any intermediate number found by analyzing the user call behavior data. Frequent contact means that when the number of identical telephone numbers in the contact lists of one telephone and another telephone is high, the two telephones are deemed to be frequent contact telephones; the pending attribute "frequent contact call" of any two telephones is obtained by analyzing the user call behavior data.
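The frequent contact rule described above can be illustrated with a short sketch. The text only says that the number of shared entries must be "high"; the parameter min_shared and its default value are assumptions made for illustration, as is the function name.

```python
def are_frequent_contacts(contacts_a, contacts_b, min_shared=3):
    """Decide whether two telephones are frequent contacts.

    contacts_a and contacts_b are the contact lists of the two telephones.
    min_shared is a hypothetical threshold: the source does not give a value.
    """
    shared = set(contacts_a) & set(contacts_b)  # numbers present in both lists
    return len(shared) >= min_shared
```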
S2012: screen the N pending attributes of each telephone number in the user call behavior data, to obtain the attribute feature of each telephone number in the user call behavior data.
Illustratively, the N pending attributes of each telephone number in the user call behavior data may be screened according to a preset rule. For example, the items of attribute information are ranked by confidence from high to low; the order from high to low may be: user label information, enterprise certification information, enterprise yellow page information, phone blacklist information, phone whitelist information, intermediate number, frequent contact. Then, according to this confidence order, the pending attribute corresponding to the attribute information with the highest confidence is selected from the N pending attributes of each telephone number in the user call behavior data.
Attribute features may be divided into harassing call and non-harassing call. The pending attributes classified under the attribute feature harassing call are: harassing call, fraud call, advertising promotion call and rejected call. The pending attributes classified under the attribute feature non-harassing call are: enterprise call, priority-answer call, intermediate number call and frequent contact call. Then, according to this division and the pending attribute corresponding to the highest-confidence attribute information of each telephone number in the user call behavior data, the attribute feature of each telephone number in the user call behavior data is obtained.
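The screening of step S2012 can be sketched as a lookup over the confidence order followed by the harassing or non-harassing division. This is a minimal sketch under stated assumptions: the source names (such as "user label") and the function attribute_feature are hypothetical, and a pending attribute that the text does not place in either group returns None here.

```python
# Attribute information sources, from highest to lowest confidence,
# following the example order given in the text.
CONFIDENCE_ORDER = [
    "user label",
    "enterprise certification",
    "enterprise yellow pages",
    "phone blacklist",
    "phone whitelist",
    "intermediate number",
    "frequent contact",
]

HARASSING_PENDING = {
    "harassing call", "fraud call", "advertising promotion call", "rejected call",
}
NON_HARASSING_PENDING = {
    "enterprise call", "priority-answer call",
    "intermediate number call", "frequent contact call",
}


def attribute_feature(pending):
    """Map the pending attributes of one number to its attribute feature.

    pending maps an attribute information source to the pending attribute
    obtained from that source (a hypothetical data layout).
    """
    for source in CONFIDENCE_ORDER:
        if source in pending:
            attr = pending[source]  # pending attribute of the most trusted source
            if attr in HARASSING_PENDING:
                return "harassing call"
            if attr in NON_HARASSING_PENDING:
                return "non-harassing call"
            return None  # not classified by the division in the text
    return None
```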
Further, when each pending attribute is classified as harassing call or non-harassing call, the number of occurrences of each pending attribute may also be counted. A pending attribute classified as harassing call is confirmed as harassing call only when its number of occurrences is not less than a preset count threshold; otherwise, the pending attribute is reclassified as non-harassing call.
Step S202: perform data cleaning on the call behavior features of all telephone numbers in the user call behavior data, to obtain the cleaned call behavior features of all telephone numbers in the user call behavior data.
In actual implementation, classifying a limited amount of user call behavior data in arbitrary ways can yield many call behavior features, but not all of them are useful for training a machine learning model. The obtained call behavior features therefore need data cleaning. Data cleaning refers to finding and correcting identifiable errors in data files, including checking data consistency and handling invalid values and missing values.
Illustratively, data cleaning of the call behavior features of all telephone numbers in the user call behavior data may include the following steps:
S2021: perform invalid column deletion on the call behavior features of all telephone numbers in the user call behavior data. Invalid column deletion mainly targets two kinds of data. The first is data of any type whose share of the whole historical data is very small; for example, in data with tens of thousands of rows, columns that have values in fewer than 1000 rows are deleted. The second is data in the historical data that is unrelated to call behavior features.
Illustratively, invalid column deletion on the call behavior features of all telephone numbers in the user call behavior data may include at least one of the following:
If the number of features contained in the call behavior features of any telephone number in the user call behavior data is less than a preset feature count threshold, delete that telephone number and its call behavior features, where the preset feature count threshold may be set according to the total number of distinct features in the call behavior features of all telephone numbers;
If the number of telephone numbers whose call behavior features contain a given feature is less than a preset number count threshold, delete that feature from the call behavior features of all telephone numbers;
Delete the "Yes" and "No" values in the call behavior features of all telephone numbers in the user call behavior data, where "Yes" and "No" are inferred from the user call behavior data.
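The first two deletion rules can be sketched as follows. This is an illustrative sketch only: the nested-dictionary data layout, the function name delete_invalid, and the order in which the two rules are applied are assumptions, since the text does not fix them.

```python
from collections import Counter


def delete_invalid(features, min_features, min_numbers):
    """Apply the two count-based deletion rules of step S2021.

    features maps a telephone number to a dict of feature name -> value
    (a hypothetical layout). min_features is the preset feature count
    threshold; min_numbers is the preset number count threshold.
    """
    # Rule 1: drop telephone numbers that carry too few features.
    kept = {n: dict(f) for n, f in features.items() if len(f) >= min_features}
    # Rule 2: drop features that appear for too few telephone numbers.
    counts = Counter(name for f in kept.values() for name in f)
    rare = {name for name, c in counts.items() if c < min_numbers}
    return {
        n: {k: v for k, v in f.items() if k not in rare}
        for n, f in kept.items()
    }
```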
S2022: perform null value processing on the call behavior features of all telephone numbers in the user call behavior data after invalid column deletion. Because the call behavior features of a telephone number do not necessarily include all features, some features of a given telephone number have no value; these features may therefore be assigned the value 0, indicating that there is no corresponding call behavior.
S2023: normalize the call behavior features of all telephone numbers after null value processing. Because the value range of some features in the call behavior features is too large and would strongly influence the classification result, the average value of each feature may be determined from the call behavior features of all telephone numbers after null value processing. When the average value of any feature is greater than a preset feature threshold, the values of that feature are normalized so that all of its values fall within a suitable numerical range, for example using L2-norm normalization.
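Steps S2022 and S2023 can be sketched together as follows. The sketch assumes that L2-norm normalization is applied per feature column (each value divided by the L2 norm of that feature over all telephone numbers); the function names, the data layout and the threshold parameter are hypothetical.

```python
import math


def fill_missing(features, names):
    """Step S2022: give every number a value for every feature, using 0
    where the feature is absent (no corresponding call behavior)."""
    return {n: [float(f.get(name, 0.0)) for name in names]
            for n, f in features.items()}


def l2_normalize_large_columns(rows, mean_threshold):
    """Step S2023: L2-normalize only the feature columns whose mean value
    exceeds the preset feature threshold."""
    numbers = list(rows)
    n_cols = len(rows[numbers[0]])
    out = {n: list(v) for n, v in rows.items()}
    for j in range(n_cols):
        column = [rows[n][j] for n in numbers]
        if sum(column) / len(column) > mean_threshold:
            norm = math.sqrt(sum(x * x for x in column))
            if norm > 0:  # guard against an all-zero column
                for n in numbers:
                    out[n][j] = rows[n][j] / norm
    return out
```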
Step S203: perform feature extraction on the cleaned call behavior features of all telephone numbers in the user call behavior data, to obtain the selected call behavior features of all telephone numbers in the user call behavior data.
In actual implementation, in order to learn the structure and nature of the harassing call identification problem from the cleaned call behavior features, feature extraction is performed on them to pick out the features that better explain the harassing call model. Feature extraction is usually based on two criteria. The first is whether a feature diverges: if a feature does not diverge, that is, the variance of its values over all telephone numbers is approximately 0, then all telephone numbers are essentially the same on this feature, and the feature is of little use for distinguishing telephone numbers. The second is the correlation between the feature and the target: features that are highly correlated with harassing calls should be selected preferentially.
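The first criterion (discarding features that do not diverge) can be illustrated with a simple variance filter. The function name, the column-oriented layout and the min_variance default are assumptions for illustration; the text only says the variance is "approximately 0".

```python
from statistics import pvariance


def drop_low_variance(columns, min_variance=1e-6):
    """Keep only features whose values vary across telephone numbers.

    columns maps a feature name to the list of its values over all
    telephone numbers. min_variance is a hypothetical cut-off.
    """
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > min_variance  # near-zero variance: no divergence
    }
```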
The methods of feature extraction include feature selection and dimensionality reduction; the purpose of both is to reduce the number of features in the feature data set as far as possible. Feature selection selects a subset from the original feature data set, that is, it screens out part of the features without changing the original feature space. It is broadly divided into three classes: (1) Filter methods, which score each feature by divergence or correlation, set a score threshold or the number of features to be selected, and select features accordingly; the main methods are the chi-squared test, information gain and correlation coefficient scores; (2) Wrapper methods, which form different combinations of several features, score the prediction effect of each combination according to an objective function, compare it with the other combinations, and select or exclude several features each time; the main method is the recursive feature elimination algorithm; (3) Embedded methods, which first train with a preset machine learning algorithm and model to obtain the weight coefficient of each feature, and then select features in descending order of weight coefficient; the main method is regularization.
Dimensionality reduction combines different features, using the relationships between features, to obtain new features; this changes the original feature space, and part of the new features are then selected. The main methods are: principal component analysis (Principal Component Analysis, PCA), linear discriminant analysis (Linear Discriminant Analysis, LDA), singular value decomposition (Singular Value Decomposition, SVD) and Sammon mapping (Sammon's Mapping).
Illustratively, the following methods may be applied in sequence to perform feature extraction on the cleaned call behavior features of all telephone numbers: (1) Chi-squared test: the chi-squared test measures the degree of deviation between the actual observed values of the statistical sample and the theoretical inferred values; this degree of deviation determines the size of the chi-squared value. The larger the chi-squared value, the less the corresponding sample favors data classification, and the sample is deleted; the smaller the chi-squared value, the more the corresponding sample favors data classification, and the sample is retained. (2) Recursive feature elimination: a base model is trained on the feature set to obtain the importance of each feature (for example, its weight coefficient), the least important features are eliminated, and the next round of training is performed on the new feature set, until the required number of features is reached. (3) Feature selection based on a tree model: GBDT (Gradient Boosting Decision Tree) is used as the base model to train on the feature set, and features are then selected according to the training result. (4) Linear discriminant analysis: linear discriminant analysis seeks a linear transformation that maximizes the ratio of the covariance matrix between different classes of data in the sample data to the covariance matrix between the data within the same class.
Step S204: perform model training, using a machine learning algorithm, on the selected call behavior features and the attribute features of all telephone numbers in the user call behavior data, to obtain the trained harassing call model.
In actual implementation, the main task may be to identify whether a telephone number is a harassing call; correspondingly, a harassing call model is obtained using a machine learning algorithm. The specific process is as follows: using a 2:8 division ratio, the selected call behavior features of all telephone numbers in the user call behavior data are divided into training data and test data; all features in the selected call behavior features are used as the classification parameters of the harassing call model; the harassing call model is trained with the training data, and the accuracy of the trained model is then verified with the test data. The harassing call model may use a random forest classifier.
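The data division and accuracy check can be sketched as below. The text gives a 2:8 ratio without stating which share is the training data; this sketch assumes 80% training and 20% test and exposes the fraction as a parameter. The function names are hypothetical, and any classifier (for example a random forest from a machine learning library) could be passed in as model.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into training and test data.

    train_fraction=0.8 is an assumption about the 2:8 division ratio.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, test_samples):
    """Share of test samples (features, label) that model labels correctly."""
    correct = sum(1 for features, label in test_samples if model(features) == label)
    return correct / len(test_samples)
```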
Further, when the trained harassing call model does not meet a preset accuracy threshold, the trained harassing call model is adjusted to obtain an optimal harassing call model. For example, using k-fold cross-validation, the selected call behavior features of all telephone numbers in the user call behavior data are fully used to test the trained harassing call model and obtain the optimal harassing call model.
Specifically, the k-fold cross-validation method includes the following steps. The classification parameters, or the weights of the classification parameters, in the trained harassing call model are changed to obtain m candidate harassing call models whose classification parameters or weights differ from one another, where m is an integer greater than or equal to 1. The selected call behavior features of all telephone numbers in the user call behavior data are taken as data set S, and S is divided into k disjoint subsets, where k is an integer (at least 2, since one subset is held out for testing and the remaining k-1 are used for training). The following procedure is executed for each candidate harassing call model: each time, one of the k subsets is taken, without repetition, as the test set, and the other k-1 subsets are used as the training set to train the model; the recognition accuracy of the candidate harassing call model on the test set is then calculated; the k recognition accuracies are averaged, and the average is taken as the true recognition accuracy of the candidate harassing call model. The candidate with the highest true recognition accuracy among the m candidate harassing call models is selected as the optimal harassing call model.
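The selection procedure above can be sketched as follows. This is a minimal sketch, not the embodiment's implementation: each candidate is represented by a hypothetical training function that takes training samples and returns a predictor, the folds are formed by simple striding rather than random partitioning, and k is assumed not to exceed the number of samples.

```python
def k_fold_accuracy(train, samples, k):
    """Average accuracy of one candidate over k disjoint folds.

    train(training_samples) must return a predictor: features -> label.
    samples is a list of (features, label) pairs with len(samples) >= k.
    """
    folds = [samples[i::k] for i in range(k)]  # k disjoint subsets of S
    scores = []
    for i, test_fold in enumerate(folds):
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train(training)  # train on the other k-1 subsets
        correct = sum(1 for x, y in test_fold if model(x) == y)
        scores.append(correct / len(test_fold))
    return sum(scores) / k


def select_best(candidates, samples, k):
    """Return the name of the candidate with the highest average accuracy.

    candidates maps a name to a training function (hypothetical layout).
    """
    return max(candidates, key=lambda name: k_fold_accuracy(candidates[name], samples, k))
```

Averaging over all k folds means every sample is used for testing exactly once, which is what lets the limited data set be "fully used" as the text describes.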
Step S205: perform data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain an attribute selection model, where the attribute selection model is used to indicate the criteria for determining the attribute of a telephone number.
The implementation of this step is the same as that of step S103 and is not repeated here.
It should be noted that this embodiment of the present invention does not limit the execution order of steps S202 to S204 relative to step S205. For example, steps S202 to S204 may be executed before step S205 or after step S205, or the two may be executed at the same time.
Step S206: according to the trained harassing call model and the attribute selection model, perform attribute recognition on the telephone numbers that initiate new call requests, and obtain the attribute of each telephone number that initiates a new call request.
In actual implementation, when a preset prediction condition is met, attribute recognition is performed on the telephone numbers that initiate new call requests according to the trained harassing call model and the attribute selection model, where the preset prediction condition includes at least one of the following: the number of telephone numbers that initiate new call requests is greater than or equal to a preset quantity threshold; the time interval from the last update of the telephone number attributes to the current time is greater than or equal to a preset time threshold.
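The preset prediction condition amounts to a batch trigger: recognition runs when enough new numbers have accumulated or enough time has passed. A minimal sketch, in which the function name and all parameter names are hypothetical and times are plain numbers in seconds:

```python
def should_run_recognition(pending_count, now, last_update,
                           count_threshold, interval_threshold):
    """Return True when at least one preset prediction condition is met.

    pending_count: telephone numbers that initiated new call requests.
    now, last_update: timestamps in seconds (e.g. from time.time()).
    """
    enough_numbers = pending_count >= count_threshold
    enough_time = (now - last_update) >= interval_threshold
    return enough_numbers or enough_time
```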
It should be noted that steps S201 to S206 may be implemented on a big data platform that uses distributed processing and distributed storage.
It can be seen that, in this embodiment of the present invention, the user call behavior data and the attribute information of all telephone numbers in the user call behavior data are classified and organized to obtain the call behavior features and attribute feature of each telephone number in the user call behavior data; data cleaning and feature extraction are performed in sequence on the call behavior features of all telephone numbers to obtain the selected call behavior features; model training is performed on the selected call behavior features and the attribute features of all telephone numbers to obtain the trained harassing call model; data analysis is performed on the user call behavior data and the attribute information of all telephone numbers in it to obtain the attribute selection model; and, according to the trained harassing call model and the attribute selection model, attribute recognition is performed on the telephone numbers that initiate new call requests to obtain the attribute of each such telephone number. In this process, several kinds of data are classified and organized, cleaned and subjected to feature extraction, and the resulting selected call behavior features are used for model training, which yields a harassing call model with higher recognition accuracy. The attribute selection model is then obtained by data analysis of the characteristic features of certain telephone numbers. Identifying harassing calls with both the harassing call model and the attribute selection model improves the recognition accuracy for telephone numbers.
Embodiment three
To better illustrate the purpose of the present invention, a further example is given on the basis of the foregoing method embodiments.
An embodiment of the present invention provides a device for identifying telephone numbers. As shown in Fig. 3, the device 300 for identifying telephone numbers includes a memory 301 and a processor 302, where
The memory 301 is used to store a computer program;
The processor 302 is used to execute the following steps when running the computer program stored in the memory 301:
Obtain user call behavior data and the attribute information of all telephone numbers in the user call behavior data;
Perform model training, using a machine learning algorithm, on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain a harassing call model;
Perform data analysis on the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, to obtain an attribute selection model, where the attribute selection model is used to indicate the criteria for determining the attribute of a telephone number;
According to the harassing call model and the attribute selection model, perform attribute recognition on the telephone numbers that initiate new call requests, and obtain the recognized attribute of each telephone number that initiates a new call request.
In the above solution, the processor 302 is specifically used to execute the following steps when running the computer program stored in the memory 301:
According to the harassing call model, perform type prediction on the telephone numbers that initiate new call requests, to obtain the predicted type of each such telephone number, where the predicted type is harassing call or non-harassing call. Correspondingly, for a telephone number whose predicted type is non-harassing call, determine that its attribute is non-harassing call; for a telephone number whose predicted type is harassing call, determine its attribute according to the attribute selection model;
According to the attribute selection model, perform attribute recognition on each telephone number whose predicted type is harassing call, to obtain the attribute of that telephone number. Illustratively, the attribute selection model may include an intermediate number model and a frequent contact model. When at least one of the preset attribute determination conditions is met, the attribute of the telephone number whose predicted type is harassing call is determined to be non-harassing call; otherwise, its attribute is harassing call. The preset attribute determination conditions include: attribute recognition with the intermediate number model finds that the telephone number is an intermediate number; attribute recognition with the frequent contact model finds that the telephone number is a frequent contact telephone number.
In the above solution, the processor 302 is specifically used to execute the following steps when running the computer program stored in the memory 301: select classification parameters from the user call behavior data and the attribute information of all telephone numbers in the user call behavior data, and establish the attribute selection model according to the classification parameters, where the attribute selection model may include at least one of the following: an intermediate number model, a frequent contact model, and so on. Each attribute selection model can uniquely identify all telephone numbers that have a particular attribute feature.
In the above solution, the processor 302 is specifically used to execute the following steps when running the computer program stored in the memory 301: classify and organize the user call behavior data to obtain the call behavior features of each telephone number in the user call behavior data, where the call behavior features may include at least one of the following: historical called number list, historical number of outgoing calls, historical outgoing call duration, historical number of incoming calls, historical incoming call duration, number of answered calls, number of unanswered calls;
Classify and organize the attribute information of all telephone numbers in the user call behavior data to obtain the attribute feature of each telephone number in the user call behavior data, where the attribute feature may be: harassing call, non-harassing call, courier or food-delivery call, enterprise call, rejected call, priority-answer call, intermediate number call, or frequent contact call;
Perform model training, using a machine learning algorithm, on the call behavior features and attribute feature of each telephone number in the user call behavior data, to obtain the harassing call model.
In the above solution, the attribute information of all telephone numbers in the user call behavior data includes at least one of the following: user label information, enterprise certification information, enterprise yellow page information, phone blacklist information, phone whitelist information;
The attribute selection model includes at least one of the following: an intermediate number model, a frequent contact model.
In the above solution, the processor 302 is specifically used to execute the following steps when running the computer program stored in the memory 301:
When a preset prediction condition is met, perform attribute recognition on the telephone numbers that initiate new call requests according to the harassing call model and the attribute selection model, where the preset prediction condition includes at least one of the following:
The number of telephone numbers that initiate new call requests is greater than or equal to a preset quantity threshold;
The time interval from the last update of the predicted type and attribute of telephone numbers to the current time is greater than or equal to a preset time threshold.
Illustratively, the device for identifying telephone numbers may be a big data platform with a distributed architecture. The device for identifying telephone numbers, a mobile core network and a business platform may together form an intelligent anti-harassment system, whose structure is shown schematically in Fig. 4. The mobile core network may include a mobile switching center MSC (Mobile Switching Center), a business monitoring platform SCP (Business Monitoring Platform) and a home location register HLR (Home Location Register). The mobile switching center is used to receive the call request of a telephone number and send notification signaling to the business monitoring platform. The business monitoring platform is used to send, on receiving the notification signaling, a harassing call identification request carrying the telephone number to the device for identifying telephone numbers;
The device for identifying telephone numbers is used to identify the telephone number on receiving the identification request carrying the telephone number, and to return the recognition result to the business monitoring platform.
Further, the business monitoring platform is specifically used to do the following. When the recognition result is that the telephone number belongs to harassing calls and the called terminal to which the called telephone number belongs has enabled the interception service, notify the mobile switching center to block the call from the telephone number, and send the interception status to the business platform, so that the business platform sends a short message notifying the called terminal of the interception result. When the recognition result is that the calling number belongs to harassing calls and the called terminal to which the called telephone number belongs has enabled the reminder service, notify the mobile switching center to let the call from the telephone number through; the called terminal may be notified by flash SMS that the telephone number is a harassing call. When the recognition result is that the telephone number belongs to non-harassing calls, notify the mobile switching center to let the call from the telephone number through.
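The handling logic of the business monitoring platform can be sketched as a small decision function. The action strings and the function name are hypothetical. The text does not say what happens when a harassing number calls a terminal that has enabled neither service, or which service takes precedence if both are enabled; this sketch assumes the call is let through in the first case and that interception takes precedence in the second.

```python
def handle_call(is_harassing, interception_enabled, reminder_enabled):
    """Return the list of actions for one call, following the rules above."""
    if not is_harassing:
        return ["let call through"]
    if interception_enabled:
        # Block the call and report the interception status so that the
        # business platform can send a short message to the called terminal.
        return ["block call", "send interception status to business platform"]
    if reminder_enabled:
        # Let the call through and warn the called terminal by flash SMS.
        return ["let call through", "flash SMS: harassing call"]
    # Assumption: not covered by the text; the call is let through.
    return ["let call through"]
```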
Embodiment four
Based on the same technical concept as the foregoing embodiments, this embodiment of the present invention provides a computer-readable storage medium, which can be applied in a device. The technical solutions of the foregoing embodiments, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a computer-readable storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
Specifically, the computer program instructions corresponding to the method for identifying a telephone number in this embodiment can be stored on a storage medium such as an optical disc, a hard disk or a USB flash drive. When the computer program instructions in the storage medium that correspond to the method for identifying a telephone number are read or executed by an electronic device, at least one processor is caused to execute the steps of the method for identifying a telephone number according to any of the foregoing embodiments of the present invention.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the advantages or disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above specific embodiments. The above specific embodiments are only illustrative, not restrictive. Under the inspiration of the present invention, those of ordinary skill in the art can devise many other forms without departing from the purpose of the present invention and the scope protected by the claims, all of which fall within the protection of the present invention.
Claims (17)
1. A method for identifying a telephone number, characterized in that the method comprises:
obtaining user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data;
performing model training, using a machine learning algorithm, on the user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data, to obtain a harassing call model;
performing data analysis on the user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data, to obtain an attribute selection model, wherein the attribute selection model indicates the criteria for determining the attribute of a telephone number;
performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers initiating new call requests, to obtain the attribute of each telephone number initiating a new call request.
2. The method according to claim 1, characterized in that performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers initiating new call requests, to obtain the attribute of each telephone number initiating a new call request, comprises:
performing type prediction, according to the harassing call model, on the telephone numbers initiating new call requests, to obtain the predicted type of each telephone number initiating a new call request, wherein the predicted type is harassing call or non-harassing call;
performing attribute recognition, according to the attribute selection model, on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain the attribute of each telephone number whose predicted type is harassing call.
3. The method according to claim 1, characterized in that performing data analysis on the user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data, to obtain an attribute selection model, comprises:
selecting selection parameters from the user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data, and establishing the attribute selection model according to the selection parameters.
4. The method according to claim 1, characterized in that performing model training, using a machine learning algorithm, on the user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data, to obtain a harassing call model, comprises:
classifying and organizing the user's communication behavioral data to obtain the call behavior features of each telephone number in the user's communication behavioral data, wherein the call behavior features include at least one of the following: historical number of calls made, historical calling duration, historical number of calls received, historical called duration, number of answered calls, number of unanswered calls;
classifying and organizing the attribute information of all telephone numbers in the user's communication behavioral data to obtain the attribute feature of each telephone number in the user's communication behavioral data, wherein the attribute feature is: harassing call, express or food delivery call, enterprise call, rejected call, preferentially answered call, intermediate-number call or frequent-contact call;
performing model training, using a machine learning algorithm, on the call behavior features and the attribute feature of each telephone number in the user's communication behavioral data, to obtain the harassing call model.
5. The method according to claim 4, characterized in that classifying and organizing the attribute information of all telephone numbers in the user's communication behavioral data to obtain the attribute feature of each telephone number in the user's communication behavioral data comprises:
obtaining, according to the attribute information of all telephone numbers in the user's communication behavioral data, N candidate attributes of each telephone number in the user's communication behavioral data, wherein N is an integer greater than or equal to 1;
screening the N candidate attributes of each telephone number in the user's communication behavioral data to obtain the attribute feature of each telephone number in the user's communication behavioral data.
6. The method according to claim 1, characterized in that the attribute information of all telephone numbers in the user's communication behavioral data includes at least one of the following:
user marking information, enterprise authentication information, enterprise yellow pages information, telephone blacklist information, telephone whitelist information.
7. The method according to claim 1, characterized in that the attribute selection model includes at least one of the following: an intermediate-number phone model, a frequent-contact phone model.
8. The method according to claim 1, characterized in that performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers initiating new call requests comprises:
when a preset prediction condition is met, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers initiating new call requests, wherein the preset prediction condition includes at least one of the following:
the number of telephone numbers initiating new call requests is greater than or equal to a preset quantity threshold;
the time interval from the last update of telephone number attributes to the current time is greater than or equal to a preset time threshold.
9. A device for identifying a telephone number, characterized in that the device comprises a memory and a processor, wherein:
the memory is used to store a computer program;
the processor is used to execute the following steps when running the computer program:
obtaining user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data;
performing model training, using a machine learning algorithm, on the user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data, to obtain a harassing call model;
performing data analysis on the user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data, to obtain an attribute selection model, wherein the attribute selection model indicates the criteria for determining the attribute of a telephone number;
performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers initiating new call requests, to obtain the attribute of each telephone number initiating a new call request.
10. The device according to claim 9, characterized in that the processor is specifically used to execute the following steps when running the computer program:
performing type prediction, according to the harassing call model, on the telephone numbers initiating new call requests, to obtain the predicted type of each telephone number initiating a new call request, wherein the predicted type is harassing call or non-harassing call;
performing attribute recognition, according to the attribute selection model, on those telephone numbers initiating new call requests whose predicted type is harassing call, to obtain the attribute of each telephone number whose predicted type is harassing call.
11. The device according to claim 9, characterized in that the processor is specifically used to execute the following steps when running the computer program:
selecting selection parameters from the user's communication behavioral data and the attribute information of all telephone numbers in the user's communication behavioral data, and establishing the attribute selection model according to the selection parameters.
12. The device according to claim 9, characterized in that the processor is specifically used to execute the following steps when running the computer program:
classifying and organizing the user's communication behavioral data to obtain the call behavior features of each telephone number in the user's communication behavioral data, wherein the call behavior features include at least one of the following: historical number of calls made, historical calling duration, historical number of calls received, historical called duration, number of answered calls, number of unanswered calls;
classifying and organizing the attribute information of all telephone numbers in the user's communication behavioral data to obtain the attribute feature of each telephone number in the user's communication behavioral data, wherein the attribute feature is: harassing call, express or food delivery call, enterprise call, rejected call, preferentially answered call, intermediate-number call or frequent-contact call;
performing model training, using a machine learning algorithm, on the call behavior features and the attribute feature of each telephone number in the user's communication behavioral data, to obtain the harassing call model.
13. The device according to claim 12, characterized in that the processor is specifically used to execute the following steps when running the computer program:
obtaining, according to the attribute information of all telephone numbers in the user's communication behavioral data, N candidate attributes of each telephone number in the user's communication behavioral data, wherein N is an integer greater than or equal to 1;
screening the N candidate attributes of each telephone number in the user's communication behavioral data to obtain the attribute feature of each telephone number in the user's communication behavioral data.
14. The device according to claim 9, characterized in that the attribute information of all telephone numbers in the user's communication behavioral data includes at least one of the following:
user marking information, enterprise authentication information, enterprise yellow pages information, telephone blacklist information, telephone whitelist information.
15. The device according to claim 9, characterized in that the attribute selection model includes at least one of the following: an intermediate-number phone model, a frequent-contact phone model.
16. The device according to claim 9, characterized in that the processor is specifically used to execute the following steps when running the computer program:
when a preset prediction condition is met, performing attribute recognition, according to the harassing call model and the attribute selection model, on the telephone numbers initiating new call requests, wherein the preset prediction condition includes at least one of the following:
the number of telephone numbers initiating new call requests is greater than or equal to a preset quantity threshold;
the time interval from the last update of telephone number attributes to the current time is greater than or equal to a preset time threshold.
17. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program; when the computer program is executed by at least one processor, the at least one processor is caused to execute the steps of the method according to any one of claims 1 to 8.
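As an informal illustration of how the steps recited in claims 2, 4 and 8 could fit together, the following Python sketch aggregates per-number call behavior features, checks the prediction trigger, and then applies the two models in sequence. It is a sketch under stated assumptions, not the claimed implementation: the record layout (`CallRecord`), all function and key names, and the use of plain callables in place of the trained harassing call model and the attribute selection model are hypothetical. Counting answered and unanswered calls for the calling number, treating either trigger condition as sufficient, and labelling non-harassing numbers with a fixed string are also assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Features = Dict[str, int]


@dataclass
class CallRecord:
    """One row of call behavior data (hypothetical layout)."""
    caller: str
    callee: str
    duration_s: int  # 0 when the call was not answered
    answered: bool


def _empty() -> Features:
    return {"calls_made": 0, "calling_duration_s": 0,
            "calls_received": 0, "called_duration_s": 0,
            "answered": 0, "unanswered": 0}


def call_behavior_features(records: Iterable[CallRecord]) -> Dict[str, Features]:
    """Aggregate the six call behavior features of claim 4 per telephone number."""
    feats: Dict[str, Features] = defaultdict(_empty)
    for r in records:
        out, inc = feats[r.caller], feats[r.callee]
        out["calls_made"] += 1
        out["calling_duration_s"] += r.duration_s
        inc["calls_received"] += 1
        inc["called_duration_s"] += r.duration_s
        # Assumption: answered/unanswered are counted for the calling number.
        out["answered" if r.answered else "unanswered"] += 1
    return dict(feats)


def should_run_prediction(pending: int, now: float, last_update: float,
                          count_threshold: int, interval_threshold: float) -> bool:
    """Claim 8 trigger; assumes meeting either condition is enough."""
    return pending >= count_threshold or (now - last_update) >= interval_threshold


def identify_attributes(numbers: List[str],
                        features: Dict[str, Features],
                        is_harassing: Callable[[Features], bool],
                        select_attribute: Callable[[Features], str]) -> Dict[str, str]:
    """Claim 2: predict the type first, then refine only predicted harassing numbers."""
    result: Dict[str, str] = {}
    for n in numbers:
        f = features.get(n, _empty())
        # Assumption: non-harassing numbers simply keep that label.
        result[n] = select_attribute(f) if is_harassing(f) else "non-harassing call"
    return result
```

Passing the two models in as callables keeps the sketch independent of any particular machine learning library; a trained classifier and a rule set derived from the selection parameters would take their place in practice.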
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810372550.4A CN110401779B (en) | 2018-04-24 | 2018-04-24 | Method and device for identifying telephone number and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110401779A (en) | 2019-11-01 |
CN110401779B (en) | 2022-02-01 |
Family
ID=68320143
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810372550.4A Active CN110401779B (en) | 2018-04-24 | 2018-04-24 | Method and device for identifying telephone number and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110401779B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104023109A (en) * | 2014-06-27 | 2014-09-03 | 深圳市中兴移动通信有限公司 | Incoming call prompt method and device as well as incoming call classifying method and device |
CN107306306A (en) * | 2016-04-25 | 2017-10-31 | 腾讯科技(深圳)有限公司 | Communicating number processing method and processing device |
CN107517463A (en) * | 2016-06-15 | 2017-12-26 | 中国移动通信集团浙江有限公司 | A kind of recognition methods of telephone number and device |
CN106686261A (en) * | 2017-01-19 | 2017-05-17 | 腾讯科技(深圳)有限公司 | Information processing method and system |
CN107273531A (en) * | 2017-06-28 | 2017-10-20 | 百度在线网络技术(北京)有限公司 | Telephone number classifying identification method, device, equipment and storage medium |
CN107331385A (en) * | 2017-07-07 | 2017-11-07 | 重庆邮电大学 | A kind of identification of harassing call and hold-up interception method |
CN107835496A (en) * | 2017-11-24 | 2018-03-23 | 北京奇虎科技有限公司 | A kind of recognition methods of refuse messages, device and server |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113037699B (en) * | 2019-12-25 | 2022-11-29 | 中国电信股份有限公司 | Communication interception method, device and computer readable storage medium |
CN113037699A (en) * | 2019-12-25 | 2021-06-25 | 中国电信股份有限公司 | Communication interception method, apparatus and computer readable storage medium |
CN111144336A (en) * | 2019-12-30 | 2020-05-12 | 贵州近邻宝科技有限公司 | Automatic identification method for mobile phone number and invoice number of addressee facing to express bill |
CN113315874A (en) * | 2020-02-26 | 2021-08-27 | 卡巴斯基实验室股份制公司 | System and method for call classification |
CN113315874B (en) * | 2020-02-26 | 2024-03-19 | 卡巴斯基实验室股份制公司 | System and method for call classification |
CN113452845B (en) * | 2020-03-26 | 2024-03-19 | 中国移动通信集团福建有限公司 | Method for identifying abnormal telephone number and electronic equipment |
CN113452845A (en) * | 2020-03-26 | 2021-09-28 | 中国移动通信集团福建有限公司 | Method and electronic equipment for identifying abnormal telephone number |
CN111432078B (en) * | 2020-03-27 | 2021-09-10 | 中国—东盟信息港股份有限公司 | System for judging code number abnormity |
CN111432078A (en) * | 2020-03-27 | 2020-07-17 | 中国—东盟信息港股份有限公司 | System for judging code number abnormity |
CN111465021A (en) * | 2020-04-01 | 2020-07-28 | 北京中亦安图科技股份有限公司 | Graph-based crank call identification model construction method |
CN111465021B (en) * | 2020-04-01 | 2023-06-09 | 北京中亦安图科技股份有限公司 | Graph-based crank call identification model construction method |
CN113935758A (en) * | 2020-07-14 | 2022-01-14 | 中国移动通信集团广东有限公司 | Training method and device of random forest model for predicting handling probability of broadband service |
CN114189585A (en) * | 2020-09-14 | 2022-03-15 | 中国移动通信集团重庆有限公司 | Crank call abnormity detection method and device and computing equipment |
CN112261654B (en) * | 2020-09-23 | 2021-08-03 | 中国地质大学(武汉) | Method and system for generating mobile phone number white list in telecommunication anti-fraud process |
CN112261654A (en) * | 2020-09-23 | 2021-01-22 | 中国地质大学(武汉) | Method and system for generating mobile phone number white list in telecommunication anti-fraud process |
CN112417311A (en) * | 2020-10-29 | 2021-02-26 | 上海淇玥信息技术有限公司 | Method and device for executing service based on influence factor and electronic equipment |
CN113301210B (en) * | 2021-04-16 | 2023-05-23 | 珠海高凌信息科技股份有限公司 | Method and device for preventing harassment call based on neural network and electronic equipment |
CN113301210A (en) * | 2021-04-16 | 2021-08-24 | 珠海高凌信息科技股份有限公司 | Method and device for preventing harassing call based on neural network and electronic equipment |
CN113286035B (en) * | 2021-05-14 | 2022-12-30 | 国家计算机网络与信息安全管理中心 | Abnormal call detection method, device, equipment and medium |
CN113286035A (en) * | 2021-05-14 | 2021-08-20 | 国家计算机网络与信息安全管理中心 | Abnormal call detection method, device, equipment and medium |
CN113905134A (en) * | 2021-10-21 | 2022-01-07 | 中国联合网络通信集团有限公司 | Address list blacklist management method, system, equipment and medium based on block chain |
CN113992798A (en) * | 2021-10-26 | 2022-01-28 | 中国联合网络通信集团有限公司 | Telephone identification method, device, equipment and readable storage medium |
CN114125155A (en) * | 2021-11-15 | 2022-03-01 | 天津市国瑞数码安全系统股份有限公司 | Crank call detection method and system based on big data analysis |
Also Published As
Publication number | Publication date |
---|---|
CN110401779B (en) | 2022-02-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110401779A (en) | A kind of method, apparatus and computer readable storage medium identifying telephone number | |
Umayaparvathi et al. | A survey on customer churn prediction in telecom industry: Datasets, methods and metrics | |
US9503571B2 (en) | Systems, methods, and media for determining fraud patterns and creating fraud behavioral models | |
CN109600752A (en) | A kind of method and apparatus of depth cluster swindle detection | |
CN107248082B (en) | Card maintenance identification method and device | |
CN109688275A (en) | Harassing call recognition methods, device and storage medium | |
CN108462785B (en) | Method and device for processing malicious call | |
CN107517463A (en) | A kind of recognition methods of telephone number and device | |
CN108449482A (en) | The method and system of Number Reorganization | |
CN108093405A (en) | A kind of fraudulent call number analysis method and apparatus | |
CN110533085A (en) | With people's recognition methods and device, storage medium, computer equipment | |
CN110519264A (en) | Method, device and equipment for tracing attack event | |
CN107368856A (en) | Clustering method and device, the computer installation and readable storage medium storing program for executing of Malware | |
CN108629379A (en) | A kind of individual's reference appraisal procedure and system | |
CN107153584A (en) | Method for detecting abnormality and device | |
CN110020099A (en) | A kind of the user's recommended method and device of video friend-making | |
CN110611655B (en) | Blacklist screening method and related product | |
KR20170006158A (en) | System and method for detecting fraud usage of message | |
CN113923048A (en) | Network attack behavior identification method, device, equipment and storage medium | |
CN106162586A (en) | Method for limiting incoming call, device and system | |
CN108777749A (en) | A kind of fraudulent call recognition methods and device | |
CN113065748A (en) | Business risk assessment method, device, equipment and storage medium | |
CN111898931A (en) | Variable-based strategic wind control engine implementation method and device and computer equipment | |
CN109587248A (en) | User identification method, device, server and storage medium | |
CN110210884A (en) | Determine the method, apparatus, computer equipment and storage medium of user characteristic data |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |