CN108449482A - The method and system of Number Reorganization - Google Patents


Info

Publication number
CN108449482A
CN108449482A CN201810135848.3A CN201810135848A CN108449482A CN 108449482 A CN108449482 A CN 108449482A CN 201810135848 A CN201810135848 A CN 201810135848A CN 108449482 A CN108449482 A CN 108449482A
Authority
CN
China
Prior art keywords
telephone number
score value
label
threshold
machine learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810135848.3A
Other languages
Chinese (zh)
Inventor
王波
杨帆
杨优
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Teddy Bear Mobile Technology Co Ltd
Original Assignee
Beijing Teddy Bear Mobile Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Teddy Bear Mobile Technology Co Ltd
Priority to CN201810135848.3A
Publication of CN108449482A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/66 Substation equipment, e.g. for use by subscribers with means for preventing unauthorised or fraudulent calling
    • H04M1/667 Preventing unauthorised calls from a telephone set
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Arrangements for supervision, monitoring or testing
    • H04M3/2281 Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Technology Law (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a method and system for number identification. The method includes: obtaining characteristic information of a telephone number; determining, based on a number machine learning model, a label score of the telephone number from the characteristic information, where the label score characterizes the probability that the telephone number is labeled correctly; judging whether the label score is greater than or equal to a first threshold; and, if the label score is greater than or equal to the first threshold, outputting the label information of the telephone number. The invention addresses the technical problem in the prior art that number identification implemented through third-party applications and active labeling by users leads to poor data quality.

Description

Method and system for number identification
Technical field
The present invention relates to the field of communications, and in particular to a method and system for number identification.
Background art
In recent years, with the spread of smartphone technology, mobile phones have come to play an irreplaceable role in people's lives and work. Telephone fraud and SMS fraud cause huge property losses to users. New regulations recently issued by the Ministry of Industry and Information Technology to combat overseas telecom fraud state explicitly that, before the end of the year, all overseas number-spoofing calls impersonating public security organs are to be intercepted, and that measures for problems such as undisplayed caller numbers are to be studied and resolved. Because perpetrators usually hide overseas, ordinary users cannot guard against being defrauded, and the relevant authorities also find such cases difficult to solve. In the first half of 2016 alone, 4.7 billion number-spoofing fraud calls were placed from overseas into the country, and the annual economic loss caused by telecom fraud carried out through overseas number-spoofing calls has exceeded 10 billion yuan. Solving problems such as telecom fraud and SMS fraud effectively is therefore urgent.
At present, in the field of telephone fraud identification, existing solutions are implemented through third-party apps, security guard software (for example, Tencent security guard) and the like. Because the amount of number data is limited and support for multiple platforms and multiple fields is incomplete, these solutions considerably hurt the recognition rate, an important indicator in data evaluation. Second, because they rely on users to actively label telephone numbers, the generation of label data is somewhat random and cannot rely on effective technical means, so the quality of the labeled data is hard to guarantee. In addition, the need to install a third-party app greatly raises the barrier for users, so the user conversion rate is low; relying on active labeling by users degrades the user experience, and information overload brings additional annoyance and leaves a poor reputation.
For the above problem in the prior art, namely that number identification implemented through third-party applications and active labeling by users leads to poor data quality, no effective solution has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a method and system for number identification, to at least solve the technical problem in the prior art that number identification implemented through third-party applications and active labeling by users leads to poor data quality.
According to one aspect of the embodiments of the present invention, a method for number identification is provided, including: obtaining characteristic information of a telephone number; determining, based on a number machine learning model, a label score of the telephone number from the characteristic information, where the label score characterizes the probability that the telephone number is labeled correctly; judging whether the label score is greater than or equal to a first threshold; and, if the label score is greater than or equal to the first threshold, outputting the label information of the telephone number.
According to another aspect of the embodiments of the present invention, a system for number identification is also provided, including: at least one client device, configured to send a number query request, where the number query request includes at least one telephone number; and a server, communicating with the at least one client device, configured to obtain characteristic information of the telephone number, determine a label score of the telephone number from the characteristic information based on a number machine learning model, judge whether the label score is greater than or equal to a first threshold, and, if the label score is greater than or equal to the first threshold, output the label information of the telephone number, where the label score characterizes the probability that the telephone number is labeled correctly.
According to another aspect of the embodiments of the present invention, a device for number identification is also provided, including: an acquiring unit, configured to obtain characteristic information of a telephone number; a determination unit, configured to determine a label score of the telephone number from the characteristic information based on a number machine learning model, where the label score characterizes the probability that the telephone number is labeled correctly; a judging unit, configured to judge whether the label score is greater than or equal to a first threshold; and a first execution unit, configured to output the label information of the telephone number if the label score is greater than or equal to the first threshold.
According to another aspect of the embodiments of the present invention, a storage medium is also provided. The storage medium includes a stored program, where the program executes the above method for number identification.
According to another aspect of the embodiments of the present invention, a processor is also provided. The processor is configured to run a program, where the program, when run, executes the above method for number identification.
In the embodiments of the present invention, characteristic information of a telephone number is obtained; a label score of the telephone number is determined from the characteristic information based on a number machine learning model, where the label score characterizes the probability that the telephone number is labeled correctly; whether the label score is greater than or equal to a first threshold is judged; and, if the label score is greater than or equal to the first threshold, the label information of the telephone number is output. This achieves the purpose of applying machine learning to number identification to improve the accuracy of telephone number identification, and thereby the technical effect of reducing fraudulent calls and fraudulent text messages and improving the user experience. It in turn solves the technical problem in the prior art that number identification implemented through third-party applications and active labeling by users leads to poor data quality.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flow chart of a method for number identification according to an embodiment of the present invention;
Fig. 2 is a flow chart of an optional method for number identification according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a preferred online interaction architecture for number identification using machine learning according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a preferred machine learning model for number identification according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the overall model of an optional machine learning approach based on convolutional neural networks according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a system for number identification according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of a device for number identification according to an embodiment of the present invention.
Detailed description
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and are not used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present invention described here can be implemented in an order other than those illustrated or described here. In addition, the terms "comprising" and "having" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product or device.
According to an embodiment of the present invention, a method embodiment for number identification is provided. It should be noted that the steps illustrated in the flow charts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flow charts, in some cases the steps shown or described can be executed in a different order.
Fig. 1 is a flow chart of a method for number identification according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain the characteristic information of a telephone number.
As an optional embodiment, the telephone number can be a number assigned for communication between telephones (including fixed-line telephones and mobile phones), and can be a domestic or an overseas number. A domestic telephone number includes, but is not limited to, a number provided by any one of the operators China Mobile, China Unicom and China Telecom. The characteristic information can be the information selected for determining the label information of the telephone number, including but not limited to number type, number length, incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration and average incoming call duration.
Step S104: based on a number machine learning model, determine the label score of the telephone number from the characteristic information, where the label score characterizes the probability that the telephone number is labeled correctly.
As an optional embodiment, the number machine learning model can be trained with various machine learning algorithms and is used to identify the label information of a telephone number and the corresponding label score, where the label information includes but is not limited to any one of the following: ad promotion, fraudulent call, harassing call, courier service, and so on. The label score can be the score that the number machine learning model assigns when assessing the label information of the telephone number, and can be any value in the range 0 to 100.
Optionally, the number machine learning model uses any one of the following machine learning algorithms: random forest, support vector machine, convolutional neural network, logistic regression.
In an optional embodiment, the label score of the telephone number can be determined from the following characteristic information of the telephone number: incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration and average incoming call duration.
Step S106: judge whether the label score is greater than or equal to a first threshold.
As an optional embodiment, the first threshold can be a preset minimum label score at which the telephone number is considered correctly labeled; for example, the first threshold can be 70 points. After the label score of the telephone number is determined from the characteristic information based on the number machine learning model, it can be judged whether the label score is greater than or equal to the first threshold (for example, 70 points). If it is, the label of the telephone number is considered relatively accurate; conversely, if the label score is less than the first threshold, the label of the telephone number is considered relatively inaccurate.
Step S108: if the label score is greater than or equal to the first threshold, output the label information of the telephone number.
It is worth noting that, in the prior art, users label telephone numbers directly through third-party applications (for example, various security guard apps), and labels assigned entirely at the user's subjective discretion are not necessarily accurate. The scheme disclosed in the above embodiment of this application scores the label information of a telephone number according to the obtained characteristic information, and decides whether to output the label information according to the label score corresponding to the label information of each telephone number, which can further increase the probability that a telephone number is labeled correctly.
As can be seen from the above, in the above embodiment of this application, after a user's query request for a telephone number is received, the characteristic information of the requested telephone number can be obtained, and the label score of the telephone number is determined from that characteristic information based on a number machine learning model obtained by prior training. The label information of the telephone number is output only when its label score is greater than or equal to the first threshold. This achieves the purpose of applying machine learning to number identification to improve the accuracy of telephone number identification, and thereby the technical effect of reducing fraudulent calls and fraudulent text messages and improving the user experience. It in turn solves the technical problem in the prior art that number identification implemented through third-party applications and active labeling by users leads to poor data quality.
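Steps S102 to S108 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `get_features`, `toy_model`, the feature names and the sample values are hypothetical stand-ins, and only the 0 to 100 score range and the example threshold of 70 come from the text.

```python
from typing import Callable, Dict, Optional, Tuple

FIRST_THRESHOLD = 70  # example value from the text; scores range over 0..100

Features = Dict[str, float]
# A model maps features to (label, score); a stand-in for the trained model.
Model = Callable[[Features], Tuple[str, float]]


def identify(number: str,
             get_features: Callable[[str], Features],
             model: Model,
             threshold: float = FIRST_THRESHOLD) -> Optional[str]:
    """Return the label for `number`, or None if the score is too low."""
    features = get_features(number)   # S102: obtain characteristic information
    label, score = model(features)    # S104: determine the label score
    if score >= threshold:            # S106: compare with the first threshold
        return label                  # S108: output the label information
    return None                       # below the threshold: nothing is output


# Hypothetical stand-ins, for illustration only.
def get_features(number: str) -> Features:
    return {"outgoing_frequency": 120.0, "average_call_duration": 8.0}


def toy_model(features: Features) -> Tuple[str, float]:
    # Toy rule: many outgoing calls gives a high score for "harassing call".
    score = 85.0 if features["outgoing_frequency"] > 100 else 40.0
    return "harassing call", score
```

Passing the feature lookup and the model in as callables keeps the threshold logic independent of how the model was trained.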
It should be noted that a received telephone number may be either an already labeled number or an unlabeled number. As an optional embodiment, as shown in Fig. 2, determining the label score of the telephone number from the characteristic information based on the number machine learning model may include the following steps:
Step S202: judge whether the telephone number is a labeled number, where a labeled number is a telephone number that already carries label information;
Step S204: if the telephone number is a labeled number, determine, based on the number machine learning model and from the characteristic information, the label score for the label information that the telephone number carries.
Optionally, as shown in Fig. 2, the method can also include: Step S206: if the telephone number is not a labeled number, determine, based on the number machine learning model and from the characteristic information, the label information of the telephone number and the label score for that label information.
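The branch between labeled and unlabeled numbers in steps S202 to S206 can be sketched as below. The two callables `score_label` and `predict_label`, and the toy versions of them, are hypothetical interfaces to the trained model that the text does not specify.

```python
from typing import Callable, Dict, Optional, Tuple

Features = Dict[str, float]


def score_number(features: Features,
                 existing_label: Optional[str],
                 score_label: Callable[[Features, str], float],
                 predict_label: Callable[[Features], str]) -> Tuple[str, float]:
    """Return (label, score) for a labeled or an unlabeled number."""
    if existing_label is not None:
        # S202/S204: the number already carries a label, so only score it.
        return existing_label, score_label(features, existing_label)
    # S206: no label yet, so predict one and then score that prediction.
    label = predict_label(features)
    return label, score_label(features, label)


# Hypothetical stand-ins for the trained model, for illustration only.
def toy_score(features: Features, label: str) -> float:
    return 80.0 if label == "ad promotion" else 20.0


def toy_predict(features: Features) -> str:
    return "ad promotion"
```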
Optionally, the method can also include the following step: Step S110: if the label score is less than the first threshold, withhold the label information of the telephone number from output.
It should be noted that when the label score of a telephone number is less than the first threshold, the label information of the telephone number is not necessarily accurate, so its output can be withheld. As an optional embodiment, withholding the label information when the label score is less than the first threshold may include the following steps:
Step 1: judge whether the label score is less than a second threshold, where the second threshold is less than the first threshold;
Step 2: if the label score is less than or equal to the second threshold, modify the label information of the telephone number.
Specifically, in the above steps, the second threshold can be a label score at which the telephone number is considered wrongly labeled. The second threshold is less than the first threshold; for example, the second threshold can be 30 points. Thus, when the label score of the telephone number is less than the second threshold, it can be determined that the telephone number is wrongly labeled, and the label information of the telephone number needs to be removed or replaced with the label information determined by the number machine learning model.
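The two thresholds split the score range into three bands. A minimal sketch follows, using the example values 70 and 30 from the text; the band names are illustrative, and treating a score exactly equal to the second threshold as "revise" follows Step 2 and is otherwise an assumption.

```python
FIRST_THRESHOLD = 70   # example from the text: label considered correct
SECOND_THRESHOLD = 30  # example from the text: label considered wrong


def decide(score: float,
           first: float = FIRST_THRESHOLD,
           second: float = SECOND_THRESHOLD) -> str:
    """Map a label score to one of three actions."""
    if second >= first:
        raise ValueError("second threshold must be below the first threshold")
    if score >= first:
        return "output"    # S108: show the label to the client
    if score <= second:
        return "revise"    # Step 2: remove or replace the label
    return "withhold"      # S110: keep the label but do not output it
```

For example, `decide(85)` gives `"output"`, `decide(50)` gives `"withhold"` and `decide(10)` gives `"revise"`.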
Based on any of the above optional embodiments, as an optional embodiment, before the characteristic information of the telephone number is obtained, the method can also include: receiving a number query request uploaded by at least one client device, where the number query request includes at least one telephone number.
Optionally, based on the above embodiment, as an optional embodiment, after the number query request uploaded by the at least one client is received, the method can also include: outputting the label information of the telephone number to the at least one client device through an online recognition engine.
As a preferred embodiment, Fig. 3 is a schematic diagram of a preferred online interaction architecture for number identification using machine learning according to an embodiment of the present invention. As shown in Fig. 3, machine learning has two main functions in number identification: first, verifying existing data; second, pre-judging unidentified data. The data originates from a request initiated at the client. The client sends the data request to the online recognition engine; after processing by the big data computing platform and the original system data platform, the result is output to the online recognition engine, which outputs the result to the client for display. In addition, data labeled by users is scored with machine learning. For example, if 10086 is labeled as a "harassing call", the system can analyze, from the number features of 10086, the score of the "harassing call" label for 10086. When the label score of a telephone number is below the second threshold (for example, 30 points), the system considers the user's label for this telephone number inaccurate; when the label score is above the first threshold (for example, 70 points), the system considers the user's label correct. When the label score is above the second threshold (for example, 30 points) and below the first threshold (for example, 70 points), the label information of the telephone number is not output to the client and is kept as part of the database.
Optionally, as an optional embodiment, Fig. 4 is a schematic diagram of a preferred machine learning model for number identification according to an embodiment of the present invention. As shown in Fig. 4, the online interface library scores each number according to the system's discrimination logic, verifying the labels of labeled numbers and labeling and scoring unlabeled numbers. After the client initiates a request, the request is sent to the online detection interface through the telephony interface of the phone manufacturer. A judgment is then made according to the number library query logic, using the threshold parameters set for each category of number: when the score is above the first threshold, the label of the telephone number is output; when it is below the second threshold, the label of the telephone number is removed.
It should be noted that, when building the number machine learning model for number identification, any one of the machine learning algorithms random forest, SVM (support vector machine), CNN (convolutional neural network) and logistic regression can be used. The performance of the machine learning models is measured as follows:
(1) True positive rate (TPR), computed as TPR = TP/(TP+FN), characterizes the proportion of all positive examples that the classifier identifies as positive.
(2) False positive rate (FPR), computed as FPR = FP/(FP+TN), characterizes the proportion of all negative examples that the classifier wrongly identifies as positive.
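The two formulas above can be computed from predicted and true classes as in the sketch below. The encoding (1 for the positive class, 0 for the negative class) and the sample data are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels encoded as 1 and 0."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def true_positive_rate(tp: int, fn: int) -> float:
    return tp / (tp + fn)   # TPR = TP / (TP + FN)


def false_positive_rate(fp: int, tn: int) -> float:
    return fp / (fp + tn)   # FPR = FP / (FP + TN)


# Illustrative data: 4 positives and 4 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
```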
Generally, the larger the KS value, the better the model separates the positive and negative classes. KS > 0.2 indicates that the model has good predictive accuracy. According to the statistics shown in Table 1, the largest KS value is that of random forest, 0.59, followed by that of the support vector machine (SVM), 0.56. The models with higher KS and TPR values can be selected for manual accuracy verification of the labeled categories.
Table 1
Machine learning model         KS value   TPR      FPR
Random forest                  0.51       79.38%   20.92%
Support vector machine         0.56       79.33%   24.38%
Convolutional neural network   0.58       80.20%   24.47%
Logistic regression            0.44       67.54%   23.06%
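The text does not define how the KS value is computed. Under the commonly used definition, KS is the largest gap between TPR and FPR over all score thresholds, and the sketch below assumes that definition. The function name and the sample scores are illustrative only.

```python
from typing import Sequence


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Largest value of TPR - FPR over all candidate score thresholds.

    `labels` uses 1 for the positive class and 0 for the negative class;
    a sample is predicted positive when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best = 0.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        best = max(best, tp / positives - fp / negatives)
    return best


# Illustrative scores on the 0..100 scale used for label scores.
example_scores = [90, 80, 70, 60, 40, 30, 20, 10]
example_labels = [1, 1, 0, 1, 0, 1, 0, 0]
```

A model that separates the two classes perfectly reaches a KS value of 1, while one that scores both classes alike stays near 0.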
It should be noted that, in machine learning, samples are generally divided into three independent parts: a training set, a validation set and a test set. The training set is used to fit the model, the validation set is used to determine the network structure or the parameters that control model complexity, and the test set is used to check the performance of the finally selected model. Because the training set is used to build the model, it accounts for 50% of the total samples, and the other two sets account for 25% each; all three parts are drawn at random from the samples. The behavior of the machine learning model therefore depends on the accuracy of the training data: the more truthful the labeled data, the closer the parameters of the machine learning model are to their actual values. Unlike the data sets required by common machine learning models, however, the data set required here must be manually verified before its correctness can be guaranteed. The training set therefore has two limitations: first, it does not contain enough data; second, its data is not accurate enough. To address the quantity of training data, the category label of each data item is used, and the data is processed by category.
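The 50% / 25% / 25% random split described above can be sketched as follows. The function name and the fixed seed are illustrative choices, not taken from the text.

```python
import random
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def split_samples(samples: Iterable[T],
                  seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Randomly split samples into training (50%), validation (25%), test (25%)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # random draw, reproducible via seed
    n_train = len(shuffled) // 2
    n_val = len(shuffled) // 4
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # remainder goes to the test set
    return train, validation, test
```

Shuffling once and slicing keeps the three parts disjoint, so no sample is used both for fitting and for testing.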
The purpose of building the number machine learning model is to learn the structure and the essence of the problem from the original feature set, so the selected features should explain the problem well. The goals of feature selection are roughly as follows:
1. improve the accuracy of prediction;
2. build a faster prediction model that consumes fewer resources;
3. make the model easier to understand and explain.
Feature selection is critical to training a machine learning model. More features mean that the classifier has more information to use, but more is not always better: more features slow down convergence and bring in more redundant features. Feature selection follows the three-point IIS rule:
I: Informative, i.e. the feature contains useful information
I: Independent, i.e. the features are independent of one another
S: Simple, i.e. the information must be easy to extract and easy to understand
In summary, for number identification the following are chosen as model features: incoming call frequency (incoming frequency), outgoing call frequency (outcoming frequency), average call duration (average talk time), average outgoing call duration (average outcoming time) and average incoming call duration (average incoming time).
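The five features named above could be derived from raw call records roughly as follows. The `CallRecord` structure, the definition of frequency as calls per day, and the dictionary keys are assumptions made for this sketch; the text does not say how the features are computed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CallRecord:
    direction: str   # "in" or "out", seen from the number being scored
    duration: float  # call length in seconds


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def extract_features(records: List[CallRecord], days: float) -> Dict[str, float]:
    """Compute the five model features over an observation window of `days`."""
    if days <= 0:
        raise ValueError("days must be positive")
    incoming = [r.duration for r in records if r.direction == "in"]
    outgoing = [r.duration for r in records if r.direction == "out"]
    return {
        "incoming_frequency": len(incoming) / days,   # assumed: calls per day
        "outgoing_frequency": len(outgoing) / days,   # assumed: calls per day
        "average_talk_time": _mean(incoming + outgoing),
        "average_outgoing_time": _mean(outgoing),
        "average_incoming_time": _mean(incoming),
    }


sample_records = [CallRecord("in", 60.0), CallRecord("out", 30.0), CallRecord("out", 90.0)]
sample_features = extract_features(sample_records, days=1)
```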
As a preferred embodiment, considering the KS and TPR performance of random forest, support vector machine, CNN and logistic regression together, the CNN can be chosen as the machine learning model for number identification.
Taking the CNN as an example, the process of building the number machine learning model is introduced below.
First, Table 2 shows the technical terms related to the CNN.
Fig. 5 is a schematic diagram of the overall model of an optional machine learning approach based on convolutional neural networks according to an embodiment of the present invention. As shown in Fig. 5, it includes the following parts:
(1) Import data
Using Python's import functionality, the training set required for number identification is imported. This data comes from the single-category data accumulated in the database of Teddy Bear Mobile. The data import is implemented with the following code:
# Import data
import numpy  # needed for numpy.array below
# read_train_file, read_test_file and FLAGS are assumed to be defined elsewhere (not shown in the source)
rec, rec_col, lab, lab_col = read_train_file(FLAGS.data_dir)
rec_nd = numpy.array(rec)
lab_nd = numpy.array(lab)
# Debugging output, left disabled:
# print(type(rec), rec[0])
# print(type(lab), lab[0])
# print(type(rec_nd), len(rec_nd), rec_nd[0])
# print(type(lab_nd), len(lab_nd), lab_nd[0])
# sys.exit()
rec_test, rec_col_test, lab_test, lab_col_test = read_test_file(FLAGS.data_dir)
rec_test_nd = numpy.array(rec_test)
lab_test_nd = numpy.array(lab_test)
(2) Create nodes (create op)
Construction of the computation graph begins by creating nodes for the input image and the target output classes. The mapping between the two is built through the weight W and the bias b; x and y_ here are not specific values. During initialization, a placeholder is used to create a rec_col-dimensional vector of data type float32. None indicates that the size is not fixed; used here as the first dimension, it stands for the batch size, meaning that the number of x is not fixed. Likewise, in y_, None indicates that the size is not fixed; used as the first dimension, it stands for the batch size, meaning that the number of y_ is not fixed. lac_col is the value of the second dimension, with size lac_col; None and lac_col together form the two-dimensional vector of the bias b. The shape parameter is optional, but supplying it allows errors caused by inconsistent data dimensions to be caught automatically. The code for creating the nodes is as follows:
(3) First convolutional layer (First Convolutional Layer)
The first-layer convolution kernel is 5*5, the number of RGB channels is 1, and the output is 32 feature maps of 16*16. x_image is convolved with the weight tensor W_conv1, the bias term b_conv1 is added, the ReLU activation function is applied, and finally 2*2 max pooling is performed.
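The convolve, add bias, ReLU, pool sequence can be sketched for a single channel and a single filter in plain Python. This is an illustrative sketch only: zero ("same") padding, stride 1, and a 32*32 input are assumptions not stated in the text, and the function names are hypothetical.

```python
def conv2d_same(image, kernel, bias=0.0):
    """Stride-1 2-D convolution (cross-correlation) with zero padding,
    so the output has the same height and width as the input."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = bias
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:  # outside the image counts as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def relu(fmap):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Non-overlapping 2*2 max pooling; halves height and width."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def conv_layer(image, kernel, bias):
    """One filter of a convolutional layer: convolve, add bias, ReLU, 2*2 max pool."""
    return max_pool_2x2(relu(conv2d_same(image, kernel, bias)))
```

Under these assumptions a 32*32 input yields a 16*16 map per filter, which matches the 16*16 output size given for the first layer; a real layer would repeat this for each of its 32 filters.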
(4) Second convolutional layer (Second Convolutional Layer)
The second-layer convolution kernel is 5*5, and the number of RGB channels is 1. x_image is convolved with the weight tensor W_conv1, the bias term b_conv1 is added, the ReLU activation function is applied, and finally max pooling is performed; the output is 64 tensors of 16*16.
(5) Fully connected layer (Densely Connected Layer)
The data size is set to 1x1, and a fully connected layer with 1024 neurons is added to process the data. The tensor output by the pooling layer is reshaped into a set of vectors, multiplied by the weight matrix, the bias is added, and the ReLU (rectified linear unit) activation function is applied.
(6) Overfitting-prevention layer (Dropout Layer)
To reduce overfitting, dropout (an overfitting-prevention layer) is added before the output layer. A placeholder represents the probability that a neuron's output is kept. In this way, dropout can be enabled during training and disabled during testing. Besides masking neuron outputs, the dropout operation also automatically handles the scale of the neuron output values, so scaling does not need to be considered separately when dropout is used.
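The combination of masking and automatic scaling is commonly implemented as "inverted dropout". The following is a minimal pure-Python sketch of that general technique, not the patent's code; the function name and the rng parameter are hypothetical.

```python
import random


def dropout(values, keep_prob, rng=random):
    """Inverted dropout: keep each value with probability keep_prob and
    divide kept values by keep_prob so the expected output equals the input."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


activations = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(activations, keep_prob=0.5, rng=random.Random(0))  # training
test_out = dropout(activations, keep_prob=1.0)                         # testing
```

With keep_prob set to 1.0 nothing is dropped and nothing is rescaled, which is how dropout is switched off at test time through the same code path.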
(7) Output layer (Readout Layer)
The traditional machine learning model logistic regression is a special case of softmax (if and only if there are exactly two classes); softmax handles multi-class problems. A softmax layer is added here (the data output by the dropout layer is taken as the input x, and the output-layer data Y is obtained by Y = wx + b).
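The relationship between the two can be checked numerically. The sketch below computes Y = wx + b followed by softmax in plain Python, and compares the two-class case with the logistic (sigmoid) function; the function names and the example weights are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function used by two-class logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def readout(x, weights, biases):
    """Output layer: one logit per class (Y = wx + b), followed by softmax.
    weights[k] is the weight vector for class k."""
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    return softmax(logits)


# Two classes with logits (z, 0): the first softmax output equals sigmoid(z).
z = 2.5
two_class = softmax([z, 0.0])
```

In general, a two-class softmax with logits z1 and z2 gives the first class probability sigmoid(z1 - z2), which is why logistic regression appears as the two-class special case.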
(8) Training and evaluating the model (Train the Model & Evaluate the Model)
Training and evaluation follow the approach used for a single-layer softmax neural network model, with the following differences: the more sophisticated ADAM optimizer is used for gradient-descent optimization, the additional parameter keep_prob is added to feed_dict to control the dropout ratio, and a log entry is output every 100 iterations.
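To make the optimizer step and the logging interval concrete, the sketch below applies the standard Adam update rule to a toy one-dimensional objective and records a log entry every 100 iterations. The objective, the hyperparameter values, and the function name are illustrative assumptions; the patent's network and training data are not reproduced.

```python
import math


def adam_minimize(grad, x0, steps=1000, lr=0.1,
                  beta1=0.9, beta2=0.999, eps=1e-8, log_every=100):
    """Minimize a 1-D function with the Adam update rule.
    grad(x) returns the gradient at x. Returns (final x, log of (step, x))."""
    x, m, v = x0, 0.0, 0.0
    log = []
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g       # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g   # second-moment estimate
        m_hat = m / (1 - beta1 ** t)          # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
        if t % log_every == 0:                # one log entry every 100 iterations
            log.append((t, x))
    return x, log


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
final_x, history = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

In a real training loop the gradient would come from the network's loss on a batch, and the logged quantity would typically be the training accuracy or loss rather than the parameter value.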
The scheme provided by the above embodiments of the present application applies machine learning to the field of number recognition, opening a new path for the development of number recognition. Applying a machine learning model in the number field has the following advantages:
First, it greatly benefits the label production process: production no longer depends solely on active marking by platform users and on data provided by third-party platforms, which significantly improves data production efficiency.
Second, machine learning verifies labels during label production. Such feedback regulation is essential to the stability of a system and improves the quality and correctness of the data.
In addition, applying machine learning to number recognition allows the value of the data to be fully mined and put to use.
According to an embodiment of the present invention, a system embodiment for implementing the above number recognition method is also provided. Fig. 6 is a schematic diagram of a number recognition system according to an embodiment of the present invention. As shown in Fig. 6, the system includes at least one client device 601 and a server 603.
The at least one client device 601 is configured to send a number inquiry request, where the number inquiry request includes at least one telephone number.
The server 603 communicates with the at least one client device and is configured to obtain the characteristic information of the telephone number; determine, based on a number machine learning model, the label score value of the telephone number according to the characteristic information; judge whether the label score value is greater than or equal to a first threshold; and, when the label score value is greater than or equal to the first threshold, output the label information of the telephone number. The label score value characterizes the probability that the telephone number is labeled correctly.
As can be seen from the above, in the above embodiments of the present application, the client device 601 sends a number inquiry request for any telephone number to the server 603. After receiving the user's inquiry request for the telephone number, the server 603 obtains the characteristic information of the telephone number in the inquiry request and, based on a number machine learning model obtained by training in advance, determines the label score value of the telephone number according to its characteristic information. The label information of the telephone number is output only when its label score value is greater than or equal to the first threshold. This achieves the purpose of applying machine learning to number recognition to improve the accuracy of telephone number recognition, thereby reducing fraudulent calls and fraudulent text messages and improving user experience. It in turn solves the technical problem of poor data quality in prior-art schemes that implement number recognition through third-party applications and active marking by users.
In an optional embodiment, the server 603 is further configured to judge whether the telephone number is a marked number, where a marked number is a telephone number that already carries label information; and, when the telephone number is a marked number, to determine, based on the number machine learning model and according to the characteristic information, the label score value with which the telephone number is marked with that label information.
In an optional embodiment, the server 603 is further configured, when the telephone number is not a marked number, to determine, based on the number machine learning model and according to the characteristic information, the label information of the telephone number and the label score value with which it is marked with that label information.
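The two branches (marked and unmarked numbers) can be sketched as a single function. This is a hedged illustration: the callable predict_proba, the label names, and the dictionary-of-probabilities interface are assumptions made for the example and are not taken from the patent.

```python
def label_score(features, existing_label, predict_proba):
    """Return (label information, label score value) for one telephone number.

    predict_proba is a hypothetical callable mapping a feature dict to a
    dict of {label: probability}; existing_label is None for unmarked numbers.
    """
    probs = predict_proba(features)
    if existing_label is not None:
        # Marked number: score the label the number already carries.
        return existing_label, probs.get(existing_label, 0.0)
    # Unmarked number: the model chooses the label and supplies its score.
    best = max(probs, key=probs.get)
    return best, probs[best]


def toy_model(features):
    """Stand-in model for the example; a real model would be trained in advance."""
    if features["outgoing_frequency"] > 50:
        return {"harassment": 0.9, "express delivery": 0.1}
    return {"harassment": 0.2, "express delivery": 0.8}


marked = label_score({"outgoing_frequency": 80}, "express delivery", toy_model)
unmarked = label_score({"outgoing_frequency": 80}, None, toy_model)
```

In this toy case the marked number receives a low score for its existing label, while the unmarked number is assigned the model's most probable label with a high score.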
In an optional embodiment, the server 603 is further configured to prohibit output of the label information of the telephone number when the label score value is less than the first threshold.
In an optional embodiment, the server 603 is further configured to judge whether the label score value is less than a second threshold, where the second threshold is less than the first threshold; and, when the label score value is less than or equal to the second threshold, to modify the label information of the telephone number.
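The two-threshold rule can be written as a small decision function. The sketch below is illustrative: the threshold values 0.8 and 0.3 and the returned action names are assumptions, since the text gives no concrete values; it follows the "less than or equal to the second threshold" wording for the modification case.

```python
def decide(score, first_threshold=0.8, second_threshold=0.3):
    """Map a label score value to an action.

    "output"              : score >= first threshold, label information is output
    "withhold"            : score < first threshold, output is prohibited
    "withhold_and_modify" : score <= second threshold, output is prohibited
                            and the stored label information is modified
    """
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be less than first threshold")
    if score >= first_threshold:
        return "output"
    if score <= second_threshold:
        return "withhold_and_modify"
    return "withhold"


actions = [decide(s) for s in (0.95, 0.8, 0.5, 0.3, 0.05)]
```

Scores between the two thresholds leave the stored label untouched but keep it from being shown to the user.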
In an optional embodiment, the server 603 is further configured to receive a number inquiry request uploaded by at least one client device, where the number inquiry request includes at least one telephone number.
In an optional embodiment, the server 603 is further configured to output the label information of the telephone number to the at least one client device through an online recognition engine.
Based on any of the above optional system embodiments, as an optional embodiment, the features of the telephone number include at least one of the following: incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration, and average incoming call duration.
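These five features could be derived from call records roughly as follows. The record format (a direction string and a duration in seconds), the per-day definition of frequency, and all names in the sketch are assumptions made for illustration; the text does not specify how the features are computed.

```python
def call_features(records, window_days):
    """Compute the five listed features from call records.

    records: list of (direction, duration_seconds) with direction "in" or "out"
    window_days: length of the observation window, used to express frequency
                 as calls per day (an assumed definition)
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    incoming = [d for direction, d in records if direction == "in"]
    outgoing = [d for direction, d in records if direction == "out"]

    def mean(durations):
        return sum(durations) / len(durations) if durations else 0.0

    return {
        "incoming_frequency": len(incoming) / window_days,
        "outgoing_frequency": len(outgoing) / window_days,
        "average_call_duration": mean(incoming + outgoing),
        "average_outgoing_duration": mean(outgoing),
        "average_incoming_duration": mean(incoming),
    }


features = call_features([("in", 30), ("in", 90), ("out", 60)], window_days=2)
```

The resulting dictionary is the kind of characteristic information that would then be passed to the number machine learning model.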
Based on any of the above optional system embodiments, as an optional embodiment, the number machine learning model uses any one of the following machine learning algorithms: random forest algorithm, support vector machine algorithm, convolutional neural network algorithm, logistic regression algorithm.
According to an embodiment of the present invention, a device embodiment for implementing the above number recognition method is also provided. Fig. 7 is a schematic diagram of a number recognition device according to an embodiment of the present invention. As shown in Fig. 7, the device includes an acquiring unit 701, a determination unit 703, a judging unit 705, and a first execution unit 707.
The acquiring unit 701 is configured to obtain the characteristic information of a telephone number;
The determination unit 703 is configured to determine, based on a number machine learning model, the label score value of the telephone number according to the characteristic information, where the label score value characterizes the probability that the telephone number is labeled correctly;
The judging unit 705 is configured to judge whether the label score value is greater than or equal to a first threshold;
The first execution unit 707 is configured to output the label information of the telephone number when the label score value is greater than or equal to the first threshold.
It should be noted here that the acquiring unit 701, the determination unit 703, the judging unit 705, and the first execution unit 707 correspond to steps S102 to S108 in the method embodiment. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the above method embodiment. It should also be noted that the above modules, as part of the device, may be executed in a computer system such as a set of computer-executable instructions.
As can be seen from the above, in the above embodiments of the present application, after a user's inquiry request for a telephone number is received, the acquiring unit 701 obtains the characteristic information of the telephone number in the inquiry request; the determination unit 703 determines, based on a number machine learning model obtained by training in advance, the label score value of the telephone number according to its characteristic information; the judging unit 705 judges whether the label score value of the telephone number is greater than or equal to the first threshold; and the first execution unit 707 outputs the label information of the telephone number only when the label score value is greater than or equal to the first threshold. This achieves the purpose of applying machine learning to number recognition to improve the accuracy of telephone number recognition, thereby reducing fraudulent calls and fraudulent text messages and improving user experience. It in turn solves the technical problem of poor data quality in prior-art schemes that implement number recognition through third-party applications and active marking by users.
In an optional embodiment, the determination unit includes: a first judgment module, configured to judge whether the telephone number is a marked number, where a marked number is a telephone number that already carries label information; and a first determining module, configured, when the telephone number is a marked number, to determine, based on the number machine learning model and according to the characteristic information, the label score value with which the telephone number is marked with that label information.
In an optional embodiment, the determination unit further includes: a second determining module, configured, when the telephone number is not a marked number, to determine, based on the number machine learning model and according to the characteristic information, the label information of the telephone number and the label score value with which it is marked with that label information.
In an optional embodiment, the device further includes: a second execution unit, configured to prohibit output of the label information of the telephone number when the label score value is less than the first threshold.
In an optional embodiment, the second execution unit includes: a second judgment module, configured to judge whether the label score value is less than a second threshold, where the second threshold is less than the first threshold; and a modification module, configured to modify the label information of the telephone number when the label score value is less than or equal to the second threshold.
In an optional embodiment, the device further includes: a receiving unit, configured to receive a number inquiry request uploaded by at least one client device, where the number inquiry request includes at least one telephone number.
In an optional embodiment, the device further includes: an output unit, configured to output the label information of the telephone number to the at least one client device through an online recognition engine.
Based on any of the above optional device embodiments, as an optional embodiment, the features of the telephone number include at least one of the following: incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration, and average incoming call duration.
Based on any of the above optional device embodiments, as an optional embodiment, the number machine learning model uses any one of the following machine learning algorithms: random forest algorithm, support vector machine algorithm, convolutional neural network algorithm, logistic regression algorithm.
According to an embodiment of the present invention, a storage medium is also provided, characterized in that the storage medium includes a stored program, where the program executes any one of the optional or preferred number recognition methods of the above method embodiments.
According to an embodiment of the present invention, a processor is also provided, characterized in that the processor is configured to run a program, where the program, when run, executes any one of the optional or preferred number recognition methods of the above method embodiments.
The above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units may be a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in each embodiment of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A number recognition method, characterized by comprising:
obtaining the characteristic information of a telephone number;
determining, based on a number machine learning model, the label score value of the telephone number according to the characteristic information, wherein the label score value characterizes the probability that the telephone number is labeled correctly;
judging whether the label score value is greater than or equal to a first threshold;
outputting the label information of the telephone number when the label score value is greater than or equal to the first threshold.
2. The method according to claim 1, characterized in that determining, based on the number machine learning model, the label score value of the telephone number according to the characteristic information comprises:
judging whether the telephone number is a marked number, wherein the marked number is a telephone number that already carries label information;
when the telephone number is a marked number, determining, based on the number machine learning model and according to the characteristic information, the label score value with which the telephone number is marked with the label information.
3. The method according to claim 2, characterized in that, after judging whether the telephone number is a marked number, the method further comprises:
when the telephone number is not a marked number, determining, based on the number machine learning model and according to the characteristic information, the label information of the telephone number and the label score value with which it is marked with the label information.
4. The method according to claim 1, characterized in that, after judging whether the label score value is greater than or equal to the first threshold, the method further comprises:
prohibiting output of the label information of the telephone number when the label score value is less than the first threshold.
5. The method according to claim 4, characterized in that prohibiting output of the label information of the telephone number when the label score value is less than the first threshold comprises:
judging whether the label score value is less than a second threshold, wherein the second threshold is less than the first threshold;
modifying the label information of the telephone number when the label score value is less than or equal to the second threshold.
6. The method according to claim 1, characterized in that, before obtaining the characteristic information of the telephone number, the method further comprises:
receiving a number inquiry request uploaded by at least one client device, wherein the number inquiry request includes at least one telephone number.
7. The method according to claim 6, characterized in that, after receiving the number inquiry request uploaded by the at least one client device, the method comprises:
outputting the label information of the telephone number to the at least one client device through an online recognition engine.
8. The method according to any one of claims 1 to 7, characterized in that the features of the telephone number include at least one of the following: incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration, and average incoming call duration.
9. The method according to any one of claims 1 to 7, characterized in that the number machine learning model uses any one of the following machine learning algorithms: random forest algorithm, support vector machine algorithm, convolutional neural network algorithm, logistic regression algorithm.
10. A number recognition system, characterized by comprising:
at least one client device, configured to send a number inquiry request, wherein the number inquiry request includes at least one telephone number;
a server, communicating with the at least one client device and configured to obtain the characteristic information of the telephone number; determine, based on a number machine learning model, the label score value of the telephone number according to the characteristic information; judge whether the label score value is greater than or equal to a first threshold; and, when the label score value is greater than or equal to the first threshold, output the label information of the telephone number, wherein the label score value characterizes the probability that the telephone number is labeled correctly.
CN201810135848.3A 2018-02-09 2018-02-09 The method and system of Number Reorganization Pending CN108449482A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810135848.3A CN108449482A (en) 2018-02-09 2018-02-09 The method and system of Number Reorganization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810135848.3A CN108449482A (en) 2018-02-09 2018-02-09 The method and system of Number Reorganization

Publications (1)

Publication Number Publication Date
CN108449482A true CN108449482A (en) 2018-08-24

Family

ID=63192162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810135848.3A Pending CN108449482A (en) 2018-02-09 2018-02-09 The method and system of Number Reorganization

Country Status (1)

Country Link
CN (1) CN108449482A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20050042546A (en) * 2003-11-03 2005-05-10 엄장필 System and method for managing spam mail
CN104320525A (en) * 2014-09-19 2015-01-28 小米科技有限责任公司 Method and device for identifying telephone number
CN104378480A (en) * 2013-11-15 2015-02-25 上海触乐信息科技有限公司 Phone number marking method and system
CN104683538A (en) * 2015-02-13 2015-06-03 广州市讯飞樽鸿信息技术有限公司 Harassment telephone number library construction method and system
CN104717674A (en) * 2014-12-02 2015-06-17 北京奇虎科技有限公司 Number attribute recognition method and device, terminal and server
CN105450825A (en) * 2015-11-24 2016-03-30 小米科技有限责任公司 Communication number classification mark method and device
JP2016058018A (en) * 2014-09-12 2016-04-21 キヤノン株式会社 Image processing method, image processing program and image processor
CN106550356A (en) * 2016-09-23 2017-03-29 深圳市金立通信设备有限公司 A kind of method and its device for determining caller ID type

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111131593A (en) * 2018-11-01 2020-05-08 百度在线网络技术(北京)有限公司 Crank call identification method and device
CN109688275A (en) * 2018-12-27 2019-04-26 中国联合网络通信集团有限公司 Harassing call recognition methods, device and storage medium
WO2020134523A1 (en) * 2018-12-29 2020-07-02 中兴通讯股份有限公司 User identification method and device
CN110086943A (en) * 2019-04-29 2019-08-02 北京羽乐创新科技有限公司 Number monitoring method and device
CN110113471A (en) * 2019-04-29 2019-08-09 北京羽乐创新科技有限公司 Number monitoring method and device
CN110113471B (en) * 2019-04-29 2021-08-31 北京羽乐创新科技有限公司 Number monitoring method and device
CN110519466A (en) * 2019-08-30 2019-11-29 北京泰迪熊移动科技有限公司 A kind of express delivery number identification method, equipment and computer storage medium
CN110855843A (en) * 2019-10-12 2020-02-28 中国平安财产保险股份有限公司 Method and device for displaying incoming call of virtual number, storage medium and electronic equipment
CN111131629A (en) * 2019-12-31 2020-05-08 宇龙计算机通信科技(深圳)有限公司 Crank call processing method and device, storage medium and terminal
CN111582786A (en) * 2020-04-29 2020-08-25 上海中通吉网络技术有限公司 Express bill number identification method, device and equipment based on machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180824