CN108449482A - Method and system for number identification - Google Patents
- Publication number
- CN108449482A (application CN201810135848.3A)
- Authority
- CN
- China
- Prior art keywords
- telephone number
- score value
- label
- threshold
- machine learning
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/66—Substation equipment, e.g. for use by subscribers with means for preventing unauthorised or fraudulent calling
- H04M1/667—Preventing unauthorised calls from a telephone set
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/22—Arrangements for supervision, monitoring or testing
- H04M3/2281—Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Technology Law (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention discloses a method and system for number identification. The method includes: obtaining characteristic information of a telephone number; determining, based on a number machine learning model, a label score of the telephone number according to the characteristic information, where the label score characterizes the probability that the telephone number is correctly labeled; judging whether the label score is greater than or equal to a first threshold; and, if the label score is greater than or equal to the first threshold, outputting the label information of the telephone number. The invention solves the technical problem in the prior art that number identification implemented through third-party applications and active user marking yields poor data quality.
Description
Technical field
The present invention relates to the field of communications, and in particular to a method and system for number identification.
Background
In recent years, with the spread of smartphone technology, mobile phones have come to play an irreplaceable role in people's lives and work. Telephone-number fraud and SMS fraud cause huge property losses to users. Recently, in new regulations issued to combat overseas telecom fraud, the Ministry of Industry and Information Technology explicitly stated that, before the end of the year, all calls from overseas spoofed numbers that impersonate public security organs are to be intercepted, and that measures for problems such as undisplayed caller numbers are to be studied and resolved. Moreover, perpetrators usually hide overseas, which not only leaves ordinary users unable to guard against fraud but also makes it difficult for the relevant authorities to solve cases. In the first half of 2016 alone, 4.7 billion spoofed-number fraud calls were placed from overseas into the country, and the annual economic loss caused by telecom fraud carried out through overseas spoofed-number calls has exceeded 10 billion yuan. Effectively solving telecom fraud, SMS fraud and related problems is therefore urgent.
Currently, in the field of telephone fraud identification, existing solutions are implemented through third-party apps and security guard applications (for example, Tencent Security Guard). Because the volume of number data in such a scheme is limited, its support for multiple platforms and multiple fields is incomplete, which significantly affects the recognition rate, an important indicator in data evaluation. Second, because the scheme relies on users actively marking telephone numbers, the generation of number-marking data is somewhat random and cannot rely on effective technical means, so the quality of the marked data is hard to guarantee. In addition, a third-party app must be installed, which greatly raises the installation barrier for users and keeps user conversion low; active marking by users degrades the user experience, and information overload causes additional annoyance and leaves a poor reputation.
For the problem in the prior art that number identification implemented through third-party applications and active user marking yields poor data quality, no effective solution has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a method and system for number identification, to solve at least the technical problem in the prior art that number identification implemented through third-party applications and active user marking yields poor data quality.
According to one aspect of the embodiments of the present invention, a number identification method is provided, including: obtaining characteristic information of a telephone number; determining, based on a number machine learning model, a label score of the telephone number according to the characteristic information, where the label score characterizes the probability that the telephone number is correctly labeled; judging whether the label score is greater than or equal to a first threshold; and, if the label score is greater than or equal to the first threshold, outputting the label information of the telephone number.
According to another aspect of the embodiments of the present invention, a number identification system is also provided, including: at least one client device, configured to send a number query request, where the number query request includes at least one telephone number; and a server that communicates with the at least one client device and is configured to obtain characteristic information of the telephone number, determine a label score of the telephone number according to the characteristic information based on a number machine learning model, judge whether the label score is greater than or equal to a first threshold, and, if the label score is greater than or equal to the first threshold, output the label information of the telephone number, where the label score characterizes the probability that the telephone number is correctly labeled.
According to another aspect of the embodiments of the present invention, a number identification device is also provided, including: an acquiring unit, configured to obtain characteristic information of a telephone number; a determination unit, configured to determine a label score of the telephone number according to the characteristic information based on a number machine learning model, where the label score characterizes the probability that the telephone number is correctly labeled; a judging unit, configured to judge whether the label score is greater than or equal to a first threshold; and a first execution unit, configured to output the label information of the telephone number if the label score is greater than or equal to the first threshold.
According to another aspect of the embodiments of the present invention, a storage medium is also provided. The storage medium includes a stored program, and the program executes the number identification method described above.
According to another aspect of the embodiments of the present invention, a processor is also provided. The processor is configured to run a program, and the program, when run, executes the number identification method described above.
In the embodiments of the present invention, characteristic information of a telephone number is obtained; a label score of the telephone number is determined according to the characteristic information based on a number machine learning model, where the label score characterizes the probability that the telephone number is correctly labeled; whether the label score is greater than or equal to a first threshold is judged; and, if the label score is greater than or equal to the first threshold, the label information of the telephone number is output. This applies machine learning to number identification in order to improve the accuracy of telephone number identification, which reduces the occurrence of fraudulent calls and fraudulent text messages and improves the user experience. It thereby solves the technical problem in the prior art that number identification implemented through third-party applications and active user marking yields poor data quality.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of a number identification method according to an embodiment of the present invention;
Fig. 2 is a flowchart of an optional number identification method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a preferred online interaction architecture that uses machine learning for number identification according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a preferred machine learning model used for number identification according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the overall model of an optional machine learning approach based on convolutional neural networks according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a number identification system according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of a number identification device according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and do not describe a specific order or sequence. Data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "comprise" and "have" and any variants of them are intended to cover non-exclusive inclusion: a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
According to an embodiment of the present invention, a method embodiment of number identification is provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in a different order.
Fig. 1 is a flowchart of a number identification method according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain the characteristic information of the telephone number.
As an optional embodiment, the telephone number is a number assigned for communication between telephones (both fixed-line and mobile) and may be domestic or overseas. A domestic number includes, but is not limited to, a number provided by any of the operators China Mobile, China Unicom and China Telecom. The characteristic information is the information selected to determine the label information of the telephone number, including but not limited to number type, number length, incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration and average incoming call duration.
Step S104: based on the number machine learning model, determine the label score of the telephone number according to the characteristic information, where the label score characterizes the probability that the telephone number is correctly labeled.
As an optional embodiment, the number machine learning model is trained with any of various machine learning algorithms and is used to identify the label information of a telephone number and the corresponding label score. The label information includes, but is not limited to, any of the following: advertising promotion, fraudulent call, harassing call, courier service, and so on. The label score is the score with which the number machine learning model evaluates the label information of the telephone number, and can be any value from 0 to 100.
Optionally, the number machine learning model uses any one of the following machine learning algorithms: random forest, support vector machine, convolutional neural network, or logistic regression.
In an optional embodiment, the label score of the telephone number can be determined from the following characteristic information of the telephone number: incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration and average incoming call duration.
Step S106: judge whether the label score is greater than or equal to the first threshold.
As an optional embodiment, the first threshold is a preset minimum label score at which a telephone number is considered correctly labeled; for example, the first threshold can be 70 points. After the label score of the telephone number has been determined according to the characteristic information based on the number machine learning model, it can be judged whether the label score is greater than or equal to the first threshold (for example, 70 points). If it is, the label of the telephone number is considered relatively accurate; conversely, if the label score is less than the first threshold, the label of the telephone number is considered relatively inaccurate.
Step S108: if the label score is greater than or equal to the first threshold, output the label information of the telephone number.
It is easy to see that in the prior art, users mark telephone numbers directly through third-party applications (for example, various security guard apps), and numbers marked purely on users' subjective judgment are not necessarily accurate. In the solution disclosed in the above embodiment of this application, the label information of a telephone number is scored according to the acquired characteristic information, and the label score corresponding to each number's label information determines whether that label information is output, which further increases the probability that telephone numbers are correctly labeled.
As can be seen from the above, in the above embodiment of this application, after a user's query request for a telephone number is received, the characteristic information of the requested telephone number can be obtained, and the label score of the telephone number can be determined from that characteristic information based on a number machine learning model trained in advance. The label information of the telephone number is output only when its label score is greater than or equal to the first threshold. This applies machine learning to number identification in order to improve the accuracy of telephone number identification, which reduces the occurrence of fraudulent calls and fraudulent text messages and improves the user experience. It thereby solves the technical problem in the prior art that number identification implemented through third-party applications and active user marking yields poor data quality.
It should be noted that a received telephone number may or may not already be labeled. As an optional embodiment, as shown in Fig. 2, determining the label score of the telephone number according to the characteristic information based on the number machine learning model may include the following steps:
Step S202: judge whether the telephone number is a labeled number, where a labeled number is a telephone number that already carries label information;
Step S204: if the telephone number is a labeled number, determine, based on the number machine learning model and according to the characteristic information, the label score for the telephone number being labeled with that label information.
Optionally, as shown in Fig. 2, the method may further include step S206: if the telephone number is not a labeled number, determine, based on the number machine learning model and according to the characteristic information, the label information of the telephone number and the label score for the telephone number being labeled with that label information.
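Steps S202 to S206 can be sketched as follows. This is only an illustration under stated assumptions: the `Model` callable signature, the `toy_model` stand-in and the `outgoing_frequency` feature key are hypothetical and do not come from the patent, which does not specify a programming interface.

```python
from typing import Callable, Optional, Tuple

# Hypothetical model interface (an assumption, not from the patent):
# given a feature dict and an optional existing label, return (label, score),
# where score is a label score in the range 0..100.
Model = Callable[[dict, Optional[str]], Tuple[str, float]]


def score_number(features: dict, existing_label: Optional[str],
                 model: Model) -> Tuple[str, float]:
    """Branch on whether the number already carries a label (step S202)."""
    if existing_label is not None:
        # S204: the number is labeled, so only score the existing label.
        _, score = model(features, existing_label)
        return existing_label, score
    # S206: the number is unlabeled, so the model proposes a label and scores it.
    return model(features, None)


def toy_model(features: dict, label: Optional[str]) -> Tuple[str, float]:
    """Stand-in for a trained model, used only to make the sketch runnable."""
    proposed = "harassing call" if features["outgoing_frequency"] > 50 else "courier service"
    target = label if label is not None else proposed
    score = 90.0 if target == proposed else 20.0
    return target, score
```

With this stand-in, an unlabeled number with a high outgoing frequency receives a proposed label and a high score, while an existing label that disagrees with the features receives a low score.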
Optionally, the method may further include the following step: Step S110: if the label score is less than the first threshold, withhold the label information of the telephone number from output.
It should be noted that if the label score of a telephone number is less than the first threshold, the label information of that number is not necessarily accurate, so its output can be suppressed. As an optional embodiment, withholding the label information of the telephone number when the label score is less than the first threshold may include the following steps:
Step 1: judge whether the label score is less than a second threshold, where the second threshold is less than the first threshold;
Step 2: if the label score is less than or equal to the second threshold, modify the label information of the telephone number.
Specifically, in the above steps, the second threshold is a label score that characterizes a telephone number as incorrectly labeled. The second threshold is less than the first threshold; for example, it can be 30 points. Thus, if the label score of a telephone number is less than the second threshold, the number can be determined to be incorrectly labeled, and its label information needs to be removed or replaced with the label information determined by the number machine learning model.
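The two-threshold logic described above can be sketched as a small decision function. The threshold values 70 and 30 are the examples given in the text; the function name and the returned action strings are illustrative assumptions. The text uses both "less than" and "less than or equal to" for the second threshold, and this sketch assumes the inclusive form.

```python
FIRST_THRESHOLD = 70.0   # example value from the text
SECOND_THRESHOLD = 30.0  # example value from the text


def decide(score: float,
           first: float = FIRST_THRESHOLD,
           second: float = SECOND_THRESHOLD) -> str:
    """Map a label score (0..100) to an action on the label information."""
    if second >= first:
        raise ValueError("the second threshold must be less than the first")
    if score >= first:
        return "output"    # label considered correct: output it
    if score <= second:
        return "modify"    # label considered wrong: remove or replace it
    return "withhold"      # in between: do not output the label
```

For example, `decide(85)` returns `"output"`, `decide(50)` returns `"withhold"` and `decide(10)` returns `"modify"`.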
Based on any of the above optional embodiments, as an optional embodiment, before the characteristic information of the telephone number is obtained, the method may further include: receiving a number query request uploaded by at least one client device, where the number query request includes at least one telephone number.
Optionally, based on the above embodiment, as an optional embodiment, after the number query request uploaded by at least one client is received, the method may further include: outputting the label information of the telephone number to the at least one client device through an online identification engine.
As a preferred embodiment, Fig. 3 is a schematic diagram of a preferred online interaction architecture that uses machine learning for number identification according to an embodiment of the present invention. As shown in Fig. 3, machine learning serves two main functions in number identification: first, verifying existing data; second, predicting labels for unidentified data. Data originates from requests initiated at the client. The client sends a data request to the online identification engine; after processing by the big data computing platform and the original system data platform, the result is returned to the online identification engine, which outputs it to the client for display. In addition, user-marked data is scored with machine learning. For example, if 10086 is marked as a "harassing call", the system analyzes the number features of 10086 to obtain the score for 10086 carrying the "harassing call" label. When the label score of a telephone number is below the second threshold (for example, 30 points), the system considers the user's label for that number inaccurate; when the label score is above the first threshold (for example, 70 points), the system considers the user's label correct. When the label score is greater than the second threshold (for example, 30 points) and less than the first threshold (for example, 70 points), the label information is not output to the client but is kept as part of the database.
Optionally, Fig. 4 is a schematic diagram of a preferred machine learning model used for number identification according to an embodiment of the present invention. As shown in Fig. 4, the online interface library scores each number and, following the system's discrimination logic, verifies the labels of labeled numbers and labels and scores unlabeled numbers. After the client initiates a request, the request is sent through the phone manufacturer's telephony interface to the online detection interface. The number library query logic then applies the threshold parameters configured for each category of number: when the score is above the first threshold, the label of the telephone number is output; when it is below the second threshold, the label of the telephone number is removed.
It should be noted that the number machine learning model for number identification can be built with any machine learning algorithm such as random forest, SVM (support vector machine), CNN (convolutional neural network) or logistic regression. The effectiveness of the machine learning model is measured as follows:
(1) True positive rate (TPR), calculated as TPR = TP / (TP + FN): the proportion of all positive examples that the classifier identifies as positive.
(2) False positive rate (FPR), calculated as FPR = FP / (FP + TN): the proportion of all negative examples that the classifier mistakes for positive.
In general, the larger the KS value, the better the model separates the positive and negative classes. KS > 0.2 indicates that the model has good predictive accuracy. According to the statistics in Table 1, the largest KS value is that of random forest, 0.59, followed by that of the support vector machine (SVM), 0.56. Models with higher KS and TPR values can be chosen, after which the labeled categories undergo manual accuracy verification.
Table 1
Machine learning model | KS value | TPR | FPR |
Random forest | 0.51 | 79.38% | 20.92% |
Support vector machine | 0.56 | 79.33% | 24.38% |
Convolutional neural network | 0.58 | 80.20% | 24.47% |
Logistic regression | 0.44 | 67.54% | 23.06% |
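The TPR and FPR formulas above translate directly into code. The sketch below also computes a KS value using the common definition, the maximum of TPR minus FPR over all score thresholds; the text does not give its own KS formula, so that definition is an assumption here, as are the function names. Both classes must be present in the labels, otherwise the divisions fail.

```python
def tpr_fpr(labels, predictions):
    """Return (TPR, FPR) for binary labels and predictions (1 = positive)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def ks_statistic(labels, scores):
    """Assumed KS definition: max over thresholds of (TPR - FPR)."""
    best = 0.0
    for threshold in sorted(set(scores)):
        # A sample is predicted positive when its score reaches the threshold.
        predictions = [1 if s >= threshold else 0 for s in scores]
        tpr, fpr = tpr_fpr(labels, predictions)
        best = max(best, tpr - fpr)
    return best
```

A model that ranks every positive example above every negative one reaches a KS value of 1, while a model whose scores carry no information stays near 0.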
It should be noted that in machine learning, samples are generally divided into three independent parts: a training set, a validation set and a test set. The training set is used to fit the model, the validation set is used to determine the network structure or the parameters that control model complexity, and the test set is used to check the performance of the finally selected model. Because the training set is used to build the model, it accounts for 50% of the total samples and the other two sets account for 25% each; all three parts are drawn at random from the samples. The behavior of the machine learning model therefore depends on the accuracy of the training data: the more truthful the labeled data, the closer the model parameters are to their true values. Unlike the datasets required by ordinary machine learning models, however, the dataset required here must be verified manually to guarantee its correctness. The training set therefore has two limitations: first, there is not enough training data; second, the training data is not accurate enough. Regarding the quantity of training data, we use the category label of each data item as the basis for classifying the data.
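The 50% / 25% / 25% random split described above can be sketched as follows. The function name and the fixed seed are illustrative assumptions; the text only states the proportions and that the three parts are drawn at random.

```python
import random


def split_samples(samples, seed=0):
    """Randomly split samples into training (50%), validation (25%), test (25%)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = len(shuffled) // 2
    n_val = len(shuffled) // 4
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # the remainder goes to the test set
    return train, validation, test
```

Because the three slices partition one shuffled list, no sample appears in more than one set.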
The purpose of building the number machine learning model is to learn the structure and essence of the problem from the original feature set, so the selected features should explain the problem well. The goals of feature selection are roughly as follows:
1. Improve prediction accuracy;
2. Build a faster prediction model that consumes fewer resources;
3. Make the model easier to understand and explain.
Feature selection is critical for training a machine learning model. More features mean that the classifier has more information available, but more is not always better: more features slow convergence and tend to introduce redundant features. Feature selection follows the three-point IIS rule:
I: Informative, that is, the feature contains useful information
I: Independent, that is, the features are independent of one another
S: Simple, that is, the information must be easy to extract and easy to understand
In summary, number identification uses features such as incoming call frequency (incoming frequency), outgoing call frequency (outgoing frequency), average call duration (average talk time), average outgoing call duration (average outgoing time) and average incoming call duration (average incoming time) as the model features.
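As an illustration of how these five features might be derived from raw call records, the sketch below assumes a hypothetical `Call` record and defines frequency as calls per day over an observation window. The patent names the features but does not say how they are computed, so the record layout, the units and the dictionary keys are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Call:
    incoming: bool    # True if the number received the call, False if it placed it
    duration: float   # call duration in seconds (assumed unit)


def extract_features(calls: List[Call], days: float) -> Dict[str, float]:
    """Build the five model features from call records observed over `days` days."""
    incoming = [c.duration for c in calls if c.incoming]
    outgoing = [c.duration for c in calls if not c.incoming]

    def mean(values: List[float]) -> float:
        return sum(values) / len(values) if values else 0.0

    return {
        "incoming_frequency": len(incoming) / days,
        "outgoing_frequency": len(outgoing) / days,
        "average_talk_time": mean(incoming + outgoing),
        "average_outgoing_time": mean(outgoing),
        "average_incoming_time": mean(incoming),
    }
```

The resulting dictionary can be turned into a fixed-order vector before it is passed to whichever model is chosen.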
As a preferred embodiment, considering the KS and TPR performance of random forest, support vector machine, CNN and logistic regression together, CNN can be chosen as the machine learning model for number identification.
The process of building the number machine learning model is introduced below, taking CNN as an example.
First, Table 2 lists the technical terms related to CNN.
Fig. 5 is a schematic diagram of the overall model of an optional machine learning approach based on convolutional neural networks according to an embodiment of the present invention. As shown in Fig. 5, it includes the following parts:
(1) Data import (import data)
The training set required for number identification is loaded in Python. This data comes from the single-category data accumulated in the database of Teddy Bear Mobile (泰迪熊移动). The data import code is as follows:
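# Import data
import numpy

# read_train_file, read_test_file and FLAGS are defined elsewhere in the program.
# rec_col and lab_col are the record and label widths used as tensor dimensions below.
rec, rec_col, lab, lab_col = read_train_file(FLAGS.data_dir)
rec_nd = numpy.array(rec)  # training records as an ndarray
lab_nd = numpy.array(lab)  # training labels as an ndarray
# print(type(rec), rec[0])
# print(type(lab), lab[0])
# print(type(rec_nd), len(rec_nd), rec_nd[0])
# print(type(lab_nd), len(lab_nd), lab_nd[0])
# sys.exit()
rec_test, rec_col_test, lab_test, lab_col_test = read_test_file(FLAGS.data_dir)
rec_test_nd = numpy.array(rec_test)  # test records as an ndarray
lab_test_nd = numpy.array(lab_test)  # test labels as an ndarray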
(2) Create nodes (create op)
The computation graph is built by first creating nodes for the input data and for the target output classes. The weight W and the bias b define the mapping between the two. Here x and y_ are not specific values. During initialization, a placeholder is used to create an rec_col-dimensional vector of data type float32. None means that the size is not fixed; as the first dimension it stands for the batch size, which means the number of x rows is not fixed. Similarly, in y_, None means that the size is not fixed and, as the first dimension, stands for the batch size, so the number of y_ rows is not fixed. lab_col is the value of the second dimension, and None and lab_col together form the two-dimensional shape used with the bias b. The shape parameter is optional, but when it is given, errors caused by inconsistent data dimensions are caught automatically. The code for creating the nodes is as follows:
(3) First convolutional layer (First Convolutional Layer)
The first-layer convolution kernel is 5*5, the number of input channels is 1, and the output is 32 feature maps of size 16*16. x_image is convolved with the weight tensor W_conv1, the bias term b_conv1 is added, the ReLU activation function is applied, and finally 2*2 max pooling is performed.
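The three operations of this layer (convolution, ReLU, 2*2 max pooling) can be illustrated in plain Python for a single channel and a single kernel. This is a didactic sketch, not the patent's implementation: it assumes zero padding that preserves the input size and a 32*32 input, chosen only so that pooling yields the 16*16 output mentioned above. As in most deep learning frameworks, the "convolution" is computed as a cross-correlation.

```python
def conv2d_same(image, kernel, bias=0.0):
    """Single-channel 2D convolution with zero padding (output size = input size)."""
    height, width = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = bias
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < height and 0 <= x < width:  # zero padding outside
                        total += image[y][x] * kernel[di][dj]
            out[i][j] = total
    return out


def relu(feature_map):
    """Element-wise rectified linear unit: max(0, v)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Non-overlapping 2*2 max pooling; halves each spatial dimension."""
    return [
        [max(feature_map[i][j], feature_map[i][j + 1],
             feature_map[i + 1][j], feature_map[i + 1][j + 1])
         for j in range(0, len(feature_map[0]) - 1, 2)]
        for i in range(0, len(feature_map) - 1, 2)
    ]
```

A real layer repeats this for each of the 32 kernels and sums over input channels; a framework performs the same computation in vectorized form.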
(4) Second convolutional layer (Second Convolutional Layer)
The second-layer convolution kernel is 5*5 and the number of input channels is 1. x_image is convolved with the weight tensor W_conv1, the bias term b_conv1 is added, the ReLU activation function is applied, and finally max pooling is performed. The output is 64 tensors of size 16*16.
(5) Fully connected layer (Densely Connected Layer)
The data size is reduced to 1x1, and a fully connected layer with 1024 neurons is added to process the data. The tensor output by the pooling layer is reshaped into a batch of vectors, multiplied by a weight matrix, added to a bias, and then passed through the ReLU function (rectified linear unit activation function).
(6) Dropout Layer
To reduce overfitting, dropout is added before the output layer. A placeholder represents the probability that a neuron's output is kept during dropout. This allows dropout to be enabled during training and disabled during testing. In addition to masking neuron outputs, the dropout operation automatically handles the scale of the neuron output values, so no separate scaling needs to be considered when dropout is used.
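The automatic scaling mentioned above is commonly implemented as "inverted dropout": kept outputs are divided by the keep probability so that the expected value of each output is unchanged. The sketch below is a plain-Python illustration under that assumption; the function name dropout and its signature are hypothetical and do not reproduce any framework's API.

```python
import random
from typing import List, Optional


def dropout(
    values: List[float],
    keep_prob: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Inverted dropout: keep each value with probability keep_prob.

    Kept values are scaled by 1 / keep_prob; dropped values become 0.0.
    With keep_prob == 1.0 (the test-time setting) the input is unchanged.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    rng = rng or random.Random()
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


neuron_outputs = [0.5, 1.0, 2.0, 4.0]

# Training: roughly half of the outputs are zeroed, the rest are doubled.
train_out = dropout(neuron_outputs, keep_prob=0.5, rng=random.Random(0))

# Testing: dropout is switched off by feeding keep_prob = 1.0.
test_out = dropout(neuron_outputs, keep_prob=1.0)
```

Feeding keep_prob as an input rather than hard-coding it is what lets the same graph serve both training and testing.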
(7) Readout Layer
The traditional machine learning model logistic regression is a special case of softmax (if and only if there are just two classes); softmax handles multi-class problems. A softmax layer is added here for output: the data from the dropout layer is taken as the input x, and the output-layer data Y is obtained by Y = wx + b.
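The relationship between logistic regression and softmax can be checked numerically. In the sketch below, readout computes Y = wx + b for a small hypothetical weight matrix, and softmax over the two logits [z, 0] gives the same first probability as the logistic (sigmoid) function applied to z. All names and numbers are illustrative assumptions.

```python
import math
from typing import List


def readout(w: List[List[float]], x: List[float], b: List[float]) -> List[float]:
    """Linear readout Y = wx + b, one logit per output class."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z: float) -> float:
    """Logistic function used by two-class logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical two-class readout: the second class has logit 0.
w = [[1.0, -1.0], [0.0, 0.0]]
b = [0.0, 0.0]
x = [2.0, 0.5]

logits = readout(w, x, b)  # [1.5, 0.0]
probs = softmax(logits)

# With two classes, softmax reduces to the logistic function.
assert abs(probs[0] - sigmoid(logits[0] - logits[1])) < 1e-12
```

With more than two classes the same softmax function applies unchanged, which is why it is used for the multi-class output layer.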
(8) Train and Evaluate the Model
Training and evaluation follow the approach used for a single-layer softmax neural network model, except that the more sophisticated ADAM optimizer is used for gradient-descent optimization, and the additional parameter keep_prob is added to feed_dict to control the dropout rate. A log entry is then output every 100 iterations.
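To make the training loop concrete without a deep learning framework, the sketch below applies the standard Adam update rule to a toy one-dimensional objective, f(w) = (w - 3)^2, and records a log line every 100 iterations. The objective, the hyperparameter values, and the function name adam_minimize are assumptions for illustration; the patent's model is a CNN, not this toy problem.

```python
import math
from typing import Callable, List, Tuple


def adam_minimize(
    grad: Callable[[float], float],
    w0: float,
    steps: int = 1000,
    lr: float = 0.1,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
    log_every: int = 100,
) -> Tuple[float, List[str]]:
    """Minimize a scalar function with Adam, logging every log_every steps."""
    w, m, v = w0, 0.0, 0.0
    log: List[str] = []
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)           # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
        if t % log_every == 0:
            log.append(f"step {t}, w = {w:.4f}")
    return w, log


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); minimum at w = 3.
final_w, log_lines = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
for line in log_lines:
    print(line)
```

In a real training run the logged quantity would typically be the training accuracy or loss on the current batch rather than the parameter value.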
The scheme provided by the above embodiments of the present application applies machine learning to the field of number identification and opens a new path for its development. Applying a machine learning model in the number field has the following advantages:
First, it greatly benefits the process of producing labels: production no longer depends only on active marking by platform users and on data provided by third-party platforms, which significantly improves the efficiency of data production.
Second, machine learning serves to verify labels during label production. This kind of feedback regulation is essential to the stability of a system, and it improves the quality and correctness of the data.
In addition, applying machine learning in the field of number identification allows the value of the data to be fully mined and put to use.
According to an embodiment of the present invention, a system embodiment for implementing the above number identification method is also provided. Fig. 6 is a schematic diagram of a number identification system according to an embodiment of the present invention. As shown in Fig. 6, the system includes at least one client device 601 and a server 603.
The at least one client device 601 is configured to send a number inquiry request, where the number inquiry request includes at least one telephone number.
The server 603 communicates with the at least one client device and is configured to: obtain the characteristic information of the telephone number; determine, based on a number machine learning model, the label score of the telephone number according to the characteristic information; judge whether the label score is greater than or equal to a first threshold; and, when the label score is greater than or equal to the first threshold, output the label information of the telephone number. The label score characterizes the probability that the telephone number is labeled correctly.
As can be seen from the above, in the above embodiments of the present application, the client device 601 sends a number inquiry request for any telephone number to the server 603. After receiving the user's inquiry request for the telephone number, the server 603 obtains the characteristic information of the telephone number in the inquiry request and, based on a number machine learning model obtained through prior training, determines the label score of the telephone number according to that characteristic information. The label information of the telephone number is output only when its label score is greater than or equal to the first threshold. This achieves the purpose of applying machine learning to number identification to improve the accuracy of telephone number identification, thereby achieving the technical effects of reducing fraudulent calls and fraudulent text messages and improving user experience, and in turn solves the technical problem of poor data quality in prior-art schemes, which implement number identification through third-party applications and active marking by users.
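The server-side flow described above can be sketched in a few lines. Everything below is an illustrative assumption rather than the patented implementation: the feature lookup and the model are stubs, the telephone numbers are placeholders, and the threshold value 0.8 is hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Features = Dict[str, float]
# A model maps features to (label information, label score in [0, 1]).
Model = Callable[[Features], Tuple[str, float]]


def handle_query(
    number: str,
    get_features: Callable[[str], Features],
    model: Model,
    first_threshold: float,
) -> Optional[str]:
    """Return the label for a queried number, or None if it is withheld."""
    features = get_features(number)       # obtain characteristic information
    label, score = model(features)        # determine the label score
    if score >= first_threshold:          # compare with the first threshold
        return label                      # output the label information
    return None


def stub_features(number: str) -> Features:
    """Stand-in for a real feature store (hypothetical data)."""
    table = {"555-0100": {"outgoing_freq": 120.0, "avg_call_duration": 8.0}}
    return table.get(number, {"outgoing_freq": 1.0, "avg_call_duration": 90.0})


def stub_model(features: Features) -> Tuple[str, float]:
    """Stand-in for a trained number machine learning model."""
    score = 0.95 if features["outgoing_freq"] > 50.0 else 0.40
    return "suspected fraud", score


print(handle_query("555-0100", stub_features, stub_model, first_threshold=0.8))
print(handle_query("555-0199", stub_features, stub_model, first_threshold=0.8))
```

The first query prints the label because its score clears the threshold; the second prints None because a low-confidence label is withheld rather than shown to the user.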
In an optional embodiment, the server 603 is further configured to judge whether the telephone number is a marked number, where a marked number is a telephone number that already carries label information; and, when the telephone number is a marked number, to determine, based on the number machine learning model and according to the characteristic information, the label score with which the telephone number is labeled with that label information.
In an optional embodiment, the server 603 is further configured to, when the telephone number is not a marked number, determine, based on the number machine learning model and according to the characteristic information, the label information of the telephone number and the label score with which the telephone number is labeled with that label information.
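The two branches above (marked and unmarked numbers) differ only in where the label comes from. The sketch below assumes, purely for illustration, that the model yields a probability for each candidate label; the function name score_number and the label names are hypothetical.

```python
from typing import Dict, Optional, Tuple


def score_number(
    label_probs: Dict[str, float],
    existing_label: Optional[str],
) -> Tuple[str, float]:
    """Return (label information, label score) for one telephone number.

    label_probs is assumed to hold the model's probability for each
    candidate label and must not be empty.
    """
    if existing_label is not None:
        # Marked number: score the label the number already carries.
        return existing_label, label_probs.get(existing_label, 0.0)
    # Unmarked number: let the model choose the label and report its score.
    best = max(label_probs, key=lambda name: label_probs[name])
    return best, label_probs[best]


probs = {"fraud": 0.7, "delivery": 0.2, "normal": 0.1}
print(score_number(probs, existing_label="delivery"))  # ('delivery', 0.2)
print(score_number(probs, existing_label=None))        # ('fraud', 0.7)
```

In the marked case a low score signals that the existing label may be wrong, which is what allows the model to act as a check on user-supplied or third-party labels.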
In an optional embodiment, the server 603 is further configured to prohibit output of the label information of the telephone number when the label score is less than the first threshold.
In an optional embodiment, the server 603 is further configured to judge whether the label score is less than a second threshold, where the second threshold is less than the first threshold; and to modify the label information of the telephone number when the label score is less than or equal to the second threshold.
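Taken together, the two thresholds split the score range into three regions. A minimal sketch of that decision, with hypothetical threshold values and a hypothetical Action enumeration, might look like this:

```python
from enum import Enum


class Action(Enum):
    OUTPUT = "output"      # show the label to the client
    SUPPRESS = "suppress"  # withhold the label, leave it unchanged
    MODIFY = "modify"      # withhold the label and mark it for correction


def decide(score: float, first_threshold: float, second_threshold: float) -> Action:
    """Map a label score to an action using the two thresholds."""
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be less than first_threshold")
    if score >= first_threshold:
        return Action.OUTPUT
    if score <= second_threshold:
        return Action.MODIFY
    return Action.SUPPRESS


# Hypothetical thresholds; the source does not give numeric values.
FIRST, SECOND = 0.8, 0.3
for s in (0.9, 0.5, 0.2):
    print(s, decide(s, FIRST, SECOND).value)
```

Scores between the two thresholds are treated as "not confident enough to show, not wrong enough to change", which keeps uncertain labels out of sight without discarding them.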
In an optional embodiment, the server 603 is further configured to receive a number inquiry request uploaded by at least one client device, where the number inquiry request includes at least one telephone number.
In an optional embodiment, the server 603 is further configured to output the label information of the telephone number to the at least one client device through an online identification engine.
Based on any one of the above optional system embodiments, as an optional implementation, the features of the telephone number include at least one of the following: incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration, and average incoming call duration.
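As an illustration of how such features might be derived, the sketch below computes them from a list of call records for one number. The CallRecord structure, the choice of "calls per day" as the definition of frequency, and the feature names are assumptions; the source does not define how the features are measured.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CallRecord:
    incoming: bool     # True if the profiled number received the call
    duration_s: float  # call duration in seconds


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def extract_features(records: List[CallRecord], window_days: float) -> Dict[str, float]:
    """Compute the five call features over an observation window."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    incoming = [r.duration_s for r in records if r.incoming]
    outgoing = [r.duration_s for r in records if not r.incoming]
    return {
        "incoming_freq": len(incoming) / window_days,   # calls per day
        "outgoing_freq": len(outgoing) / window_days,   # calls per day
        "avg_call_duration": _mean(incoming + outgoing),
        "avg_outgoing_duration": _mean(outgoing),
        "avg_incoming_duration": _mean(incoming),
    }


records = [
    CallRecord(incoming=True, duration_s=60.0),
    CallRecord(incoming=False, duration_s=30.0),
    CallRecord(incoming=False, duration_s=90.0),
]
features = extract_features(records, window_days=2)
print(features)
```

A number that places many short outgoing calls and receives almost none would stand out on these features, which is the kind of pattern a classifier can learn from.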
Based on any one of the above optional system embodiments, as an optional implementation, the number machine learning model uses any one of the following machine learning algorithms: the random forest algorithm, the support vector machine algorithm, the convolutional neural network algorithm, and the logistic regression algorithm.
According to an embodiment of the present invention, a device embodiment for implementing the above number identification method is also provided. Fig. 7 is a schematic diagram of a number identification device according to an embodiment of the present invention. As shown in Fig. 7, the device includes an acquiring unit 701, a determination unit 703, a judging unit 705, and a first execution unit 707.
The acquiring unit 701 is configured to obtain the characteristic information of a telephone number;
the determination unit 703 is configured to determine, based on a number machine learning model, the label score of the telephone number according to the characteristic information, where the label score characterizes the probability that the telephone number is labeled correctly;
the judging unit 705 is configured to judge whether the label score is greater than or equal to a first threshold;
the first execution unit 707 is configured to output the label information of the telephone number when the label score is greater than or equal to the first threshold.
It should be noted here that the acquiring unit 701, the determination unit 703, the judging unit 705, and the first execution unit 707 correspond to steps S102 to S108 in the method embodiment. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to the content disclosed in the above method embodiment. It should also be noted that, as part of the device, these modules can be executed in a computer system such as a set of computer-executable instructions.
As can be seen from the above, in the above embodiments of the present application, after a user's inquiry request for a telephone number is received, the acquiring unit 701 obtains the characteristic information of the telephone number in the inquiry request; the determination unit 703 determines, based on a number machine learning model obtained through prior training, the label score of the telephone number according to that characteristic information; the judging unit 705 judges whether the label score of the telephone number is greater than or equal to the first threshold; and the first execution unit 707 outputs the label information of the telephone number only when the label score is greater than or equal to the first threshold. This achieves the purpose of applying machine learning to number identification to improve the accuracy of telephone number identification, thereby achieving the technical effects of reducing fraudulent calls and fraudulent text messages and improving user experience, and in turn solves the technical problem of poor data quality in prior-art schemes, which implement number identification through third-party applications and active marking by users.
In an optional embodiment, the determination unit includes: a first judgment module, configured to judge whether the telephone number is a marked number, where a marked number is a telephone number that already carries label information; and a first determining module, configured to, when the telephone number is a marked number, determine, based on the number machine learning model and according to the characteristic information, the label score with which the telephone number is labeled with that label information.
In an optional embodiment, the determination unit further includes: a second determining module, configured to, when the telephone number is not a marked number, determine, based on the number machine learning model and according to the characteristic information, the label information of the telephone number and the label score with which the telephone number is labeled with that label information.
In an optional embodiment, the device further includes: a second execution unit, configured to prohibit output of the label information of the telephone number when the label score is less than the first threshold.
In an optional embodiment, the second execution unit includes: a second judgment module, configured to judge whether the label score is less than a second threshold, where the second threshold is less than the first threshold; and a modification module, configured to modify the label information of the telephone number when the label score is less than or equal to the second threshold.
In an optional embodiment, the device further includes: a receiving unit, configured to receive a number inquiry request uploaded by at least one client device, where the number inquiry request includes at least one telephone number.
In an optional embodiment, the device further includes: an output unit, configured to output the label information of the telephone number to the at least one client device through an online identification engine.
Based on any one of the above optional device embodiments, as an optional implementation, the features of the telephone number include at least one of the following: incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration, and average incoming call duration.
Based on any one of the above optional device embodiments, as an optional implementation, the number machine learning model uses any one of the following machine learning algorithms: the random forest algorithm, the support vector machine algorithm, the convolutional neural network algorithm, and the logistic regression algorithm.
According to an embodiment of the present invention, a storage medium is also provided. The storage medium includes a stored program, where the program executes any one of the optional or preferred number identification methods of the above method embodiments.
According to an embodiment of the present invention, a processor is also provided. The processor is configured to run a program, where the program, when run, executes any one of the optional or preferred number identification methods of the above method embodiments.
The above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A number identification method, characterized by comprising:
obtaining characteristic information of a telephone number;
determining, based on a number machine learning model, a label score of the telephone number according to the characteristic information, wherein the label score characterizes the probability that the telephone number is labeled correctly;
judging whether the label score is greater than or equal to a first threshold;
outputting label information of the telephone number when the label score is greater than or equal to the first threshold.
2. The method according to claim 1, characterized in that determining, based on the number machine learning model, the label score of the telephone number according to the characteristic information comprises:
judging whether the telephone number is a marked number, wherein a marked number is a telephone number that already carries label information;
when the telephone number is a marked number, determining, based on the number machine learning model and according to the characteristic information, the label score with which the telephone number is labeled with the label information.
3. The method according to claim 2, characterized in that, after judging whether the telephone number is a marked number, the method further comprises:
when the telephone number is not a marked number, determining, based on the number machine learning model and according to the characteristic information, the label information of the telephone number and the label score with which the telephone number is labeled with the label information.
4. The method according to claim 1, characterized in that, after judging whether the label score is greater than or equal to the first threshold, the method further comprises:
prohibiting output of the label information of the telephone number when the label score is less than the first threshold.
5. The method according to claim 4, characterized in that prohibiting output of the label information of the telephone number when the label score is less than the first threshold comprises:
judging whether the label score is less than a second threshold, wherein the second threshold is less than the first threshold;
modifying the label information of the telephone number when the label score is less than or equal to the second threshold.
6. The method according to claim 1, characterized in that, before obtaining the characteristic information of the telephone number, the method further comprises:
receiving a number inquiry request uploaded by at least one client device, wherein the number inquiry request includes at least one telephone number.
7. The method according to claim 6, characterized in that, after receiving the number inquiry request uploaded by the at least one client device, the method comprises:
outputting the label information of the telephone number to the at least one client device through an online identification engine.
8. The method according to any one of claims 1 to 7, characterized in that the features of the telephone number include at least one of the following: incoming call frequency, outgoing call frequency, average call duration, average outgoing call duration, and average incoming call duration.
9. The method according to any one of claims 1 to 7, characterized in that the number machine learning model uses any one of the following machine learning algorithms: the random forest algorithm, the support vector machine algorithm, the convolutional neural network algorithm, and the logistic regression algorithm.
10. A number identification system, characterized by comprising:
at least one client device, configured to send a number inquiry request, wherein the number inquiry request includes at least one telephone number;
a server, communicating with the at least one client device and configured to obtain characteristic information of the telephone number, determine, based on a number machine learning model, a label score of the telephone number according to the characteristic information, judge whether the label score is greater than or equal to a first threshold, and output label information of the telephone number when the label score is greater than or equal to the first threshold, wherein the label score characterizes the probability that the telephone number is labeled correctly.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810135848.3A CN108449482A (en) | 2018-02-09 | 2018-02-09 | The method and system of Number Reorganization |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108449482A (en) | 2018-08-24 |
Family
ID=63192162
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810135848.3A Pending CN108449482A (en) | 2018-02-09 | 2018-02-09 | The method and system of Number Reorganization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108449482A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20050042546A (en) * | 2003-11-03 | 2005-05-10 | 엄장필 | System and method for managing spam mail |
CN104320525A (en) * | 2014-09-19 | 2015-01-28 | 小米科技有限责任公司 | Method and device for identifying telephone number |
CN104378480A (en) * | 2013-11-15 | 2015-02-25 | 上海触乐信息科技有限公司 | Phone number marking method and system |
CN104683538A (en) * | 2015-02-13 | 2015-06-03 | 广州市讯飞樽鸿信息技术有限公司 | Harassment telephone number library construction method and system |
CN104717674A (en) * | 2014-12-02 | 2015-06-17 | 北京奇虎科技有限公司 | Number attribute recognition method and device, terminal and server |
CN105450825A (en) * | 2015-11-24 | 2016-03-30 | 小米科技有限责任公司 | Communication number classification mark method and device |
JP2016058018A (en) * | 2014-09-12 | 2016-04-21 | キヤノン株式会社 | Image processing method, image processing program and image processor |
CN106550356A (en) * | 2016-09-23 | 2017-03-29 | 深圳市金立通信设备有限公司 | A kind of method and its device for determining caller ID type |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111131593A (en) * | 2018-11-01 | 2020-05-08 | 百度在线网络技术(北京)有限公司 | Crank call identification method and device |
CN109688275A (en) * | 2018-12-27 | 2019-04-26 | 中国联合网络通信集团有限公司 | Harassing call recognition methods, device and storage medium |
WO2020134523A1 (en) * | 2018-12-29 | 2020-07-02 | 中兴通讯股份有限公司 | User identification method and device |
CN110086943A (en) * | 2019-04-29 | 2019-08-02 | 北京羽乐创新科技有限公司 | Number monitoring method and device |
CN110113471A (en) * | 2019-04-29 | 2019-08-09 | 北京羽乐创新科技有限公司 | Number monitoring method and device |
CN110113471B (en) * | 2019-04-29 | 2021-08-31 | 北京羽乐创新科技有限公司 | Number monitoring method and device |
CN110519466A (en) * | 2019-08-30 | 2019-11-29 | 北京泰迪熊移动科技有限公司 | A kind of express delivery number identification method, equipment and computer storage medium |
CN110855843A (en) * | 2019-10-12 | 2020-02-28 | 中国平安财产保险股份有限公司 | Method and device for displaying incoming call of virtual number, storage medium and electronic equipment |
CN111131629A (en) * | 2019-12-31 | 2020-05-08 | 宇龙计算机通信科技(深圳)有限公司 | Crank call processing method and device, storage medium and terminal |
CN111582786A (en) * | 2020-04-29 | 2020-08-25 | 上海中通吉网络技术有限公司 | Express bill number identification method, device and equipment based on machine learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108449482A (en) | The method and system of Number Reorganization | |
CN107306306A (en) | Communicating number processing method and processing device | |
CN104683537B (en) | A kind of number mark method and device | |
CN109190351A (en) | On-line signature person identity authorization system based on mobile terminal, device and method | |
CN109688275A (en) | Harassing call recognition methods, device and storage medium | |
CN110493476B (en) | Detection method, device, server and storage medium | |
CN108256591A (en) | For the method and apparatus of output information | |
CN110401780A (en) | A kind of method and device identifying fraudulent call | |
CN111970400B (en) | Crank call identification method and device | |
CN107993142A (en) | A kind of anti-risk of fraud control system of finance | |
CN109840778A (en) | The recognition methods of fraudulent user and device, readable storage medium storing program for executing | |
CN110381218A (en) | A kind of method and device identifying telephone fraud clique | |
CN108153727A (en) | Utilize the method for semantic mining algorithm mark sales calls and the system of improvement sales calls | |
Jamalian et al. | A hybrid data mining method for customer churn prediction | |
CN115034305A (en) | Method, system and storage medium for identifying fraudulent users in a speech network using a human-in-loop neural network | |
CN109474755A (en) | Abnormal phone active predicting method and system based on sequence study and integrated study | |
CN110139288B (en) | Network communication method, device, system and recording medium | |
CN114449106B (en) | Method, device, equipment and storage medium for identifying abnormal telephone number | |
CN116418915A (en) | Abnormal number identification method, device, server and storage medium | |
CN108898167A (en) | It breaks one's promise the display methods and device of number | |
CN109978302A (en) | A kind of credit-graded approach and equipment | |
CN108564380B (en) | Telecommunication user classification method based on iterative decision tree | |
CN113988190A (en) | Customer intention analysis method, apparatus, device and storage medium | |
CN110442799B (en) | Scheme pushing method, device and equipment based on data management platform | |
CN109583203B (en) | Malicious user detection method, device and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180824 |