WO2017185862A1 - Method, apparatus and device for identifying malicious call and establishing identification model - Google Patents

Method, apparatus and device for identifying malicious call and establishing identification model

Info

Publication number
WO2017185862A1
Authority
WO
WIPO (PCT)
Prior art keywords
call
sample
parameter
model
feature
Prior art date
Application number
PCT/CN2017/074169
Other languages
French (fr)
Chinese (zh)
Inventor
李靖
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2017185862A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/66 Substation equipment, e.g. for use by subscribers with means for preventing unauthorised or fraudulent calling
    • H04M1/663 Preventing unauthorised calls to a telephone set
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Arrangements for supervision, monitoring or testing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/436 Arrangements for screening incoming calls, i.e. evaluating the characteristics of a call before deciding whether to answer it
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering

Definitions

  • The present invention relates to the field of communications, and in particular to a method, an apparatus, a device, and a storage medium for identifying a malicious call and establishing a recognition model.
  • In the prior art, malicious calls are identified using blacklist technology.
  • The process includes: obtaining the phone number of the current call, determining whether the phone number is on a preset blacklist, and if so, determining that the current call is a malicious call.
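  • As a minimal sketch of the blacklist check described above, the prior-art test reduces to a set lookup. The function name `is_blacklisted` and the sample numbers are hypothetical and are not taken from the patent.

```python
def is_blacklisted(phone_number, blacklist):
    """Prior-art check: flag a call only if its number is on the preset blacklist."""
    return phone_number in blacklist


# Hypothetical preset blacklist containing a single number.
preset_blacklist = {"+8613800000000"}

print(is_blacklisted("+8613800000000", preset_blacklist))
print(is_blacklisted("+8613800000001", preset_blacklist))
```

  • In this sketch a number that is not yet on the list is never flagged, regardless of what is said during the call.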
  • However, when the above method is applied, the accuracy of identifying malicious calls is low.
  • To solve at least one problem existing in the prior art, the embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for identifying a malicious call and establishing a recognition model, which can greatly improve recognition accuracy and respond faster.
  • An embodiment of the present invention provides a method for identifying a malicious call, where the method includes:
  • the first call event is a call event between a first user and a second user;
  • the feature parameter includes a parameter for describing a call voice feature, wherein the parameter describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of first keywords in that text;
  • the recognition result of the first call event is output.
  • An embodiment of the present invention provides a method for establishing a recognition model, where the method includes:
  • a sample type of the sample includes a positive sample and a negative sample, the positive sample being a sample belonging to a malicious call and the negative sample being a sample not belonging to a malicious call;
  • the feature parameter includes a parameter for describing a call voice feature;
  • the parameter describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of first keywords in that text;
  • the model is output as a preset recognition model.
  • An embodiment of the present invention provides a device for identifying a malicious call, where the device includes: a first acquiring unit, an identifying unit, a second acquiring unit, and a first output unit, where
  • the first acquiring unit is configured to acquire a feature parameter of a first call event, where the first call event is a call event between a first user and a second user, and the feature parameter includes a parameter for describing a call voice feature, wherein the parameter describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of first keywords in that text;
  • the identifying unit is configured to identify the first call event according to the feature parameter of the first call event and the currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter;
  • the second acquiring unit is configured to acquire the recognition result of the first call event identified by the recognition model;
  • the first output unit is configured to output the recognition result of the first call event.
  • An embodiment of the present invention provides a device for establishing a recognition model, where the device includes: a second determining unit, a third acquiring unit, a training unit, a determining unit, an adjusting unit, and a second output unit, where
  • the second determining unit is configured to determine a sample type of a sample, where the sample type includes a positive sample and a negative sample, the positive sample being a sample belonging to a malicious call and the negative sample being a sample not belonging to a malicious call;
  • the third acquiring unit is configured to acquire a feature parameter of the sample, where the feature parameter includes a parameter for describing a call voice feature, and the parameter describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of first keywords in that text;
  • the training unit is configured to obtain a training result output by a training model according to the feature parameter of the sample and a set training model, where the training model uses the feature parameter as a classification parameter;
  • the determining unit is configured to determine whether the training result matches the sample type of the sample;
  • the adjusting unit is configured to, when the training result does not match the sample type of the sample, adjust the model parameters of the training model until the training result matches the sample type of the sample;
  • the second output unit is configured to output, as a preset recognition model, the training model whose training result matches the sample type of the sample.
  • An embodiment of the present invention provides a device for identifying a malicious call, where the device includes a first processor and a first external communication interface, or the device includes a first processor and a display screen;
  • the first processor is configured to: acquire a feature parameter of a first call event, where the first call event is a call event between a first user and a second user, the feature parameter includes a parameter for describing a call voice feature, and the parameter describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of first keywords in that text; identify the first call event according to the feature parameter of the first call event and the currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter; and acquire the recognition result of the first call event identified by the recognition model;
  • the recognition result of the first call event is output through the first external communication interface, or displayed on the display screen.
  • An embodiment of the present invention provides a device for establishing a recognition model, where the device includes: a second processor and a second external communication interface, where
  • the second processor is configured to: determine a sample type of a sample, where the sample type includes a positive sample and a negative sample, the positive sample being a sample belonging to a malicious call and the negative sample being a sample not belonging to a malicious call; acquire a feature parameter of the sample, where the feature parameter includes a parameter for describing a call voice feature, and the parameter describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of first keywords in that text; obtain a training result output by a training model according to the feature parameter of the sample and a set training model, where the training model uses the feature parameter as a classification parameter; determine whether the training result matches the sample type of the sample; and, if the training result does not match the sample type of the sample, adjust the model parameters of the training model until the training result matches the sample type of the sample;
  • the second external communication interface is configured to output, as a preset recognition model, the training model whose training result matches the sample type of the sample.
  • An embodiment of the present invention provides a computer storage medium storing computer-executable instructions, where the computer-executable instructions are configured to perform the methods for identifying a malicious call and for establishing a recognition model provided by the embodiments of the present invention.
  • The embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for identifying a malicious call and establishing a recognition model.
  • The method for identifying a malicious call includes: acquiring a feature parameter of a first call event, where the first call event is a call event between a first user and a second user and the feature parameter includes a parameter for describing a call voice feature; identifying the first call event according to the feature parameter of the first call event and the currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter; acquiring the recognition result of the first call event identified by the recognition model; and outputting the recognition result of the first call event.
  • Because the parameter describing the call voice feature is used as the identification criterion, malicious call events can be identified accurately, and the recognition result is output to alert the user to fraud, which can greatly reduce the user's economic loss. In addition, while the recognition model is being established, the training model is trained continuously and its model parameters are adjusted according to the training results, so that the final training model achieves the best sample recognition accuracy, which improves the accuracy of identifying malicious calls.
  • FIG. 1 is a schematic diagram of an implementation environment according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of an implementation process of a method for identifying a malicious call according to an embodiment of the present invention.
  • FIG. 3A is a schematic diagram of a first implementation process of a method for identifying a malicious call according to an embodiment of the present invention.
  • FIG. 3B is a schematic diagram of a second implementation process of a method for identifying a malicious call according to an embodiment of the present invention.
  • FIG. 3C is a schematic diagram of a third implementation process of a method for identifying a malicious call according to an embodiment of the present invention.
  • FIG. 3D is a schematic flowchart of a fourth implementation process of a method for identifying a malicious call according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a device for identifying a malicious call according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a device for establishing a recognition model according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of hardware components of a device for identifying a malicious call according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of hardware components of a device for establishing a recognition model according to an embodiment of the present invention.
  • The implementation environment includes a first terminal 11, a second terminal 12, and a server 13 disposed on the network side. The first terminal 11 and the second terminal 12 exchange information through the server on the network side, and one form of this information exchange may be a voice call.
  • The embodiments of the present invention relate to a voice call scenario between terminals.
  • The first terminal 11 or the second terminal 12 may be a mobile terminal such as a mobile phone or a tablet computer, or a fixed terminal such as a fixed-line telephone.
  • A client with a call function runs on both the first terminal 11 and the second terminal 12.
  • The client can also record the call behavior of the terminal on which it runs, such as the numbers of both parties and the call time, and can cache the call voice information of the current call. In this way, the first terminal 11 and the second terminal 12 can determine the call event between the two users in the following embodiments and extract the feature parameters of the call event. The client may be an application client or a web client.
  • The type of the call includes, but is not limited to, a voice call or a video call.
  • The server 13 is provided by an operator, and may be a single server, a server cluster composed of multiple servers, or a cloud computing service center.
  • The server 13 is configured to carry control signaling for controlling users' calls, such as calling, answering, and rejecting, and to forward call voice information between the first terminal 11 and the second terminal 12. In this way, the first terminal 11 and the second terminal 12 can determine the call event between the two users in the following embodiments and extract the feature parameters of the call event.
  • The first terminal 11 and the second terminal 12 complete the call interaction between them through communication connections established with the server 13.
  • The communication connection is usually a TCP/IP (Transmission Control Protocol/Internet Protocol) connection.
  • The embodiments of the present invention provide a method for identifying a malicious call, which is applied to a computing device; the functions implemented by the method can be implemented by a processor in the computing device calling program code.
  • The program code can be stored in a computer storage medium.
  • Accordingly, the computing device includes at least a processor and a storage medium.
  • The computing device can be any electronic device capable of information processing, for example a terminal or a server, where the terminal can be a device with call capability such as a tablet or a mobile phone.
  • FIG. 2 is a schematic flowchart of a method for identifying a malicious call according to an embodiment of the present invention. As shown in FIG. 2, the method for identifying a malicious call includes:
  • Step S101: Acquire a feature parameter of the first call event.
  • The first call event is a call event between the first user and the second user.
  • The feature parameter includes a parameter for describing a call voice feature.
  • The parameter describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of first keywords in that text. Because the purpose of a malicious user's call is generally fraud or promotion, the tone and intonation are usually mild and the wording is often very similar from call to call. It is therefore possible to analyze the call voice, obtain its feature parameters, and identify malicious calls by the parameters describing the call voice feature.
  • The parameter for describing a call voice feature is a first feature parameter.
  • The feature parameter further includes a second feature parameter for describing a call behavior feature of the first user.
  • In a first implementation manner, the first call event is determined; correspondingly, acquiring the feature parameter of the first call event includes extracting the feature parameter of the first call event.
  • The computing device can be implemented as the first terminal 11, the second terminal 12, or the server 13. When the first terminal 11 and the second terminal 12 make a call through the server 13, the first terminal 11, the second terminal 12, or the server 13 can determine the first call event between the first user and the second user and extract the feature parameters of the first call event.
  • In a second implementation manner, the computing device is implemented as the first terminal, and acquiring the feature parameter of the first call event includes: the first terminal receiving the feature parameter of the first call event sent by the server, where the first terminal corresponds to the first user.
  • The computing device may also be the second terminal. If the computing device is the first terminal or the second terminal, then to reduce the load on the computing device, the feature parameters of the first call event may be extracted on the server 13 side and then sent to the computing device.
  • Step S102: Identify the first call event according to the feature parameter of the first call event and the currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter.
  • The feature parameter of the first call event is the input of the recognition model, and the recognition result is the output of the recognition model.
  • The recognition model may be a model of any of various classification algorithms, including Logistic Regression (LR), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), and so on.
  • Step S103: Acquire the recognition result of the first call event identified by the recognition model.
  • Step S104: Output the recognition result of the first call event.
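  • Steps S101 to S104 can be sketched as follows, assuming for illustration that the preset recognition model is a logistic regression (one of the algorithm families named above). The weights, bias, threshold, and the choice of two features are hypothetical values, not parameters disclosed in the patent.

```python
import math


def recognize_call(features, weights, bias, threshold=0.5):
    """Steps S102-S103: feed feature parameters to a preset model and read its result."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    probability = 1.0 / (1.0 + math.exp(-score))  # logistic function
    return {"malicious": probability >= threshold, "probability": probability}


# Hypothetical preset model over two feature parameters:
# [number of first keywords, number of calls to unfamiliar numbers].
PRESET_WEIGHTS = [1.0, 2.0]
PRESET_BIAS = -3.0

# Step S101 is assumed to have produced this feature vector already.
result = recognize_call([3, 2], PRESET_WEIGHTS, PRESET_BIAS)
print(result["malicious"])  # Step S104: output the recognition result
```

  • The feature vector is the only input and the recognition result is the only output, which mirrors the input/output relationship stated for the recognition model.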
  • The first terminal corresponds to the first user.
  • The second terminal corresponds to the second user.
  • Outputting the recognition result of the first call event may include: displaying the recognition result of the first call event on the display interface of the computing device.
  • Alternatively, outputting the recognition result of the first call event may include: the server sending the recognition result of the first call event to the first terminal and the second terminal through a communication device (external communication interface).
  • Because the parameter describing the call voice feature is used as the identification criterion, and because a malicious user's tone and wording during malicious calls such as promotion and fraud do not change arbitrarily, malicious call events can be identified accurately, and the recognition result is output to alert users to fraud, which can greatly reduce users' economic loss.
  • An embodiment of the present invention provides a recognition model built on machine learning technology.
  • Machine learning draws on theories such as probability theory, statistics, and neural propagation so that a computer can simulate human learning behavior, acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its own performance.
  • In the initial stage of forming the recognition model, it is necessary to manually select as many malicious call events and normal call events as possible as positive and negative samples for training the machine learning model.
  • Identification of malicious calls based on a machine learning model is very complex, so a malicious user cannot probe and defeat it simply by changing the calling number. The model itself can also keep evolving: even if malicious users change their call patterns, simply retraining the model allows it to recognize the new malicious call patterns, making it difficult for malicious users to bypass the recognition strategy.
  • An embodiment of the present invention provides a method for establishing a recognition model, which is applied to a computing device; the functions implemented by the method may be implemented by a processor in the computing device calling program code. The program code can be stored in a computer storage medium; accordingly, the computing device includes at least a processor and a storage medium.
  • The method for establishing a recognition model includes:
  • Step S201: Determine a sample type of the sample.
  • The sample type includes a positive sample and a negative sample, the positive sample being a sample belonging to a malicious call and the negative sample being a sample not belonging to a malicious call.
  • The sample type can be determined by manual follow-up calls. For example, statistics are collected, and if the number of calls a certain user makes to unfamiliar numbers within a preset time period exceeds a certain threshold, the peer users of that user are called back manually to confirm whether the call event between the two users was a malicious call. If it was a malicious call, the call event is determined to be a positive sample; if it was not, the call event is determined to be a negative sample.
  • Determining positive and negative samples purely by manual means limits the sample size and is costly. Therefore, the embodiments of the present invention can also extract positive and negative samples automatically by program.
  • Positive samples can be determined by combining a rule-based determination method with a statistics-based determination method.
  • The rule-based method is used to coarsely screen large-scale call events as samples: certain rules are preset to roughly filter the samples. The statistics-based method is then used for further screening, for example by selecting users whose number of times marked as a malicious caller and whose number of calls to unfamiliar numbers exceed certain thresholds (the thresholds are obtained statistically, which is why this screening method is called statistics-based). Cross-filtering is then used to clean the samples, finally yielding positive samples and negative samples in which normal calls and malicious calls are present in a certain proportion.
  • This ratio is the configuration ratio, and the positive and negative samples obtained in this embodiment are to comply with the configuration ratio.
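  • The two screening stages can be sketched roughly as below. This is only an illustration under stated assumptions: the rule callable, the dictionary keys `times_marked_malicious` and `stranger_call_count`, and the decision to treat every remaining candidate as a negative sample are all hypothetical, and the cross-filtering and configuration-ratio steps are not modelled.

```python
def screen_samples(call_events, rule, marked_threshold, stranger_threshold):
    """Return (positives, negatives) after rule-based and statistics-based screening."""
    # Rule-based coarse screening: keep only events that pass the preset rule.
    candidates = [event for event in call_events if rule(event)]
    positives, negatives = [], []
    for event in candidates:
        # Statistics-based screening: both counts must exceed their thresholds.
        exceeds = (event["times_marked_malicious"] > marked_threshold
                   and event["stranger_call_count"] > stranger_threshold)
        (positives if exceeds else negatives).append(event)
    return positives, negatives


events = [
    {"duration": 30, "times_marked_malicious": 5, "stranger_call_count": 40},
    {"duration": 0, "times_marked_malicious": 5, "stranger_call_count": 40},
    {"duration": 60, "times_marked_malicious": 0, "stranger_call_count": 2},
]
# Hypothetical rule: ignore calls that were never connected.
positives, negatives = screen_samples(events, lambda e: e["duration"] > 0, 3, 20)
print(len(positives), len(negatives))
```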
  • Step S202: Acquire feature parameters of the sample.
  • The feature parameters include parameters used to describe call voice features.
  • Acquiring the feature parameters of the sample includes: acquiring the call voice information of the sample, and extracting the feature parameters from the call voice information of the sample, where the feature parameters include at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of first keywords in that text.
  • Acquiring the waveform feature parameters of the call voice includes: extracting a waveform of the call voice from the call voice information of the sample, the waveform being a time domain waveform or a frequency domain waveform; and determining waveform feature parameters of the waveform, which include at least one of a peak amplitude value, a trough amplitude value, a waveform amplitude average, a peak position, and a trough position.
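  • A minimal sketch of computing these five waveform feature parameters from a sequence of amplitude samples follows. It assumes the waveform has already been extracted as a list of numbers, takes the "waveform amplitude average" to be the plain arithmetic mean, and reports positions as sample indices; the function and key names are hypothetical.

```python
def waveform_features(samples):
    """Compute peak/trough amplitude, mean amplitude, and peak/trough positions."""
    if not samples:
        raise ValueError("waveform must contain at least one sample")
    peak = max(samples)
    trough = min(samples)
    return {
        "peak_amplitude": peak,
        "trough_amplitude": trough,
        "mean_amplitude": sum(samples) / len(samples),
        "peak_position": samples.index(peak),      # index of the first peak
        "trough_position": samples.index(trough),  # index of the first trough
    }


features = waveform_features([0.0, 0.5, -0.25, 1.0, -1.0])
print(features["peak_position"], features["trough_position"])
```

  • The same function can be applied to a frequency domain waveform by passing magnitude values per frequency bin instead of time samples.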
  • Obtaining the number or probability of first keywords in the text corresponding to the call voice includes: performing speech recognition on the call voice information of the sample to obtain the text corresponding to the call voice; extracting text keywords from the text; and comparing the text keywords with the preset first keywords to determine the number or probability of first keywords among the text keywords.
  • Because malicious users generally place calls for fraud or sales, words frequently used in fraud and promotion can be collected as the first keywords, such as "money", "winning", "buy", "bank", "product", and so on.
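  • The keyword comparison can be sketched as follows, assuming speech recognition has already produced the text. Two simplifications are assumptions of this sketch rather than details from the patent: every word of the text is treated as a text keyword, and the probability is taken as the fraction of text keywords that are first keywords.

```python
import re

# First keywords taken from the examples listed in the text.
FIRST_KEYWORDS = {"money", "winning", "buy", "bank", "product"}


def keyword_stats(text, first_keywords=FIRST_KEYWORDS):
    """Return (number, probability) of first keywords among the words of the text."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = sum(1 for word in words if word in first_keywords)
    probability = hits / len(words) if words else 0.0
    return hits, probability


count, probability = keyword_stats("You are winning money, please buy our product")
print(count, probability)
```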
  • The parameter for describing a call voice feature may be denoted as a first feature parameter, and the feature parameters further include a second feature parameter for describing a call behavior feature.
  • The suspicious user among the two parties in the sample may first be determined. For example, the first call behavior of each of the two users within a first preset time period is collected, and the suspicious user among the two parties is determined according to that behavior. For instance, since malicious users usually call unfamiliar numbers frequently, the number of calls each party makes to unfamiliar numbers in one day can be counted, and the party with more such calls is the suspicious user.
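  • The one-day comparison in the example above can be sketched like this. The call-log layout (a list of caller/callee pairs for the period) and the use of each user's contact set to decide what counts as an unfamiliar number are assumptions made for illustration.

```python
def pick_suspicious_user(call_log, contacts, user_a, user_b):
    """Return the party with more calls to unfamiliar numbers, or None on a tie."""
    def stranger_calls(user):
        known = contacts.get(user, set())
        return sum(1 for caller, callee in call_log
                   if caller == user and callee not in known)

    count_a, count_b = stranger_calls(user_a), stranger_calls(user_b)
    if count_a == count_b:
        return None
    return user_a if count_a > count_b else user_b


# Hypothetical one-day call log of (caller, callee) pairs.
log = [("A", "x"), ("A", "y"), ("A", "B"), ("B", "A"), ("B", "m")]
contacts = {"A": {"x"}, "B": {"m", "A"}}
print(pick_suspicious_user(log, contacts, "A", "B"))
```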
  • The second feature parameter may be a parameter describing the call behavior feature of the non-suspicious user, including at least one of: the number of calls with users marked as malicious, the average call duration, the number of calls with unfamiliar users in a second preset time period, and the number of calls with overseas users.
  • The second feature parameter may also be a parameter describing the call behavior feature of the suspicious user, including at least one of: the number of calls with users marked as malicious, the average call duration, the number of calls with unfamiliar users in the second preset time period, the number of calls with overseas users, the number of times the user has been marked as a malicious user, and the like.
  • One training set for training the recognition model is as follows:
  • In the call behavior feature part of Table 1, "Number of calls marked as malicious users", "Average duration of calls marked as malicious users", "Number of calls with overseas users", "Number of calls with unfamiliar users", and "marked condition" are examples of the second feature parameter described in this embodiment. The value of each parameter is a statistical result over a preset time period, which may be the day before the start of the current call event.
  • In the voice feature part of Table 1, "time domain waveform parameter", "frequency domain waveform parameter", "number of first keywords in the text corresponding to the call voice", and the like are the first feature parameters of the call event described in this embodiment. The time domain waveform parameter may include several parameters as described above (such as peak amplitude value, trough amplitude value, waveform amplitude average, peak position, and trough position), which may form parameter vectors such as "Vector 1", "Vector 2", "Vector 3", and so on. The frequency domain waveform parameter may likewise include several parameters as described above, which may form parameter vectors such as "Vector 4", "Vector 5", "Vector 6", and so on.
  • The malicious call column in Table 1 indicates whether the call event is a malicious call: if it is "Yes", the sample is a positive sample, and if it is "No", the sample is a negative sample. As shown in Table 1, sample 1 is a positive sample, and sample 2 and sample 3 are negative samples.
  • Step S203: Obtain the training result output by the training model according to the feature parameters of the sample and a set training model, where the training model uses the feature parameters as classification parameters.
  • The training model may be a model of any of various classification algorithms, including logistic regression, support vector machines, gradient boosting decision trees, and the like.
  • Step S204: Determine whether the training result matches the sample type of the sample.
  • Step S205: If the training result does not match the sample type of the sample, adjust the model parameters of the training model until the training result matches the sample type of the sample, and output the training model whose training result matches the sample type of the sample as the preset recognition model.
  • There may be multiple training models, such as a time domain waveform training model, a frequency domain waveform training model, and a call behavior training model. The time domain waveform parameters of the sample may be used as the input of the time domain waveform training model, the frequency domain waveform parameters as the input of the frequency domain waveform training model, the call behavior features as the input of the call behavior training model, and so on, to obtain the training result of each training model. As long as the training results of the respective training models match the sample type of the sample, these training models can be output as the preset recognition model.
  • the input of the training model includes the above-mentioned feature parameters: the feature parameters of each sample are used as the input of the training model, and the training result is obtained from the training model.
  • if the training model obtains the correct sample type for each sample from its feature parameters (that is, after the feature parameter of a positive sample is input into the training model, the training result indicates a positive sample, and after the feature parameter of a negative sample is input, the training result indicates a negative sample), the training result satisfies the sample type of the sample.
  • if the training result for any sample does not satisfy the sample type of that sample (that is, the feature parameter of a positive sample is input and the training result indicates a negative sample, or the feature parameter of a negative sample is input and the training result indicates a positive sample), the model parameters of the training model are adjusted until the training results of all samples satisfy the sample types of the samples. The adjusted training model is then output as a preset recognition model.
  • the feature parameters of the sample include a first feature parameter for describing a call voice feature and a second feature parameter for describing a call behavior feature, and the training model includes a first sub-training model and a second sub-training model. In this case, the method of establishing the recognition model includes:
  • Step A1 Identify the sample according to the second feature parameter and the first sub-training model, where the first sub-training model uses the second feature parameter as a classification parameter; acquire a first training result of the sample output by the first sub-training model; when the first training result does not satisfy the sample type of the sample, adjust the model parameters of the first sub-training model until the first training result satisfies the sample type of the sample;
  • Step A2 Identify the sample according to the third feature parameter and the second sub-training model, where the second sub-training model uses the third feature parameter as a classification parameter, and the third feature parameter is the second feature parameter or the feature parameter; acquire a second training result output by the second sub-training model; when the second training result does not satisfy the sample type of the sample, adjust the model parameters of the second sub-training model until the second training result satisfies the sample type of the sample;
  • Step A3 Output the first sub-training model whose first training result satisfies the sample type of the sample as a preset first sub-recognition model, and output the second sub-training model whose second training result satisfies the sample type of the sample as a preset second sub-recognition model.
  • the first feature parameter describing the call voice feature is used to train the training model, and the model parameters of the training model are continuously adjusted according to the training result, so that the final training model achieves an optimal recall rate for sample identification, thereby improving the accuracy of identifying malicious calls.
  • a distinguishing feature of the recognition model adopted by the embodiment of the present invention is that the model can evolve by itself, automatically adjusting its model parameters as call voice or call behavior changes, thereby avoiding the frequent manual parameter adjustment that rule-based approaches require.
  • an embodiment of the present invention provides a method for identifying a malicious call, which is applied to a computing device. In this embodiment the computing device is implemented as a server, and the functions of the method may be implemented by a processor in the server invoking program code. The program code can be stored in a computer storage medium, so the server includes at least a processor and a storage medium.
  • FIG. 3A is a schematic flowchart of an implementation of a method for identifying a malicious call according to an embodiment of the present invention. As shown in FIG. 3A, the method includes:
  • Step S301 The server determines a first call event, and extracts a feature parameter of the first call event.
  • the first user establishes a communication connection with the second user through the server, thereby implementing a call between the first user and the second user. The server forwards the control signaling that controls the call (such as calling, answering, and rejecting) and forwards the call voice information between the first terminal 11 and the second terminal 12. Therefore, the server may determine a call event between the first user and the second user, as well as the call behavior information of the first user and the second user.
  • the feature parameters include a first feature parameter for describing a call voice feature and a second feature parameter for describing a call behavior feature.
  • the server 13 can forward the call voice information between the first terminal 11 and the second terminal 12, where the first terminal corresponds to the first user and the second terminal corresponds to the second user. Therefore, extracting the first feature parameter of the first call event may include: acquiring call voice information of the first call event; and extracting the first feature parameter from the call voice information of the first call event, where the first feature parameter includes at least one of a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords.
  • the server may extract a waveform of the call voice from the call voice information of the first call event, where the waveform includes a time domain waveform or a frequency domain waveform, and then extract waveform feature parameters of the waveform, the waveform feature parameters comprising at least one of a peak amplitude value, a valley amplitude value, a waveform amplitude average, a peak position, and a trough position.
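  • The waveform feature parameters listed above can be computed as in the following sketch. It assumes the waveform is already available as a plain sequence of amplitude samples, and it interprets "waveform amplitude average" as the mean of the absolute amplitudes; both are assumptions, since the patent does not define the representation. The function name `waveform_features` and the dictionary keys are hypothetical.

```python
from typing import Dict, Sequence


def waveform_features(samples: Sequence[float]) -> Dict[str, float]:
    """Return the five waveform feature parameters for one waveform."""
    if not samples:
        raise ValueError("waveform must contain at least one sample")
    # Index of the highest and lowest sample (first occurrence on ties).
    peak_position = max(range(len(samples)), key=lambda i: samples[i])
    trough_position = min(range(len(samples)), key=lambda i: samples[i])
    return {
        "peak_amplitude": samples[peak_position],
        "valley_amplitude": samples[trough_position],
        # Assumption: the average is taken over absolute amplitudes.
        "mean_amplitude": sum(abs(s) for s in samples) / len(samples),
        "peak_position": peak_position,
        "trough_position": trough_position,
    }


features = waveform_features([0.0, 0.5, -1.0, 0.25])
```

  • The same function could be applied to a frequency domain waveform by passing spectrum magnitudes instead of time domain samples.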
  • the server may also perform voice recognition on the call voice information of the first call event, obtain text corresponding to the call voice, extract text keywords from the text, compare the text keywords with preset first keywords, and determine the number or probability of the first keywords among the text keywords.
  • the purpose of a malicious user in making a call is generally fraud or sales promotion, so words often used in fraud and promotion, such as "money", "winning", "buy", "banking", and "products", can be collected as the first keywords.
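  • A minimal sketch of the keyword statistics follows. It assumes the speech recognition step has already produced English text, treats every word in the text as a text keyword, and interprets the "probability" of the first keywords as the fraction of words that are first keywords. These are illustrative assumptions, and the name `keyword_stats` is hypothetical.

```python
import re
from typing import Iterable, Tuple


def keyword_stats(text: str, first_keywords: Iterable[str]) -> Tuple[int, float]:
    """Return (number of first-keyword hits, fraction of words that are hits)."""
    words = re.findall(r"[a-z']+", text.lower())
    targets = {keyword.lower() for keyword in first_keywords}
    hits = sum(1 for word in words if word in targets)
    proportion = hits / len(words) if words else 0.0
    return hits, proportion


FIRST_KEYWORDS = ["money", "winning", "buy", "banking", "products"]
count, proportion = keyword_stats("You are winning money buy now", FIRST_KEYWORDS)
```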
  • the server may collect the first call behavior of the first user and the second user in the first preset time period, and determine, according to that first call behavior, whether the first user is a suspicious user. For example, since a malicious user usually calls unfamiliar numbers frequently, the server can count the number of calls that each party (the first user and the second user) has with unfamiliar numbers in one day; a user with a large number of calls with unfamiliar numbers is suspicious.
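  • The suspicious-user check in the example can be sketched as a simple count against a threshold. The patent does not say how "unfamiliar" is decided or how many calls count as "more"; the sketch assumes a number is unfamiliar when it is absent from the user's contact set, and the default threshold of 20 calls is an arbitrary placeholder. The name `is_suspicious` is hypothetical.

```python
from typing import Iterable, Set


def is_suspicious(called_numbers: Iterable[str], contacts: Set[str],
                  threshold: int = 20) -> bool:
    """Flag a user whose calls with unfamiliar numbers in the period exceed the threshold."""
    unfamiliar_calls = sum(1 for number in called_numbers if number not in contacts)
    return unfamiliar_calls > threshold


# One day of calls for a hypothetical user with a single known contact.
flagged = is_suspicious(["100", "200", "300"], contacts={"100"}, threshold=1)
```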
  • the second feature parameter may be a parameter describing a call behavior feature of a non-suspicious user. In that case, if the first user is not a suspicious user, the server extracts, from the call behavior information of the first user, the second feature parameter used to describe the call behavior feature of the first user, where the second feature parameter includes at least one of: the number of times marked as a malicious user, the average call duration, the number of calls with unfamiliar users, and the number of calls with overseas users in the second preset time period.
  • alternatively, the second feature parameter may be a parameter describing a call behavior feature of a suspicious user. In that case, if the first user is a suspicious user, the server extracts, from the call behavior information of the first user, the second feature parameter used to describe the call behavior feature of the first user, where the second feature parameter includes at least one of: the number of calls with unfamiliar users, the average call duration, and the number of calls with overseas users in the third preset time period.
  • Step S302 The server identifies the first call event according to the feature parameter of the first call event and the current preset recognition model, where the recognition model uses the feature parameter as a classification parameter.
  • the online model shown in FIG. 3A is the current preset recognition model; the current preset recognition model is established by the server using the method for establishing a recognition model described in the foregoing embodiment.
  • when the recognition model includes a first sub-recognition model and a second sub-recognition model, step S302 includes the following steps B1-B4:
  • Step B1 Identify the first call event according to the second feature parameter and the first sub-identification model, where the first sub-identification model uses the second feature parameter as a classification parameter.
  • Step B2 Acquire an initial recognition result of the first call event identified by the first sub-identification model.
  • Step B3 Acquire a first feature parameter of the first call event if the initial recognition result satisfies a first preset condition.
  • if the initial recognition result satisfies the first preset condition, the first call event may be a malicious event, and the subsequent steps are needed to further identify the first call event. If the initial recognition result does not satisfy the first preset condition, the first call event is not a malicious event, and the process ends.
  • Step B4 Identify the first call event according to the feature parameter of the first call event and the second sub-identification model, where the second sub-identification model uses the feature parameter as a classification parameter; or identify the first call event according to the first feature parameter of the first call event and the second sub-identification model, where the second sub-identification model uses the first feature parameter as a classification parameter.
  • correspondingly, acquiring the recognition result of the first call event identified by the recognition model includes: obtaining the recognition result of the first call event identified by the second sub-identification model.
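  • Steps B1 to B4 describe a two-stage cascade, which can be sketched as follows. The sketch assumes each sub-identification model is a callable returning a probability, that the first preset condition is "initial result at or above a threshold", and that voice features are fetched lazily through a callback so the costlier extraction is skipped for pre-screened calls. The names `identify_call`, `extract_voice_features`, and the threshold value are hypothetical.

```python
from typing import Callable, Sequence

Features = Sequence[float]
Model = Callable[[Features], float]  # returns an estimated probability of a malicious call


def identify_call(
    behavior_features: Features,
    extract_voice_features: Callable[[], Features],
    first_sub_model: Model,
    second_sub_model: Model,
    first_preset_threshold: float = 0.5,
) -> float:
    """Two-stage identification: behavior pre-screen, then voice-based decision."""
    # Steps B1 and B2: initial recognition from the call behavior features.
    initial_result = first_sub_model(behavior_features)
    if initial_result < first_preset_threshold:
        # First preset condition not satisfied: not a malicious event, process ends.
        return 0.0
    # Step B3: only now acquire the voice (first) feature parameter.
    voice_features = extract_voice_features()
    # Step B4: the second sub-model produces the final recognition result.
    return second_sub_model(voice_features)


extraction_calls = []

def demo_extractor() -> Features:
    extraction_calls.append(1)
    return [3.0]

demo_first = lambda f: 0.9 if f[0] > 10 else 0.1   # stand-in behavior model
demo_second = lambda f: 0.8                        # stand-in voice model
screened_out = identify_call([2.0], demo_extractor, demo_first, demo_second)
escalated = identify_call([25.0], demo_extractor, demo_first, demo_second)
```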
  • Step S303 The server acquires a recognition result of the first call event identified by the identification model, and determines a reminding instruction of the first call event according to the recognition result of the first call event.
  • when the recognition result of the first call event satisfies the second preset condition, the server determines that the reminding instruction of the first call event is a first reminding instruction, which indicates that the recognition result of the first call event is not output to the terminal; when the recognition result of the first call event satisfies the third preset condition, the server determines that the reminding instruction is a second reminding instruction, which is used to send a short message carrying the recognition result of the first call event to the first terminal; and when the recognition result of the first call event satisfies the fourth preset condition, the server determines that the reminding instruction is a third reminding instruction, which is used to initiate a call to the first terminal and, after the first terminal answers the call, notify the first terminal of the recognition result of the first call event.
  • Step S304 When the first user is not a suspicious user, the server outputs the recognition result of the first call event to the first terminal according to the reminding instruction of the first call event.
  • here, the first terminal corresponds to the first user, and the second terminal corresponds to the second user.
  • assume that the recognition result identified by the recognition model is the probability that the identified call event is a malicious call, the second preset condition is [a, b], the third preset condition is (b, c], and the fourth preset condition is (c, d], where a is 0, b is 10%, c is 50%, and d is 100%.
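  • The mapping from recognition result to reminding instruction under the assumed values of a, b, c, and d can be sketched as below. The return labels "none", "sms", and "call" are hypothetical names for the first, second, and third reminding instructions, and the sketch treats d itself as belonging to the last interval, which is an assumption.

```python
def reminder_instruction(probability: float, a: float = 0.0, b: float = 0.10,
                         c: float = 0.50, d: float = 1.0) -> str:
    """Map a malicious-call probability to a reminding instruction."""
    if not a <= probability <= d:
        raise ValueError("probability outside the range covered by the preset conditions")
    if probability <= b:
        return "none"   # second preset condition [a, b]: do not output the result
    if probability <= c:
        return "sms"    # third preset condition (b, c]: send a short message
    return "call"       # fourth preset condition above c: initiate a call


low = reminder_instruction(0.30)
high = reminder_instruction(0.60)
```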
  • if the recognition result of the first call event falls within [a, b], the server determines that the recognition result satisfies the second preset condition, which indicates that the first call event is a risk-free event, and the server may not remind the user. If the recognition result of the first call event is 30%, the server determines that the recognition result satisfies the third preset condition, which indicates that the first call event is a low-risk event, and the server may send a reminder short message to a non-suspicious user in the first call event, such as the first user; the content of the short message may be "Dear user, hello, the number XXXX you are in a call with may be a malicious caller, please stay on guard". If the recognition result of the first call event is 60%, the server determines that the recognition result satisfies the fourth preset condition, which indicates that the first call event is a high-risk event, and the server may initiate a call to a non-suspicious user in the first call event, such
  • the server may further increase by 1 the number of times the second user is marked as a malicious user when the recognition result of the first call event satisfies the second preset condition or the third preset condition, so that the server may send the number of times the second user has been marked as a malicious user to other users to alert them.
  • for example, the second user is a salesperson who has had many calls with unfamiliar numbers on this day. The second user uses the second terminal to dial the first terminal and promote a product to the first user. The tone of the call is very mild, and the second user often says things like "our product XXX", "our product is very good", "the original price is XXX", "buy now and we can give you a discount of XXX", and "you will not regret buying our products". The server may determine the call event between the first user and the second user and identify the call event according to the foregoing method, and the final recognition result is that the call event is low risk, so the server sends a reminder short message to the first user. The first user can then carefully consider whether to continue communicating with the second user, make a purchase, or disclose identity information to the second user; this prevents the first user from being defrauded.
  • Step S305 The server sends the feature parameter of the first call event to an offline model establishing module in the server, where the offline model establishing module uses a feature parameter of the first call event as a feature parameter of the sample.
  • the server may proceed to step S305.
  • the offline model establishing module may add a feature parameter of the first call event to a training set as a feature parameter of a sample.
  • Step S306 The server determines a sample type of the sample.
  • the sample type of the sample (that is, the first call event) can be determined manually, or automatically by the offline model establishing module in the server, to decide whether the sample is a positive sample or a negative sample.
  • Step S307 The server obtains a training result output by the training model according to the feature parameter of the sample and the set training model, where the training model uses the feature parameter as a classification parameter; and determines whether the training result meets the sample. a sample type; if the training result does not satisfy the sample type of the sample, adjusting a model parameter of the training model until the training result satisfies a sample type of the sample, and obtaining the training result that satisfies the sample A training model for the sample type.
  • Step S308 The server uses the training model whose training result satisfies the sample type of the sample as a preset recognition model.
  • the offline model shown in FIG. 3A is the training model whose training result satisfies the sample type of the sample. The offline model establishing module can thus continuously obtain the feature parameters of call events from the server in which it is located and use them as samples for machine learning and model training. The model parameters are automatically adjusted as call voice and call behavior change, so the model evolves automatically and avoids the frequent manual parameter adjustment that rule-based approaches require.
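  • The feedback loop of steps S305 to S308 can be sketched as a small class that accumulates labelled call events and retrains on demand. To keep the example short, a one-feature threshold model stands in for the training model: retraining places the threshold midway between the largest negative sample and the smallest positive sample. The class name `OfflineModelBuilder`, its methods, and the single-feature simplification are all assumptions for illustration, not the patent's implementation.

```python
from typing import List, Tuple


class OfflineModelBuilder:
    """Collects (feature, sample type) pairs and re-derives a decision threshold."""

    def __init__(self) -> None:
        self.training_set: List[Tuple[float, bool]] = []

    def add_sample(self, feature: float, is_malicious: bool) -> None:
        # Steps S305 and S306: store the call event's feature with its sample type.
        self.training_set.append((feature, is_malicious))

    def retrain(self) -> float:
        # Steps S307 and S308: adjust the model parameter (here, one threshold)
        # until every stored sample is classified according to its sample type.
        positives = [f for f, malicious in self.training_set if malicious]
        negatives = [f for f, malicious in self.training_set if not malicious]
        if not positives or not negatives:
            raise ValueError("need at least one positive and one negative sample")
        highest_negative, lowest_positive = max(negatives), min(positives)
        if highest_negative >= lowest_positive:
            raise ValueError("samples are not separable by a single threshold")
        return (highest_negative + lowest_positive) / 2


builder = OfflineModelBuilder()
for feature, label in [(1.0, False), (2.0, False), (8.0, True)]:
    builder.add_sample(feature, label)
first_threshold = builder.retrain()
builder.add_sample(4.0, True)          # a new call event arrives from the online side
updated_threshold = builder.retrain()  # the model shifts without manual tuning
```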
  • the currently preset recognition model in the server may also be established by a first device using the method for establishing a recognition model described in the foregoing embodiment and then sent to the server; that is, the offline model establishing module is disposed in the first device instead of the server, and the first device is another device capable of communicating with the server (which may be the first terminal or the second terminal). In this case, steps S305 to S308 can also be implemented in the first device: step S305 includes the server sending the feature parameter of the first call event to the first device, and the first device uses the feature parameter of the first call event as the feature parameter of the sample, performs steps S306 and S307, and outputs the training model whose training result satisfies the sample type of the sample to the server as the current preset recognition model.
  • for example, the first device is the first terminal 11. The first terminal 11 establishes a recognition model by using the method for establishing a recognition model described in the foregoing embodiment and then sends the recognition model to the server 13. After the server 13 performs feature extraction on the first call event and performs malicious call identification according to the recognition model, the server sends the recognition result of the first call event to the first terminal and/or the second terminal. In addition, after the feature extraction of the first call event, the server 13 sends the feature parameters of the first call event to the first terminal 11, and the first terminal 11 may use them as feature parameters of a sample for training to establish the current recognition model and then send the updated recognition model to the server.
  • alternatively, the first device is neither the first terminal nor the second terminal, but another device capable of communicating with the server 13. The first device 14 establishes the recognition model by using the method described in the foregoing embodiment and then sends it to the server 13. After the server 13 identifies the malicious call by using the scheme in this embodiment, the recognition result of the first call event is sent to the first terminal 11 and/or the second terminal 12. In addition, after the feature extraction of the first call event, the server 13 sends the feature parameters of the first call event to the first device 14, and the first device 14 may use them as feature parameters of a sample for training to establish the current recognition model.
  • the computing device may also be implemented as a first terminal, in which case step S304 is replaced by: the first terminal displays the recognition result of the first call event on the display interface of the first terminal. The computing device can also be implemented as a second terminal, and the implementation process is the same as that of the first terminal. For example, when the computing device is the first terminal 11, as shown in FIG. 3D, the server establishes the recognition model by using the method described in the foregoing embodiment and sends the recognition model to the first terminal 11; the first terminal 11 extracts the feature parameters of the first call event, performs malicious call identification according to the recognition model, and displays the recognition result of the first call event on its display interface. In addition, after performing feature extraction on the first call event, the first terminal 11 sends the feature parameters of the first call event to the server 13, and the server 13 may use them as feature parameters of a sample for training to establish the current recognition model.
  • the initial recognition is performed by using the call behavior feature, and only when the initial recognition result satisfies the first preset condition is the first feature parameter of the call event used for further identification. Pre-screening out the call events that do not satisfy the first preset condition speeds up recognition, while the final identification of a malicious call event is always made using the first feature parameter describing the voice feature, which ensures the accuracy of identifying malicious call events.
  • an embodiment of the present invention provides an apparatus for identifying a malicious call. Each unit included in the apparatus, and each module included in each unit, may be implemented by a processor in the apparatus, or of course by logic circuits. In implementation, the processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like.
  • the device includes a first acquiring unit 401, an identifying unit 402, a second obtaining unit 403, and a first output unit 404, where:
  • the first obtaining unit 401 is configured to acquire a feature parameter of the first call event, where the first call event is a call event between the first user and the second user, and the feature parameter includes a parameter describing a call voice feature, the parameter describing the call voice feature comprising at least one of a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords.
  • the identifying unit 402 is configured to identify the first call event according to the feature parameter of the first call event and the currently preset recognition model, and the recognition model uses the feature parameter as a classification parameter.
  • the second obtaining unit 403 is configured to acquire a recognition result of the first call event identified by the identification model.
  • the first output unit 404 is configured to output a recognition result of the first call event.
  • the first obtaining unit 401 includes an acquiring module and an extracting module, where the acquiring module is configured to acquire call voice information of the first call event, and the extracting module is configured to extract the feature parameters from the call voice information of the first call event.
  • the extracting module is configured to extract a waveform of the call voice from the call voice information of the first call event, where the waveform includes a time domain waveform or a frequency domain waveform, and to extract waveform feature parameters of the waveform, the waveform feature parameters including at least one of a peak amplitude value, a valley amplitude value, a waveform amplitude average, a peak position, and a trough position.
  • the extracting module is configured to perform voice recognition on the call voice information of the first call event, obtain text corresponding to the call voice, extract text keywords from the text, compare the text keywords with preset first keywords, and determine the number or probability of the first keywords among the text keywords.
  • the parameter for describing a call voice feature is a first feature parameter, and the feature parameter further includes a second feature parameter for describing a call behavior feature of the first user.
  • the device further includes a collecting unit and a third determining unit, where the collecting unit is configured to collect the first call behavior of the first user and the second user in a first preset time period, and the third determining unit is configured to determine, according to the first call behavior of the first user and the second user in the first preset time period, whether the first user is a suspicious user. Correspondingly, if the first user is not a suspicious user, the second feature parameter includes at least one of: the number of times marked as a malicious user, the average call duration, the number of calls with unfamiliar users, and the number of calls with overseas users in the second preset time period; if the first user is a suspicious user, the second feature parameter includes at least one of: the number of calls with unfamiliar users, the average call duration, and the number of calls with overseas users in the third preset time period.
  • the recognition model includes a first sub-recognition model and a second sub-recognition model, and the identifying unit includes a first identification module and a second identification module. The first identification module is configured to identify the first call event according to the second feature parameter and the first sub-recognition model, and obtain an initial recognition result of the first call event identified by the first sub-recognition model, where the first sub-recognition model uses the second feature parameter as a classification parameter. The second identification module is configured to, when the initial recognition result satisfies a first preset condition, identify the first call event according to the feature parameter of the first call event and the second sub-recognition model, where the second sub-recognition model uses the feature parameter as a classification parameter; or identify the first call event according to the first feature parameter of the first call event and the second sub-recognition model, where the second sub-recognition model uses the first feature parameter as a classification parameter; correspondingly, the first Obtaining module 403, the second
  • the device further includes a first determining unit, where the first determining unit is configured to determine a reminding instruction of the first call event according to the recognition result of the first call event; the first output unit is further configured to output the recognition result of the first call event to a terminal according to the reminding instruction of the first call event, where the terminal includes a first terminal corresponding to the first user and a second terminal corresponding to the second user.
  • the first determining unit is configured to: when the recognition result of the first call event satisfies a second preset condition, determine that the reminding instruction of the first call event is a first reminding instruction, where the first reminding instruction is used to indicate that the recognition result of the first call event is not output to the terminal; when the recognition result of the first call event satisfies a third preset condition, determine that the reminding instruction of the first call event is a second reminding instruction, where the second reminding instruction is used to send a short message carrying the recognition result of the first call event to the terminal; and when the recognition result of the first call event satisfies a fourth preset condition, determine that the reminding instruction of the first call event is a third reminding instruction, where the third reminding instruction is used to initiate a call to the terminal and, after the terminal answers the call, notify the first terminal of the recognition result of the first call event. The terminal includes a first terminal corresponding to the first user and a second terminal corresponding to the second user; correspondingly, the first output unit 404 is configured to output
  • the first output unit 404 is further configured to display the recognition result of the first call event on a display interface of the first terminal, where the first terminal corresponds to the first user.
  • the apparatus further includes a third output unit, wherein the third output unit is configured to transmit the characteristic parameter of the first call event to the first device.
  • an embodiment of the present invention provides an apparatus for establishing a malicious model. Each unit included in the apparatus, and each module included in each unit, can be implemented by a processor in the apparatus, or of course by logic circuits. In implementation, the processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like.
  • FIG. 5 is a schematic structural diagram of a device for establishing a malicious model according to an embodiment of the present invention. As shown in FIG. 5, the device includes: a second determining unit 501, a third obtaining unit 502, a training unit 503, a determining unit 504, an adjusting unit 505, and a second output unit 506, wherein:
  • the second determining unit 501 is configured to determine a sample type of the sample, where the sample type includes a positive sample and a negative sample, the positive sample is a sample belonging to a malicious call, and the negative sample is a sample not belonging to a malicious call.
  • the third obtaining unit 502 is configured to acquire a feature parameter of the sample, where the feature parameter includes a parameter describing a call voice feature, and the parameter describing the call voice feature includes at least one of a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords.
  • the training unit 503 is configured to obtain a training result output by the training model according to a characteristic parameter of the sample and a set training model, where the training model uses the feature parameter as a classification parameter.
  • the determining unit 504 is configured to determine whether the training result meets the sample type of the sample.
  • the adjusting unit 505 is configured to adjust a model parameter of the training model until the training result satisfies the sample type of the sample when the training result does not satisfy the sample type of the sample.
  • the second output unit 506 is configured to output, as a preset recognition model, a training model in which the training result satisfies the sample type of the sample.
  • the first acquiring unit is further configured to receive a feature parameter of the first call event, and use a feature parameter of the first call event as a feature parameter of the sample.
  • FIG. 6 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • the device for identifying a malicious phone includes a first processor 601 and a first external communication interface 602, wherein:
  • the first processor 601 is configured to acquire a feature parameter of the first call event, where the first call event is a call event between the first user and the second user, and the feature parameter includes a parameter for describing a call voice feature.
  • the parameter describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, and the number and probability of first keywords in the text corresponding to the call voice; the first processor 601 is further configured to identify the first call event according to the feature parameter of the first call event and a currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter; to acquire a recognition result of the first call event identified by the recognition model; and to output the recognition result of the first call event through the first external communication interface 602.
  • the device for identifying a malicious phone may also be implemented as a first terminal or a second terminal.
  • the device for identifying a malicious phone includes a first processor and a display screen, wherein: the first processor is configured to acquire a feature parameter of the first call event, where the first call event is a call event between the first user and the second user, and the feature parameter includes a parameter for describing a call voice feature; to identify the first call event according to the feature parameter of the first call event and a currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter; to acquire a recognition result of the first call event identified by the recognition model; and to display the recognition result of the first call event through the display screen.
  • the display screen is configured to display a recognition result of the first call event.
  • an embodiment of the present invention provides a device for establishing a malicious model, where the device may be implemented as a server, a first terminal, or a second terminal; FIG. 7 is a schematic structural diagram of the hardware of a device for establishing a malicious model according to an embodiment of the present invention.
  • a second processor 701 and a second external communication interface 702 are included, wherein:
  • the second processor 701 is configured to determine a sample type of the sample, where the sample type includes a positive sample and a negative sample, the positive sample is a sample belonging to a malicious phone, and the negative sample is not belonging to a malicious phone.
  • the second external communication interface 702 is configured to output, as the preset recognition model, the training model whose training result satisfies the sample type of the sample.
  • if the foregoing method for identifying a malicious phone and establishing a recognition model is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • the technical solution of the embodiments of the present invention may in essence be embodied in the form of a software product that is stored in a storage medium and includes a plurality of instructions.
  • the instructions cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a mobile hard disk, a read only memory (ROM), a magnetic disk, or an optical disk.
  • the embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores computer-executable instructions, and the computer-executable instructions are used to execute the method for identifying a malicious phone and establishing a recognition model in the embodiment of the present invention.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a division by logical function.
  • other division manners are possible, for example: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated into one unit;
  • the unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the foregoing program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the foregoing method embodiments; the foregoing storage medium includes various media that can store program code, such as a removable storage device, a read only memory (ROM), a magnetic disk, or an optical disk.
  • the above-described integrated unit of the present invention may be stored in a computer readable storage medium if it is implemented in the form of a software function module and sold or used as a standalone product.
  • the technical solution of the embodiments of the present invention may in essence be embodied in the form of a software product that is stored in a storage medium and includes a plurality of instructions.
  • the instructions cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program codes, such as a mobile storage device, a ROM, a magnetic disk, or an optical disk.
  • the parameter describing the call voice feature is used as the identification criterion; because the tone and wording of a malicious user during malicious calls such as sales promotion and fraud do not change arbitrarily, malicious call events can be identified accurately, and the recognition result is output to warn the user against fraud, which can greatly reduce the user's economic loss; in addition, establishing the recognition model requires continuously training the training model and adjusting its model parameters according to the training results, so that the final training model achieves the best precision and recall on the samples, thereby improving the accuracy of identifying malicious calls.

Abstract

Disclosed are a method, apparatus and device for identifying a malicious call and establishing an identification model, and a storage medium. The method for identifying a malicious call comprises: acquiring a feature parameter of a first call event, wherein the first call event is a call event between a first user and a second user, and the feature parameter comprises a parameter for describing a call voice feature, with the parameter for describing a call voice feature comprising: at least one of a waveform feature parameter of a call voice, and the number and probability of first keywords in a text corresponding to the call voice; according to the feature parameter of the first call event and a current pre-set identification model, identifying the first call event, wherein the identification model uses the feature parameter as a classification parameter; acquiring an identification result of the first call event identified by the identification model; and outputting the identification result of the first call event.

Description

Method, Apparatus, and Device for Identifying a Malicious Call and Establishing a Recognition Model
This patent application claims priority to Chinese Patent Application No. 201610278825.9, filed on April 28, 2016 by Tencent Technology (Shenzhen) Co., Ltd. and entitled "Method, Apparatus, and Device for Identifying a Malicious Call and Establishing a Recognition Model", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of communications, and in particular to a method, an apparatus, a device, and a storage medium for identifying a malicious call and establishing a recognition model.
Background
The rapid development of communication technology has brought great convenience to people's work and daily life, but it has also brought many problems. More and more criminals use communication tools such as mobile phones and landlines for malicious purposes, for example telephone fraud that causes economic losses to others. Therefore, when a user talks with an unknown caller, it is necessary to identify whether the call is malicious so that the user does not suffer economic loss.
In the prior art, malicious calls are identified with a blacklist. The process includes: obtaining the phone number of the current call and determining whether that number is in a preset blacklist; if it is, the current call is determined to be a malicious call. However, with the emergence of number-hiding services and network number-changing technology, the accuracy of this method has declined.
Summary
In view of this, to solve at least one problem existing in the prior art, embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for identifying a malicious call and establishing a recognition model, which can greatly improve recognition accuracy and respond faster.
The technical solutions of the present invention are implemented as follows:
An embodiment of the present invention provides a method for identifying a malicious call, the method including:
acquiring a feature parameter of a first call event, where the first call event is a call event between a first user and a second user, the feature parameter includes a parameter for describing a call voice feature, and the parameter for describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, and the number and probability of first keywords in the text corresponding to the call voice;
identifying the first call event according to the feature parameter of the first call event and a currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter;
acquiring a recognition result of the first call event identified by the recognition model; and
outputting the recognition result of the first call event.
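The four steps above can be sketched as a small pipeline. This is only an illustrative sketch: the names `CallEvent` and `identify_call`, the feature key `keyword_count`, and the representation of the recognition model as a plain callable are assumptions made for the example and do not come from the application.

```python
from dataclasses import dataclass
from typing import Callable, Dict

FeatureParams = Dict[str, float]


@dataclass
class CallEvent:
    """Hypothetical container for a call event between two users."""
    first_user: str
    second_user: str
    features: FeatureParams  # feature parameters already extracted from the call


def identify_call(
    event: CallEvent,
    model: Callable[[FeatureParams], bool],
    output: Callable[[str], None],
) -> str:
    # Step 1: acquire the feature parameters of the call event.
    features = event.features
    # Steps 2 and 3: let the preset recognition model classify the event
    # and take its answer as the recognition result.
    result = "malicious" if model(features) else "normal"
    # Step 4: output the recognition result (display it, send it, and so on).
    output(result)
    return result
```

Passing the model and the output channel in as callables keeps the sketch independent of whether it runs on a terminal or on a server.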
An embodiment of the present invention provides a method for establishing a recognition model, the method including:
determining a sample type of a sample, where the sample type includes a positive sample and a negative sample, the positive sample is a sample belonging to a malicious call, and the negative sample is a sample not belonging to a malicious call;
acquiring a feature parameter of the sample, where the feature parameter includes a parameter for describing a call voice feature, and the parameter for describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, and the number and probability of first keywords in the text corresponding to the call voice;
obtaining a training result output by a set training model according to the feature parameter of the sample and the training model, where the training model uses the feature parameter as a classification parameter;
determining whether the training result matches the sample type of the sample; and
if the training result does not match the sample type of the sample, adjusting model parameters of the training model until the training result matches the sample type of the sample, and outputting the training model whose training result matches the sample type of the sample as the preset recognition model.
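The train, compare, and adjust loop described above can be illustrated with a perceptron, which is a stand-in chosen for brevity and not the training model of the application. The function names, the learning rate, and the epoch cap are assumptions; labels use 1 for a positive (malicious) sample and 0 for a negative sample.

```python
from typing import List, Sequence, Tuple

# (feature parameters, sample type): 1 = positive (malicious), 0 = negative
Sample = Tuple[Sequence[float], int]


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> int:
    """Training result for one sample: 1 if the linear score is positive."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(
    samples: List[Sample], learning_rate: float = 0.1, max_epochs: int = 1000
) -> Tuple[List[float], float]:
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, sample_type in samples:
            # Compare the training result with the sample type.
            error = sample_type - predict(weights, bias, features)
            if error != 0:
                # Mismatch: adjust the model parameters.
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # every training result matches its sample type
    return weights, bias
```

The loop is only guaranteed to stop early when the samples are linearly separable; the `max_epochs` cap is a safeguard for data where a perfect match cannot be reached.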
An embodiment of the present invention provides an apparatus for identifying a malicious call, the apparatus including a first acquiring unit, an identifying unit, a second acquiring unit, and a first output unit, where:
the first acquiring unit is configured to acquire a feature parameter of a first call event, where the first call event is a call event between a first user and a second user, the feature parameter includes a parameter for describing a call voice feature, and the parameter for describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, and the number and probability of first keywords in the text corresponding to the call voice;
the identifying unit is configured to identify the first call event according to the feature parameter of the first call event and a currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter;
the second acquiring unit is configured to acquire a recognition result of the first call event identified by the recognition model; and
the first output unit is configured to output the recognition result of the first call event.
An embodiment of the present invention provides an apparatus for establishing a malicious model, the apparatus including a second determining unit, a third acquiring unit, a training unit, a judging unit, an adjusting unit, and a second output unit, where:
the second determining unit is configured to determine a sample type of a sample, where the sample type includes a positive sample and a negative sample, the positive sample is a sample belonging to a malicious call, and the negative sample is a sample not belonging to a malicious call;
the third acquiring unit is configured to acquire a feature parameter of the sample, where the parameter for describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, and the number and probability of first keywords in the text corresponding to the call voice;
the training unit is configured to obtain a training result output by a set training model according to the feature parameter of the sample and the training model, where the training model uses the feature parameter as a classification parameter;
the judging unit is configured to determine whether the training result matches the sample type of the sample;
the adjusting unit is configured to, when the training result does not match the sample type of the sample, adjust model parameters of the training model until the training result matches the sample type of the sample; and
the second output unit is configured to output the training model whose training result matches the sample type of the sample as the preset recognition model.
An embodiment of the present invention provides a device for identifying a malicious call, the device including a first processor and a first external communication interface, or a first processor and a display screen, where:
the first processor is configured to: acquire a feature parameter of a first call event, where the first call event is a call event between a first user and a second user, the feature parameter includes a parameter for describing a call voice feature, and the parameter for describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, and the number and probability of first keywords in the text corresponding to the call voice; identify the first call event according to the feature parameter of the first call event and a currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter; acquire a recognition result of the first call event identified by the recognition model; and output the recognition result of the first call event through the first external communication interface, or display the recognition result of the first call event through the display screen.
An embodiment of the present invention provides a device for establishing a malicious model, the device including a second processor and a second external communication interface, where:
the second processor is configured to: determine a sample type of a sample, where the sample type includes a positive sample and a negative sample, the positive sample is a sample belonging to a malicious call, and the negative sample is a sample not belonging to a malicious call; acquire a feature parameter of the sample, where the feature parameter includes a parameter for describing a call voice feature, and the parameter for describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, and the number and probability of first keywords in the text corresponding to the call voice; obtain a training result output by a set training model according to the feature parameter of the sample and the training model, where the training model uses the feature parameter as a classification parameter; determine whether the training result matches the sample type of the sample; and, if the training result does not match the sample type of the sample, adjust model parameters of the training model until the training result matches the sample type of the sample, and output, through the second external communication interface, the training model whose training result matches the sample type of the sample as the preset recognition model.
An embodiment of the present invention provides a computer storage medium storing computer-executable instructions, where the computer-executable instructions are configured to perform the method for identifying a malicious call and establishing a recognition model provided by the embodiments of the present invention.
Embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for identifying a malicious call and establishing a recognition model. The method for identifying a malicious call includes: acquiring a feature parameter of a first call event, where the first call event is a call event between a first user and a second user, and the feature parameter includes a parameter for describing a call voice feature; identifying the first call event according to the feature parameter of the first call event and a currently preset recognition model, where the recognition model uses the feature parameter as a classification parameter; acquiring a recognition result of the first call event identified by the recognition model; and outputting the recognition result of the first call event. Because the parameter describing the call voice feature is used as the identification criterion, and because the tone and wording of a malicious user during malicious calls such as sales promotion and fraud do not change arbitrarily, malicious call events can be identified accurately, and the recognition result is output to warn the user against fraud, which can greatly reduce the user's economic loss. In addition, establishing the recognition model requires continuously training the training model and adjusting its model parameters according to the training results, so that the final training model achieves the best precision and recall on the samples, thereby improving the accuracy of identifying malicious calls.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an implementation environment according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a method for identifying a malicious call according to an embodiment of the present invention;
FIG. 3A is a schematic flowchart of a first implementation of the method for identifying a malicious call according to an embodiment of the present invention;
FIG. 3B is a schematic flowchart of a second implementation of the method for identifying a malicious call according to an embodiment of the present invention;
FIG. 3C is a schematic flowchart of a third implementation of the method for identifying a malicious call according to an embodiment of the present invention;
FIG. 3D is a schematic flowchart of a fourth implementation of the method for identifying a malicious call according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for identifying a malicious call according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an apparatus for establishing a malicious model according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of the hardware of a device for identifying a malicious call according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of the hardware of a device for establishing a malicious model according to an embodiment of the present invention.
Detailed Description
An implementation environment according to an embodiment of the present invention is described first. As shown in FIG. 1, the implementation environment includes a first terminal 11, a second terminal 12, and a server 13 disposed on the network side. The first terminal 11 and the second terminal 12 exchange information through the server disposed in the network, and one form of this information exchange may be a voice call. The embodiments of the present invention concern the voice call scenario between terminals.
The first terminal 11 or the second terminal 12 may be a mobile terminal, such as a mobile phone or a tablet computer, or a fixed terminal, such as a landline telephone. A client with a call function runs on both the first terminal 11 and the second terminal 12. The client can record the call behavior of its terminal over a period of time, such as the numbers of both parties and the call time, and can cache the call voice information of the current call. In this way, the first terminal 11 and the second terminal 12 can determine the call event between two users in the following embodiments and extract the feature parameters of that call event. The client may be an application client or a web client. In the embodiments of the present invention, the type of call includes, but is not limited to, either a voice call or a video call.
The server 13 is provided by an operator and may be a single server, a server cluster composed of multiple servers, or a cloud computing service center. The server 13 is configured to carry control signaling for controlling user calls, such as call, answer, and reject signaling, and to forward the call voice information between the first terminal 11 and the second terminal 12. In this way, the first terminal 11 and the second terminal 12 can determine the call event between two users in the following embodiments and extract the feature parameters of that call event. The first terminal 11 and the second terminal 12 complete the call interaction between them through communication connections established with the server 13. The communication connection is usually a TCP/IP (Transmission Control Protocol/Internet Protocol) connection.
The technical solutions of the present invention are described in further detail below with reference to the accompanying drawings and specific embodiments.
To solve the problems described in the Background, an embodiment of the present invention provides a method for identifying a malicious call, applied to a computing device. The functions implemented by the method can be realized by a processor in the computing device calling program code, and the program code can be stored in a computer storage medium; the computing device therefore includes at least a processor and a storage medium. The computing device may be any electronic device with information processing capability, for example a terminal or a server, where the terminal may be a computing device with call capability such as a tablet computer or a mobile phone.
FIG. 2 is a schematic flowchart of a method for identifying a malicious call according to an embodiment of the present invention. As shown in FIG. 2, the method includes:
Step S101: acquire a feature parameter of a first call event.
The first call event is a call event between a first user and a second user. The feature parameter includes a parameter for describing a call voice feature, and the parameter for describing the call voice feature includes at least one of: a waveform feature parameter of the call voice, and the number and probability of first keywords in the text corresponding to the call voice. Because the purpose of a malicious user's call is generally fraud or sales promotion, the tone and intonation are usually mild and the commonly used phrases are similar. The call voice can therefore be analyzed to obtain its feature parameters, and the parameters describing the call voice feature can be used to identify malicious calls.
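As a rough illustration of such feature parameters, the sketch below counts keyword hits in a transcript and computes two simple waveform statistics. The application does not define how the keyword "probability" or the waveform features are computed; here the probability is assumed to be the fraction of transcript words that are keywords, and mean absolute amplitude and zero-crossing rate are assumed stand-ins for waveform features. The transcript is assumed to be whitespace-tokenized, which would not hold for unsegmented Chinese text.

```python
from typing import Dict, Iterable, Sequence


def keyword_features(transcript: str, keywords: Iterable[str]) -> Dict[str, float]:
    """Number of keyword hits and their share of all words (assumed definition)."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    keyword_set = {k.lower() for k in keywords}
    count = sum(1 for w in words if w in keyword_set)
    probability = count / len(words) if words else 0.0
    return {"keyword_count": float(count), "keyword_probability": probability}


def waveform_features(samples: Sequence[float]) -> Dict[str, float]:
    """Two simple statistics of the audio samples (illustrative choices only)."""
    if not samples:
        return {"mean_abs_amplitude": 0.0, "zero_crossing_rate": 0.0}
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    # A zero crossing is a sign change between neighbouring samples.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    rate = crossings / (len(samples) - 1) if len(samples) > 1 else 0.0
    return {"mean_abs_amplitude": mean_abs, "zero_crossing_rate": rate}
```

The two dictionaries can be merged into one feature vector before being passed to a recognition model.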
In other embodiments of the present invention, the parameter for describing the call voice feature is a first feature parameter, and the feature parameter further includes a second feature parameter for describing a call behavior feature of the first user.
In other embodiments of the present invention, the feature parameter of the first call event can be acquired in the following two ways:
In the first way, the first call event is determined, and acquiring the feature parameter of the first call event correspondingly includes extracting the feature parameter of the first call event. The computing device may be implemented as the first terminal 11, the second terminal 12, or the server 13. When the first terminal 11 and the second terminal 12 conduct a call through the server 13, any of the first terminal 11, the second terminal 12, and the server 13 can determine the first call event between the first user and the second user and extract its feature parameter.
In the second way, the computing device is implemented as the first terminal, and acquiring the feature parameter of the first call event includes: the first terminal, which corresponds to the first user, receiving the feature parameter of the first call event sent by the server. The computing device may also be the second terminal. If the computing device is the first terminal or the second terminal, the feature parameter of the first call event can be extracted on the server 13 side to reduce the load on the computing device, and then sent to the computing device.
Step S102: Identify the first call event according to the feature parameters of the first call event and a currently preset recognition model, where the recognition model uses the feature parameters as classification parameters.
The feature parameters of the first call event are the input of the recognition model, and the recognition result is its output. The recognition model may be a model of any of various classification algorithms, including Logistic Regression (LR), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), and so on.
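To illustrate a recognition model that takes feature parameters as input and produces a recognition result as output, the following minimal sketch scores a feature vector with a logistic regression function. The weights, the bias, and the order of the features are hypothetical values chosen for illustration only; the source does not specify them.

```python
import math


def recognize(features, weights, bias):
    """Return the estimated probability that a call event is malicious."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical feature order (an assumption, not from the source):
# [number of first keywords, calls with unknown numbers in one day]
weights = [0.8, 0.1]
bias = -3.0

probability = recognize([4, 20], weights, bias)
quiet_probability = recognize([0, 0], weights, bias)
```

In practice the weights would come from training on labeled samples rather than being set by hand.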
Step S103: Acquire the recognition result of the first call event identified by the recognition model.
Step S104: Output the recognition result of the first call event.
The first terminal corresponds to the first user, and the second terminal corresponds to the second user. When the computing device is the first terminal or the second terminal, outputting the recognition result of the first call event may include displaying the recognition result on the display interface of the computing device. When the computing device is the server, outputting the recognition result may include the server sending the recognition result of the first call event to the first terminal and the second terminal through a communication apparatus (an external communication interface).
In this embodiment of the present invention, the parameters describing the call voice features are used as the identification criterion. Because the tone and wording of a malicious user during malicious calls such as sales promotion and fraud do not change arbitrarily, malicious call events can be identified accurately, and the recognition result is output to warn the user against fraud, which can greatly reduce the user's economic losses.
Based on the foregoing embodiments, an embodiment of the present invention forms a recognition model by introducing machine learning technology. Machine learning relies on theories such as probability theory, statistics, and neural propagation to enable a computer to simulate human learning behavior, so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. In the initial stage of forming the recognition model, as many normal call events and malicious call events as possible need to be selected manually as positive and negative samples for training the machine learning model. Because this embodiment identifies malicious calls based on a machine learning model, the identification logic is very complex, and a malicious user cannot probe and defeat it by simple means such as changing the calling number. In addition, because the model itself is capable of evolutionary learning, even if a malicious user changes the call pattern, the new malicious call pattern can be identified simply by retraining the model, so that it remains difficult for malicious users to bypass the identification strategy.
The application of machine learning technology to identifying malicious calls can be freely shared and disseminated: because the principle of machine learning identification is complex, can evolve by itself, and does not target any particular call pattern, the method of identifying malicious calls based on a machine learning model can be disclosed even to malicious users.
Based on the foregoing embodiments, an embodiment of the present invention provides a method for establishing a recognition model, applied to a computing device. The functions implemented by the method may be realized by a processor in the computing device invoking program code, and the program code may be stored in a computer storage medium; the computing device therefore includes at least a processor and a storage medium. The method for establishing a recognition model includes:
Step S201: Determine the sample type of a sample.
The sample types include positive samples and negative samples: a positive sample is a sample that belongs to a malicious call, and a negative sample is a sample that does not. The sample type may be determined by manual follow-up calls. For example, if statistics show that the number of calls a user makes to unknown numbers within a preset time period exceeds a certain threshold, the peer users called by that user are contacted manually to confirm whether the call event between the two users is a malicious call. If it is, the call event is determined to be a positive sample; if it is not, the call event is determined to be a negative sample.
Determining positive and negative samples purely by hand limits the sample size and is costly, so this embodiment of the present invention may also use a program to extract positive and negative samples automatically. Positive samples may be determined by combining a rule-based method with a statistics-based method. The rule-based method is used for coarse screening of large-scale call events as samples: certain preset rules are applied to screen the samples roughly. The statistics-based method then screens further, for example by selecting users whose number of times marked as a malicious caller and number of calls with unknown numbers exceed certain thresholds (the thresholds are derived from statistics, hence the name). Cross-filtering is then used to clean the samples, finally yielding positive and negative samples. Normal calls and malicious calls occur in a certain proportion, referred to as the configuration ratio, and the positive and negative samples obtained in this embodiment should conform to that ratio.
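The statistics-based screening and the configuration ratio can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys, the thresholds, and the choice to enforce the ratio by trimming negative samples are all hypothetical, and the rule-based coarse screening and cross-filtering steps are not modeled.

```python
def select_samples(call_events, flag_threshold, unknown_call_threshold,
                   normal_per_malicious):
    """Split call events into positive (malicious) and negative (normal) samples.

    Each event is a dict with the hypothetical keys "flagged_count" and
    "unknown_calls", describing the caller over the statistics period.
    """
    positives, negatives = [], []
    for event in call_events:
        if (event["flagged_count"] > flag_threshold
                and event["unknown_calls"] > unknown_call_threshold):
            positives.append(event)
        else:
            negatives.append(event)
    # Assumed way of meeting the configuration ratio: keep at most
    # normal_per_malicious negative samples for each positive sample.
    negatives = negatives[: len(positives) * normal_per_malicious]
    return positives, negatives


events = [
    {"flagged_count": 5, "unknown_calls": 40},
    {"flagged_count": 0, "unknown_calls": 2},
    {"flagged_count": 0, "unknown_calls": 1},
    {"flagged_count": 0, "unknown_calls": 3},
]
positives, negatives = select_samples(events, 3, 20, 2)
```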
Step S202: Acquire the feature parameters of the sample.
The feature parameters include parameters describing call voice features. Acquiring the feature parameters of the sample includes: acquiring call voice information of the sample; and extracting the feature parameters from the call voice information of the sample, where the feature parameters include at least one of: waveform feature parameters of the call voice, and the number and probability of first keywords in the text corresponding to the call voice.
For example, acquiring the waveform feature parameters of the call voice includes: extracting the waveform of the call voice from the call voice information of the sample, where the waveform includes a time-domain waveform or a frequency-domain waveform; and extracting waveform feature parameters of the waveform, which include at least one of a peak amplitude value, a valley amplitude value, an average waveform amplitude, a peak position, and a valley position.
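The waveform feature parameters listed above can be computed from a sequence of amplitude values along the following lines. The sketch assumes a single global peak and valley and takes the average amplitude as the mean of absolute values; the source does not define these details, so both choices are assumptions.

```python
def waveform_features(samples):
    """Compute simple waveform feature parameters from a non-empty amplitude sequence."""
    peak_position = max(range(len(samples)), key=lambda i: samples[i])
    valley_position = min(range(len(samples)), key=lambda i: samples[i])
    return {
        "peak_amplitude": samples[peak_position],
        "valley_amplitude": samples[valley_position],
        # Assumption: "average amplitude" means the mean absolute amplitude.
        "mean_amplitude": sum(abs(s) for s in samples) / len(samples),
        "peak_position": peak_position,
        "valley_position": valley_position,
    }


features = waveform_features([0.0, 0.5, -0.25, 1.0, -0.75, 0.25])
```

The resulting values can be collected into a parameter vector and passed to a model as the waveform part of the first feature parameter.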
For example, acquiring the number or probability of first keywords in the text corresponding to the call voice includes: performing speech recognition on the call voice information of the sample to obtain the text corresponding to the call voice; extracting text keywords from the text; and comparing the text keywords with preset first keywords to determine the number or probability of first keywords among the text keywords. Because a malicious user typically calls for the purpose of fraud or sales promotion, words frequently used in fraud and sales promotion, such as "money", "winning a prize", "buy", "bank", and "product", can be collected statistically as the first keywords.
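Once speech recognition has produced text keywords, the comparison against the preset first keywords can be sketched as below. The English keyword set is a stand-in for the examples in the text, and interpreting "probability" as the fraction of text keywords that are first keywords is an assumption; speech recognition and keyword extraction are outside the scope of this sketch.

```python
# Stand-in first keywords (an assumption based on the examples in the text).
FIRST_KEYWORDS = frozenset({"money", "prize", "buy", "bank", "product"})


def keyword_features(text_keywords, first_keywords=FIRST_KEYWORDS):
    """Return (count, probability) of first keywords among the text keywords."""
    count = sum(1 for word in text_keywords if word in first_keywords)
    probability = count / len(text_keywords) if text_keywords else 0.0
    return count, probability


count, probability = keyword_features(
    ["you", "won", "prize", "send", "money", "bank"]
)
```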
In other embodiments of the present invention, the parameter describing the call voice features may be denoted as the first feature parameter, and the feature parameters further include a second feature parameter describing call behavior features. In other embodiments, the suspicious user among the two parties to the call in the sample may be determined first: the first call behavior of the two users within a first preset time period is collected, and the suspicious user among the two parties is determined according to that first call behavior. For example, because malicious users usually call unknown numbers frequently, the number of calls each party makes with unknown numbers within one day can be counted, and the user with a large number of such calls is treated as the suspicious user.
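A minimal sketch of picking the suspicious user from the two call parties is shown below. The explicit threshold is an assumption added for illustration; the text only says that the party with many calls to unknown numbers is treated as suspicious.

```python
def find_suspicious_user(unknown_call_counts, threshold):
    """Return the party with the most calls to unknown numbers, or None.

    unknown_call_counts maps each call party to its number of calls with
    unknown numbers in the first preset time period (for example, one day).
    The threshold is a hypothetical cutoff, not a value from the source.
    """
    user, count = max(unknown_call_counts.items(), key=lambda item: item[1])
    return user if count > threshold else None


suspect = find_suspicious_user({"user_a": 42, "user_b": 3}, threshold=20)
nobody = find_suspicious_user({"user_a": 2, "user_b": 3}, threshold=20)
```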
The second feature parameter may be a parameter describing the call behavior features of the non-suspicious user, including at least one of the following within a second preset time period: the number and average duration of calls with users marked as malicious, the number of calls with unknown users, and the number of calls with overseas users. The second feature parameter may also be a parameter describing the call behavior features of the suspicious user, including at least one of the following within the second preset time period: the number and average duration of calls with users marked as malicious, the number and average duration of calls with unknown users, the number of calls with overseas users, the number of times the user has been marked as a malicious user, and so on.
For example, Table 1 shows one training set used for training the recognition model:
Figure PCTCN2017074169-appb-000001
Table 1
The items under the call behavior feature heading in Table 1, namely "number of calls with users marked as malicious", "average duration of calls with users marked as malicious", "number of calls with overseas users", "number of calls with unknown users", and "marked status", are examples of the second feature parameter described in this embodiment. The value of each parameter is a statistical result over a preset time period, which may be the day before the current call event begins. The items under the voice feature heading in Table 1, such as "time-domain waveform parameters", "frequency-domain waveform parameters", and "number of first keywords in the text corresponding to the call voice", are the first feature parameters of the current call event described in this embodiment. As described above, the time-domain waveform parameters may include many kinds of parameters (such as peak amplitude value, valley amplitude value, average waveform amplitude, peak position, and valley position), which can form parameter vectors such as "Vector 1", "Vector 2", and "Vector 3"; the frequency-domain waveform parameters may likewise include many kinds of parameters, which can form parameter vectors such as "Vector 4", "Vector 5", and "Vector 6".
The "malicious call or not" item in Table 1 indicates whether the current call event is a malicious call: "Yes" indicates that the sample is a positive sample, and "No" indicates that it is a negative sample. As shown in Table 1, Sample 1 is a positive sample, and Sample 2 and Sample 3 are negative samples.
Step S203: Obtain the training result output by a configured training model according to the feature parameters of the sample, where the training model uses the feature parameters as classification parameters.
The training model may be a model of any of various classification algorithms, including logistic regression, support vector machine, gradient boosting decision tree, and so on.
Step S204: Determine whether the training result matches the sample type of the sample.
Step S205: If the training result does not match the sample type of the sample, adjust the model parameters of the training model until the training result matches the sample type, and output the training model whose training result matches the sample type as the preset recognition model.
There may be multiple training models, such as a time-domain waveform training model, a frequency-domain waveform training model, and a call behavior training model. The time-domain waveform parameters of the sample may be used as the input of the time-domain waveform training model, the frequency-domain waveform parameters as the input of the frequency-domain waveform training model, the call behavior features as the input of the call behavior training model, and so on, to obtain the training result of each training model. As long as the training results of all the training models match the sample type of the sample, these training models may be output as the preset recognition model.
In this embodiment of the present invention, whichever training model is used, its input at the start of training includes the feature parameters described above. By using the feature parameters of each sample as the input of the training model, the corresponding training results can be obtained from the training model.
If every training result that the training model derives from the feature parameters of the samples matches the sample type of the corresponding sample (that is, when the feature parameters of a positive sample are input, the training result indicates a positive sample, and when the feature parameters of a negative sample are input, the training result indicates a negative sample), the training model whose training results match the sample types is output as the preset recognition model.
If any training result that the training model derives from the feature parameters of the samples does not match the sample type of the corresponding sample (that is, the feature parameters of a positive sample yield a training result indicating a negative sample, or the feature parameters of a negative sample yield a training result indicating a positive sample), the model parameters of the training model are adjusted until the training results of all samples match their sample types. The adjusted training model, whose training results match the sample types, is then output as the preset recognition model.
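The loop of checking each training result against the sample type and adjusting the model parameters until they all match can be sketched with a perceptron-style update. This update rule is a stand-in chosen for brevity, not the method prescribed by the source, and the sample values, learning rate, and epoch limit are hypothetical.

```python
def predict(weights, bias, features):
    """Return 1 (positive, malicious) or 0 (negative, normal)."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(samples, learning_rate=0.1, max_epochs=1000):
    """Adjust parameters until every training result matches its sample type."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mismatches = 0
        for features, label in samples:
            error = label - predict(weights, bias, features)
            if error != 0:
                mismatches += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mismatches == 0:
            return weights, bias  # output as the preset recognition model
    raise RuntimeError("training results still do not match all sample types")


# Hypothetical samples: ([first-keyword count, calls with unknown numbers], label)
samples = [([5.0, 30.0], 1), ([4.0, 25.0], 1), ([0.0, 2.0], 0), ([1.0, 1.0], 0)]
weights, bias = train(samples)
```

Requiring every sample to match only terminates when the samples are separable by the model, which is why the sketch bounds the number of epochs.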
In other embodiments of the present invention, the feature parameters of the sample include a first feature parameter describing call voice features and a second feature parameter describing call behavior features, and the training model includes a first sub-training model and a second sub-training model. In this case, the method for establishing the recognition model is as follows:
Step A1: Identify the sample according to the second feature parameter and the first sub-training model, where the first sub-training model uses the second feature parameter as a classification parameter; acquire the first training result of the sample output by the first sub-training model; and when the first training result does not match the sample type of the sample, adjust the model parameters of the first sub-training model until the first training result matches the sample type.
Step A2: Identify the sample according to a third feature parameter and the second sub-training model, where the second sub-training model uses the third feature parameter as a classification parameter, and the third feature parameter is the second feature parameter or the feature parameters; acquire the second training result output by the second sub-training model; and when the second training result does not match the sample type of the sample, adjust the model parameters of the second sub-training model until the second training result matches the sample type.
Step A3: Output the first sub-training model whose first training result matches the sample type as the preset first sub-recognition model, and output the second sub-training model whose second training result matches the sample type as the preset second sub-recognition model.
In this embodiment of the present invention, the training model is trained using the first feature parameter describing call voice features, and the model parameters of the training model are adjusted continually according to the training results so that the precision and recall of the final training model on the samples are optimized, which improves the accuracy of identifying malicious calls. A notable characteristic of the recognition model used in this embodiment is that the model can evolve by itself, adjusting its model parameters automatically as call voice or call behavior changes, which avoids the frequent manual parameter adjustment required by rule-based approaches.
Based on the foregoing embodiments, an embodiment of the present invention provides a method for identifying a malicious call, applied to a computing device implemented as a server. The functions implemented by the method may be realized by a processor in the server invoking program code, and the program code may be stored in a computer storage medium; the server therefore includes at least a processor and a storage medium.
FIG. 3A is a schematic flowchart of a method for identifying a malicious call according to an embodiment of the present invention. As shown in FIG. 3A, the method includes:
Step S301: The server determines a first call event and extracts the feature parameters of the first call event.
The first user establishes a communication connection with the second user through the server, thereby realizing a call between the first user and the second user. The server carries the control signaling used to control user calls, such as call, answer, and reject signaling, and forwards the call voice information between the first terminal 11 and the second terminal 12. The server can therefore determine the call event between the first user and the second user as well as the call behavior information of the first user and the second user.
The feature parameters include a first feature parameter describing call voice features and a second feature parameter describing call behavior features.
The server 13 forwards the call voice information between the first terminal 11 and the second terminal 12, where the first terminal corresponds to the first user and the second terminal corresponds to the second user. Extracting the first feature parameter of the first call event may therefore include: acquiring the call voice information of the first call event; and extracting the first feature parameter from that call voice information, where the first feature parameter includes at least one of: waveform feature parameters of the call voice, and the number and probability of first keywords in the text corresponding to the call voice.
In other embodiments of the present invention, the server may extract the waveform of the call voice from the call voice information of the first call event, where the waveform includes a time-domain waveform or a frequency-domain waveform, and extract waveform feature parameters of the waveform, which include at least one of a peak amplitude value, a valley amplitude value, an average waveform amplitude, a peak position, and a valley position.
The server may also perform speech recognition on the call voice information of the first call event to obtain the text corresponding to the call voice, extract text keywords from the text, and compare the text keywords with preset first keywords to determine the number or probability of first keywords among the text keywords. For example, because a malicious user typically calls for the purpose of fraud or sales promotion, words frequently used in fraud and sales promotion, such as "money", "winning a prize", "buy", "bank", and "product", can be collected statistically as the first keywords.
The server may collect the first call behavior of the first user and the second user within a first preset time period and, according to that first call behavior, determine whether the first user is a suspicious user. For example, because malicious users usually call unknown numbers frequently, the number of calls each party (the first user and the second user) makes with unknown numbers within one day can be counted, and the user with a large number of such calls is treated as the suspicious user.
The second feature parameter may be a parameter describing the call behavior features of a non-suspicious user. Accordingly, if the first user is not a suspicious user, the server extracts, from the call behavior information of the first user, the second feature parameter describing the call behavior features of the first user, which includes at least one of the following within a second preset time period: the number and average duration of calls with users marked as malicious, the number of calls with unknown users, and the number of calls with overseas users. The second feature parameter may alternatively be a parameter describing the call behavior features of a suspicious user. If the first user is a suspicious user, the server extracts, from the call behavior information of the first user, the second feature parameter describing the call behavior features of the first user, which includes at least one of the following within a third preset time period: the number and average duration of calls with unknown users, and the number of calls with overseas users.
Step S302: The server identifies the first call event according to the feature parameters of the first call event and the currently preset recognition model, where the recognition model uses the feature parameters as classification parameters.
The online model shown in FIG. 3A is the currently preset recognition model, which is established by the server using the method for establishing a recognition model described in the foregoing embodiments.
When the recognition model includes a first sub-recognition model and a second sub-recognition model, step S302 includes the following steps B1 to B4:
Step B1: Identify the first call event according to the second feature parameter and the first sub-recognition model, where the first sub-recognition model uses the second feature parameter as a classification parameter.
Step B2: Acquire the initial recognition result of the first call event identified by the first sub-recognition model.
Step B3: If the initial recognition result satisfies a first preset condition, acquire the first feature parameter of the first call event.
If the initial recognition result satisfies the first preset condition, the first call event may be a malicious event, and the subsequent steps are needed to identify it further. If the initial recognition result does not satisfy the first preset condition, the first call event is not a malicious event, and the procedure ends.
Step B4: Identify the first call event according to the feature parameters of the first call event and the second sub-recognition model, where the second sub-recognition model uses the feature parameters as classification parameters; or identify the first call event according to the first feature parameter of the first call event and the second sub-recognition model, where the second sub-recognition model uses the first feature parameter as a classification parameter.
Correspondingly, acquiring the recognition result of the first call event identified by the recognition model includes acquiring the recognition result of the first call event identified by the second sub-recognition model.
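Steps B1 to B4 form a two-stage cascade in which the voice features are obtained only when the behavior-based stage raises suspicion. The sketch below illustrates that control flow; the two toy models, the first preset condition, and the return value of None for a call that is not treated as malicious are all hypothetical placeholders.

```python
def identify_call(behavior_features, get_voice_features,
                  behavior_model, voice_model, first_condition):
    """Run the two-stage identification of steps B1 to B4."""
    initial_result = behavior_model(behavior_features)   # steps B1 and B2
    if not first_condition(initial_result):               # step B3 check
        return None  # not treated as malicious; the procedure ends
    voice_features = get_voice_features()                  # step B3
    return voice_model(voice_features)                     # step B4


extraction_log = []


def get_voice_features():
    extraction_log.append("extracted")
    return [3, 0.5]  # hypothetical: [first-keyword count, keyword probability]


def behavior_model(features):
    return 0.9 if features[0] > 10 else 0.05  # toy first sub-recognition model


def voice_model(features):
    return 0.6 if features[0] >= 2 else 0.1   # toy second sub-recognition model


def first_condition(probability):
    return probability >= 0.5                 # assumed first preset condition


result = identify_call([25], get_voice_features, behavior_model,
                       voice_model, first_condition)
skipped = identify_call([1], get_voice_features, behavior_model,
                        voice_model, first_condition)
```

Deferring voice feature extraction in this way avoids speech processing for calls that the behavior-based stage already clears.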
Step S303: The server acquires the recognition result of the first call event identified by the recognition model and determines a reminder instruction for the first call event according to that recognition result.
In this embodiment of the present invention, when the recognition result of the first call event satisfies a second preset condition, the server determines that the reminder instruction for the first call event is a first reminder instruction, which indicates that the recognition result of the first call event is not to be output to the terminal. When the recognition result satisfies a third preset condition, the server determines that the reminder instruction is a second reminder instruction, which indicates that a short message carrying the recognition result of the first call event is to be sent to the first terminal. When the recognition result satisfies a fourth preset condition, the server determines that the reminder instruction is a third reminder instruction, which indicates that a call is to be initiated to the first terminal and that, after the first terminal answers, the first terminal is to be notified of the recognition result of the first call event.
步骤S304、在所述第一用户不是可疑用户时,服务器根据所述第一通话事件的提醒指令向第一终端输出所述第一通话事件的识别结果。Step S304: When the first user is not a suspicious user, the server outputs the recognition result of the first call event to the first terminal according to the reminding instruction of the first call event.
所述第一终端对应所述第一用户,所述第二终端对应所述第二用户。The first terminal corresponds to the first user, and the second terminal corresponds to the second user.
As an example, suppose the recognition result output by the recognition model is the probability that the call event being examined is a malicious call, the second preset condition is the interval [a, b], the third preset condition is (b, c], and the fourth preset condition is (c, d], with a = 0, b = 10%, c = 50%, and d = 100%. If the recognition result of the first call event is 5%, the server determines that the result satisfies the second preset condition, meaning the first call event is a no-risk event, and the server need not alert the user. If the recognition result is 30%, the server determines that the result satisfies the third preset condition, meaning the first call event is a low-risk event; the server may then send a reminder short message to a non-suspicious user in the first call event, such as the first user, with content such as "Dear user, the call you are having with number XXXXX may be a malicious call; please stay on guard." If the recognition result is 60%, the server determines that the result satisfies the fourth preset condition, meaning the first call event is a high-risk event; the server may then initiate a call to a non-suspicious user in the first call event, such as the first user, and, after the first terminal answers, automatically play the voice announcement "Dear user, the call you are having with number XXXXX may be a malicious call; please stay on guard." Of course, the server may also send reminders to both users at the same time.
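The threshold example above can be written as a small lookup. This is a minimal sketch using the example values a = 0, b = 10%, c = 50%, d = 100%; the enum `Reminder` and the function `reminder_for` are names invented for illustration.

```python
from enum import Enum


class Reminder(Enum):
    NONE = 1   # first reminder instruction: do not output the result
    SMS = 2    # second reminder instruction: send a short message
    CALL = 3   # third reminder instruction: call and announce the result


# Example thresholds from the text, expressed as fractions (assumed values).
A, B, C, D = 0.0, 0.10, 0.50, 1.0


def reminder_for(probability: float) -> Reminder:
    """Map a malicious-call probability to a reminder instruction."""
    if not A <= probability <= D:
        raise ValueError("probability must lie in [a, d]")
    if probability <= B:      # second preset condition: [a, b]
        return Reminder.NONE
    if probability <= C:      # third preset condition: (b, c]
        return Reminder.SMS
    return Reminder.CALL      # fourth preset condition: (c, d]
```

With these values, the three cases in the example (5%, 30%, 60%) map to no reminder, a short message, and a voice call respectively.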
When the recognition result of the first call event satisfies the second preset condition or the third preset condition, the server may also add 1 to the number of times the second user has been marked as a malicious user. Then, if the second user goes on to initiate malicious calls to other users, the server can send that count to those users to alert them.
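The per-user mark count could be kept in a simple in-memory counter, as in the sketch below. The class `MaliciousMarkRegistry` and its methods are hypothetical, and whether a given recognition result triggers an increment is left to the caller through the `condition_met` flag.

```python
from collections import Counter


class MaliciousMarkRegistry:
    """Tracks how many times each caller has been marked as malicious."""

    def __init__(self) -> None:
        self._marks: Counter = Counter()

    def record(self, caller_id: str, condition_met: bool) -> int:
        """Add 1 to the caller's count if the triggering condition was met."""
        if condition_met:
            self._marks[caller_id] += 1
        return self._marks[caller_id]

    def times_marked(self, caller_id: str) -> int:
        """Return the count that would be sent to other users as a warning."""
        return self._marks[caller_id]
```

A production system would persist these counts rather than hold them in memory; that detail is outside what the text describes.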
As an example, suppose the second user is a salesperson who has made many calls to unfamiliar numbers during the day. When the second user uses the second terminal to call the first terminal and pitch a product to the first user, the second user, being a salesperson, speaks in a mild tone and often says things like "our product XXX", "our product is excellent", "the original price is XXX", "buy now and you get a discount of XXX", and "you will not regret buying our product". After the first terminal answers, the server can determine the call event between the first user and the second user and recognize it using the method above. The final recognition result is that the call event is low risk, so the server sends a reminder short message to the first user. On receiving it, the first user will think carefully before deciding whether to keep talking with the second user, make a purchase, or disclose identity information to the second user. This can prevent the first user from being defrauded.
Step S305: The server sends the feature parameters of the first call event to the offline model building module in the server, and the offline model building module takes the feature parameters of the first call event as the feature parameters of the sample.
The server may perform step S305 once it has extracted the feature parameters of the first call event.
As shown in FIG. 3A, the offline model building module may add the feature parameters of the first call event to the training set as the feature parameters of one sample.
Step S306: The server determines the sample type of the sample.
As shown in FIG. 3A, the sample type of the sample (that is, of the first call event) may be determined manually, or automatically by the offline model building module in the server; in either case the determination is whether the sample is a positive sample or a negative sample.
Step S307: The server obtains the training result output by a configured training model according to the feature parameters of the sample, where the training model uses the feature parameters as classification parameters; determines whether the training result matches the sample type of the sample; and, if the training result does not match the sample type of the sample, adjusts the model parameters of the training model until the training result matches the sample type of the sample, thereby obtaining a training model whose training result matches the sample type of the sample.
Step S308: The server uses the training model whose training result matches the sample type of the sample as the preset recognition model.
The offline model shown in FIG. 3A is the training model whose results match the sample types of the samples. In this way, the offline model building module can continuously obtain the feature parameters of call events from the server in which it resides and use them as samples for machine-learning model training. Model parameters are adjusted automatically as call voice and call behavior change, so the model evolves on its own and avoids the frequent manual parameter tuning that rule-based approaches require.
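Steps S307 and S308 describe an adjust-until-consistent loop. The sketch below illustrates that loop with a simple perceptron. The description does not specify the model type, so the perceptron, the learning rate, and the names `predict` and `train_until_consistent` are assumptions chosen only to make the loop concrete.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], bool]  # (feature parameters, is positive sample)


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> bool:
    """Training result: True means the sample is classified as a malicious call."""
    return bias + sum(w * x for w, x in zip(weights, features)) > 0


def train_until_consistent(
    samples: Sequence[Sample],
    learning_rate: float = 0.1,
    max_epochs: int = 1000,
) -> Tuple[List[float], float]:
    """Adjust model parameters until every training result matches its sample type."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, is_malicious in samples:
            # Step S307: compare the training result with the sample type.
            if predict(weights, bias, features) != is_malicious:
                direction = 1.0 if is_malicious else -1.0
                weights = [w + learning_rate * direction * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * direction
                mistakes += 1
        if mistakes == 0:
            # Step S308: this model would be used as the preset recognition model.
            return weights, bias
    raise RuntimeError("training results did not match all sample types")
```

A perceptron only reaches full consistency when the samples are linearly separable, which is why the sketch caps the number of epochs; a real system would more likely stop at an acceptable error rate.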
In other embodiments of the present invention, the recognition model currently preset in the server may instead be built by a first device using the recognition-model building method described in the foregoing embodiments and then sent to the server. That is, the offline model building module described above as residing in the server is instead provided in the first device, where the first device is another device capable of communicating with the server (it may be the first terminal or the second terminal). Steps S305 to S308 may also be implemented in the first device. In that case, step S305 includes: the server sends the feature parameters of the first call event to the first device, and the first device takes the feature parameters of the first call event as the feature parameters of the sample. The first device then performs steps S306 and S307, and outputs the training model whose training result matches the sample type of the sample to the server as the currently preset recognition model.
As an example, the first device is the first terminal 11. As shown in FIG. 3B, the first terminal 11 builds the recognition model using the recognition-model building method described in the foregoing embodiments and sends it to the server 13. After the server extracts features from the first call event and performs malicious-call recognition with the recognition model, it sends the recognition result of the first call event to the first terminal and/or the second terminal. Of course, after extracting features from the first call event, the server 13 also sends the feature parameters of the first call event to the first terminal 11; the first terminal 11 may use them as the feature parameters of the sample to train the current recognition model, and then send the updated recognition model to the server.
As another example, the first device is neither the first terminal nor the second terminal, but some other device capable of communicating with the server 13. In this case, as shown in FIG. 3C, the first device 14 builds the recognition model using the recognition-model building method described in the foregoing embodiments and sends it to the server 13. After the server 13 performs malicious-call recognition using the scheme of this embodiment, it sends the recognition result of the first call event to the first terminal 11 and/or the second terminal 12. Of course, after extracting features from the first call event, the server 13 also sends the feature parameters of the first call event to the first device 14, and the first device 14 may use them as the feature parameters of the sample to train the current recognition model.
In other embodiments of the present invention, the computing device may also be implemented as the first terminal. In that case, step S304 is replaced with: the first terminal displays the recognition result of the first call event on the display interface of the first terminal. Of course, the computing device may also be implemented as the second terminal, with the same implementation process as for the first terminal. As an example, the computing device is the first terminal 11. As shown in FIG. 3D, the server builds the recognition model using the recognition-model building method described in the foregoing embodiments and sends it to the first terminal 11. The first terminal 11 extracts the feature parameters of the first call event, performs malicious-call recognition according to the recognition model, and displays the recognition result of the first call event on the display interface of the first terminal. Of course, after extracting features from the first call event, the first terminal 11 also sends the feature parameters of the first call event to the server 13, and the server 13 may use them as the feature parameters of the sample to train the current recognition model.
In this embodiment of the present invention, a preliminary recognition is first performed using call-behavior features. When the preliminary recognition result satisfies the first preset condition, the call event is then recognized using the first feature parameter, which describes its voice features. In this way, call events that do not satisfy the first preset condition are screened out in advance, which speeds up recognition, while the final recognition of a malicious call event always relies on the feature parameter describing voice features, which ensures the accuracy of malicious-call recognition.
Based on the foregoing embodiments, an embodiment of the present invention provides an apparatus for identifying malicious calls. The units included in the apparatus, and the modules included in each unit, may be implemented by a processor in the apparatus, or of course by logic circuits. In implementation, the processor may be a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
FIG. 4 is a schematic structural diagram of an apparatus for identifying malicious calls according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes a first acquiring unit 401, a recognition unit 402, a second acquiring unit 403, and a first output unit 404, where:
The first acquiring unit 401 is configured to obtain the feature parameters of a first call event, where the first call event is a call event between a first user and a second user, and the feature parameters include parameters describing call voice features. The parameters describing call voice features include at least one of: waveform feature parameters of the call voice, and the number and the probability of first keywords in the text corresponding to the call voice.
The recognition unit 402 is configured to recognize the first call event according to the feature parameters of the first call event and the currently preset recognition model, where the recognition model uses the feature parameters as classification parameters.
The second acquiring unit 403 is configured to obtain the recognition result of the first call event output by the recognition model.
The first output unit 404 is configured to output the recognition result of the first call event.
The first acquiring unit 401 includes an acquiring module and an extraction module, where the acquiring module is configured to obtain the call voice information of the first call event, and the extraction module is configured to extract the feature parameters from the call voice information of the first call event.
The extraction module is configured to extract the waveform of the call voice from the call voice information of the first call event, where the waveform is a time-domain waveform or a frequency-domain waveform; and to extract the waveform feature parameters of the waveform, where the waveform feature parameters include at least one of: peak amplitude, trough amplitude, average waveform amplitude, peak position, and trough position.
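The waveform feature parameters listed above can be computed from a sampled waveform as in the sketch below. It assumes the waveform has already been extracted as a sequence of amplitude samples, takes "peak" and "trough" to mean the global maximum and minimum, and takes the average amplitude to be the mean absolute value; the description fixes none of these choices, and the function name `waveform_features` is hypothetical.

```python
from typing import Dict, Sequence, Union

Number = Union[int, float]


def waveform_features(samples: Sequence[float]) -> Dict[str, Number]:
    """Compute peak/trough amplitude and position plus the average amplitude."""
    if not samples:
        raise ValueError("waveform must contain at least one sample")
    indices = range(len(samples))
    # Index of the first global maximum and first global minimum.
    peak_position = max(indices, key=samples.__getitem__)
    trough_position = min(indices, key=samples.__getitem__)
    return {
        "peak_amplitude": samples[peak_position],
        "trough_amplitude": samples[trough_position],
        # Assumption: "average amplitude" is the mean of absolute sample values.
        "mean_amplitude": sum(abs(s) for s in samples) / len(samples),
        "peak_position": peak_position,
        "trough_position": trough_position,
    }
```

The same function applies to a frequency-domain waveform if `samples` holds magnitudes per frequency bin, in which case the positions are bin indices rather than sample times.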
The extraction module is configured to perform speech recognition on the call voice information of the first call event to obtain the text corresponding to the call voice; extract the text keywords from the text; and compare the text keywords with preset first keywords to determine the number or the probability of first keywords among the text keywords.
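The comparison step can be sketched as follows, assuming speech recognition and keyword extraction have already produced a list of text keywords. The function name `first_keyword_stats` is hypothetical, and "probability" is interpreted here as the fraction of text keywords that are first keywords, which is one plausible reading rather than a definition from the description.

```python
from typing import Iterable, Sequence, Tuple


def first_keyword_stats(
    text_keywords: Sequence[str],
    first_keywords: Iterable[str],
) -> Tuple[int, float]:
    """Return (count, fraction) of text keywords that are preset first keywords."""
    preset = set(first_keywords)
    count = sum(1 for keyword in text_keywords if keyword in preset)
    # Assumption: probability = matching keywords / all text keywords.
    probability = count / len(text_keywords) if text_keywords else 0.0
    return count, probability
```

For instance, with text keywords `["transfer", "account", "weather", "password"]` and first keywords `["transfer", "password", "verification code"]`, two of the four text keywords match.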
The parameters describing call voice features are the first feature parameter, and the feature parameters further include a second feature parameter describing the call-behavior features of the first user.
In other embodiments of the present invention, the apparatus further includes a collection unit and a third determining unit, where the collection unit is configured to collect the first call behavior of the first user and the second user within a first preset time period, and the third determining unit is configured to determine, according to that first call behavior, whether the first user is a suspicious user. Correspondingly, if the first user is not a suspicious user, the second feature parameter includes at least one of the following, measured within a second preset time period: the number and average duration of calls with users marked as malicious, the number of calls with unfamiliar users, and the number of calls with overseas users. If the first user is a suspicious user, the second feature parameter includes at least one of the following, measured within a third preset time period: the number and average duration of calls with unfamiliar users, and the number of calls with overseas users.
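The choice of second feature parameters can be sketched as below. The record layout `CallRecord`, the dictionary keys, and the function `second_feature_parameters` are all hypothetical; the caller is assumed to pass only records that fall within the applicable preset time period.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class CallRecord:
    duration_s: float
    peer_is_marked_malicious: bool
    peer_is_stranger: bool
    peer_is_overseas: bool


def _count_and_mean(durations: List[float]) -> Tuple[int, float]:
    n = len(durations)
    return n, (sum(durations) / n if n else 0.0)


def second_feature_parameters(
    records: Iterable[CallRecord], user_is_suspicious: bool
) -> Dict[str, float]:
    """Build call-behavior features, depending on whether the user is suspicious."""
    records = list(records)
    overseas_calls = sum(1 for r in records if r.peer_is_overseas)
    stranger_durations = [r.duration_s for r in records if r.peer_is_stranger]
    if user_is_suspicious:
        n, mean = _count_and_mean(stranger_durations)
        return {"stranger_calls": n, "stranger_mean_duration_s": mean,
                "overseas_calls": overseas_calls}
    malicious_durations = [r.duration_s for r in records if r.peer_is_marked_malicious]
    n, mean = _count_and_mean(malicious_durations)
    return {"malicious_calls": n, "malicious_mean_duration_s": mean,
            "stranger_calls": len(stranger_durations),
            "overseas_calls": overseas_calls}
```

The text requires only "at least one of" these values, so an implementation could return any subset; the sketch returns all of them for simplicity.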
In other embodiments of the present invention, the recognition model includes a first sub-recognition model and a second sub-recognition model, and the recognition unit includes a first recognition module and a second recognition module. The first recognition module is configured to recognize the first call event according to the second feature parameter and the first sub-recognition model, and to obtain the initial recognition result of the first call event output by the first sub-recognition model, where the first sub-recognition model uses the second feature parameter as a classification parameter. The second recognition module is configured, when the initial recognition result satisfies the first preset condition, to recognize the first call event according to the feature parameters of the first call event and the second sub-recognition model, where the second sub-recognition model uses the feature parameters as classification parameters; or to recognize the first call event according to the first feature parameter of the first call event and the second sub-recognition model, where the second sub-recognition model uses the first feature parameter as a classification parameter. Correspondingly, the second acquiring unit 403 is configured to obtain the recognition result of the first call event output by the second sub-recognition model.
In other embodiments of the present invention, the apparatus further includes a first determining unit, where the first determining unit is configured to determine the reminder instruction for the first call event according to the recognition result of the first call event. Correspondingly, the first output unit is further configured to output the recognition result of the first call event to a terminal according to the reminder instruction for the first call event, where the terminal includes the first terminal corresponding to the first user and the second terminal corresponding to the second user.
The first determining unit is configured to: when the recognition result of the first call event satisfies the second preset condition, determine that the reminder instruction for the first call event is the first reminder instruction, which indicates that the recognition result of the first call event is not to be output to the terminal; when the recognition result satisfies the third preset condition, determine that the reminder instruction is the second reminder instruction, which indicates that a short message carrying the recognition result of the first call event is to be sent to the terminal; and when the recognition result satisfies the fourth preset condition, determine that the reminder instruction is the third reminder instruction, which indicates that a call is to be initiated to the terminal and that, after the terminal answers, the first terminal is to be notified of the recognition result of the first call event. The terminal includes the first terminal corresponding to the first user and the second terminal corresponding to the second user. Correspondingly, the first output unit 404 is configured to output the recognition result of the first call event to the terminal according to the reminder instruction for the first call event.
In other embodiments of the present invention, the first output unit 404 is further configured to display the recognition result of the first call event on the display interface of the first terminal, where the first terminal corresponds to the first user. In other embodiments of the present invention, the apparatus further includes a third output unit, where the third output unit is configured to send the feature parameters of the first call event to the first device.
It should be noted that the description of the above apparatus embodiment is similar to the description of the method embodiments above and has similar beneficial effects. For technical details not disclosed in the apparatus embodiment of the present invention, refer to the description of the method embodiments of the present invention.
Based on the foregoing embodiments, an embodiment of the present invention provides an apparatus for establishing a malicious model. The units included in the apparatus, and the modules included in each unit, may be implemented by a processor in the apparatus, or of course by logic circuits. In implementation, the processor may be a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
FIG. 5 is a schematic structural diagram of an apparatus for establishing a malicious model according to an embodiment of the present invention. As shown in FIG. 5, the apparatus includes a second determining unit 501, a third acquiring unit 502, a training unit 503, a judging unit 504, an adjusting unit 505, and a second output unit 506, where:
The second determining unit 501 is configured to determine the sample type of the sample, where the sample type is either positive or negative, a positive sample being one that belongs to malicious calls and a negative sample being one that does not.
The third acquiring unit 502 is configured to obtain the feature parameters of the sample, where the feature parameters include parameters describing call voice features, and the parameters describing call voice features include at least one of: waveform feature parameters of the call voice, and the number and the probability of first keywords in the text corresponding to the call voice.
The training unit 503 is configured to obtain the training result output by a configured training model according to the feature parameters of the sample, where the training model uses the feature parameters as classification parameters.
The judging unit 504 is configured to determine whether the training result matches the sample type of the sample.
The adjusting unit 505 is configured, when the training result does not match the sample type of the sample, to adjust the model parameters of the training model until the training result matches the sample type of the sample.
The second output unit 506 is configured to output the training model whose training result matches the sample type of the sample as the preset recognition model.
In this embodiment of the present invention, the first acquiring unit is further configured to receive the feature parameters of the first call event and to use the feature parameters of the first call event as the feature parameters of the sample.
It should be noted that the description of the above apparatus embodiment is similar to the description of the method embodiments above and has similar beneficial effects. For technical details not disclosed in the apparatus embodiment of the present invention, refer to the description of the method embodiments of the present invention.
基于前述的实施例,本发明实施例提供一种识别恶意电话的设备,该设备可以实现为服务器,图6为本发明实施例服务器的组成结构示意图, 如图6所示,该识别恶意电话的设备包括和第一处理器601和第一外部通信接口602,其中:Based on the foregoing embodiments, the embodiment of the present invention provides a device for identifying a malicious phone, and the device may be implemented as a server. FIG. 6 is a schematic structural diagram of a server according to an embodiment of the present invention. As shown in FIG. 6, the device for identifying a malicious phone includes a first processor 601 and a first external communication interface 602, wherein:
所述第一处理器601,配置为获取第一通话事件的特征参数,所述第一通话事件为第一用户与第二用户之间的通话事件,所述特征参数包括用于描述通话语音特征的参数,其中,所述描述通话语音特征的参数包括:通话语音的波形特征参数、通话语音对应的文本中第一关键字的个数和概率中的至少一个;根据所述第一通话事件的特征参数和当前预设的识别模型,对所述第一通话事件进行识别,所述识别模型以所述特征参数为分类参数;获取所述识别模型识别出的所述第一通话事件的识别结果;通过所述第一外部通信接口602输出所述第一通话事件的识别结果。The first processor 601 is configured to acquire a feature parameter of the first call event, where the first call event is a call event between the first user and the second user, and the feature parameter includes a feature for describing the call voice. The parameter, wherein the parameter describing the voice feature of the call includes: at least one of a waveform feature parameter of the call voice, a number of the first keyword in the text corresponding to the call voice, and a probability; according to the first call event Identifying the first call event by using a feature parameter and a current preset recognition model, wherein the recognition model uses the feature parameter as a classification parameter; and acquiring a recognition result of the first call event identified by the recognition model Transmitting, by the first external communication interface 602, a recognition result of the first call event.
所述识别恶意电话的设备还可以实现为第一终端或第二终端,此时,所述识别恶意电话的设备包括第一处理器和显示屏,其中:所述第一处理器,配置为获取第一通话事件的特征参数,所述第一通话事件为第一用户与第二用户之间的通话事件,所述特征参数包括用于描述通话语音特征的参数;根据所述第一通话事件的特征参数和当前预设的识别模型,对所述第一通话事件进行识别,所述识别模型以所述特征参数为分类参数;获取所述识别模型识别出的所述第一通话事件的识别结果;通过所述显示屏显示所述第一通话事件的识别结果。所述显示屏用于显示所述第一通话事件的识别结果。The device for identifying a malicious phone may also be implemented as a first terminal or a second terminal. In this case, the device for identifying a malicious phone includes a first processor and a display screen, wherein: the first processor is configured to acquire a characteristic parameter of the first call event, the first call event is a call event between the first user and the second user, and the feature parameter includes a parameter for describing a call voice feature; according to the first call event Identifying the first call event by using a feature parameter and a current preset recognition model, wherein the recognition model uses the feature parameter as a classification parameter; and acquiring a recognition result of the first call event identified by the recognition model Displaying the recognition result of the first call event through the display screen. The display screen is configured to display a recognition result of the first call event.
It should be noted that the above description of the device embodiment is similar to the description of the method above and has the same beneficial effects as the method embodiments. For technical details not disclosed in the device embodiments of the present invention, those skilled in the art should refer to the description of the method embodiments of the present invention.
Based on the foregoing embodiments, an embodiment of the present invention provides a device for establishing a malicious model. The device for establishing a malicious model may be implemented as a server, a first terminal, or a second terminal. FIG. 7 is a schematic structural diagram of the device for establishing a malicious model according to an embodiment of the present invention. As shown in FIG. 7, the device includes a second processor 701 and a second external communication interface 702, where:
The second processor 701 is configured to: determine the sample type of the sample, where the sample type includes positive samples and negative samples, a positive sample being a sample that belongs to malicious calls and a negative sample being a sample that does not belong to malicious calls; acquire feature parameters of the sample, where the feature parameters include parameters for describing call voice features, the parameters describing call voice features including at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords in that text; obtain, according to the feature parameters of the sample and a configured training model, the training result output by the training model, where the training model uses the feature parameters as classification parameters; determine whether the training result conforms to the sample type of the sample; and, if the training result does not satisfy the sample type of the sample, adjust the model parameters of the training model until the training result satisfies the sample type of the sample, and output, through the second external communication interface 702, the training model whose training result satisfies the sample type of the sample as the preset recognition model.
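The train-compare-adjust loop described above can be illustrated with a small sketch. The text does not name a specific training model, so a simple perceptron is used here purely as a stand-in assumption; the function names `predict` and `train`, the learning rate, and the epoch limit are all hypothetical. Each sample is a pair of feature parameters and a sample type (1 for a positive sample, 0 for a negative sample), and the model parameters are adjusted whenever the training result does not match the sample type.

```python
from typing import List, Sequence, Tuple

# (feature parameters, sample type): 1 = positive (malicious), 0 = negative.
Sample = Tuple[Sequence[float], int]


def predict(weights: Sequence[float], bias: float,
            features: Sequence[float]) -> int:
    """Training result of the stand-in model for one sample."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(samples: Sequence[Sample], learning_rate: float = 0.1,
          max_epochs: int = 1000) -> Tuple[List[float], float]:
    """Adjust model parameters until every result matches its sample type.

    The loop stops early once no sample is misclassified, or after
    max_epochs passes if the samples are not linearly separable.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mismatches = 0
        for features, sample_type in samples:
            error = sample_type - predict(weights, bias, features)
            if error != 0:
                # The training result does not satisfy the sample type:
                # adjust the model parameters (perceptron update rule).
                mismatches += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mismatches == 0:
            break
    return weights, bias
```

The returned parameters play the role of the preset recognition model that is output once the training results agree with the sample types. A real system would likely use a more expressive classifier and a held-out evaluation set.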
It should be noted that the above description of the device embodiment is similar to the description of the method above and has the same beneficial effects as the method embodiments. For technical details not disclosed in the device embodiments of the present invention, those skilled in the art should refer to the description of the method embodiments of the present invention.
It should be noted that, in the embodiments of the present invention, if the above method for identifying a malicious call and establishing a recognition model is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc. Thus, the embodiments of the present invention are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, where the computer-executable instructions are used to perform the method for identifying a malicious call and establishing a recognition model in the embodiments of the present invention.
It should be understood that "one embodiment" or "an embodiment" mentioned throughout the specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present invention. Therefore, "in one embodiment" or "in an embodiment" appearing in various places throughout the specification does not necessarily refer to the same embodiment. In addition, these particular features, structures, or characteristics may be combined in one or more embodiments in any suitable manner. It should be understood that, in the various embodiments of the present invention, the magnitude of the sequence numbers of the above processes does not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention. The sequence numbers of the above embodiments are for description only and do not represent the merits of the embodiments. It should be noted that, herein, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other ways of dividing, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, each unit may serve separately as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as a removable storage device, a read-only memory (ROM), a magnetic disk, or an optical disc. Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media capable of storing program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disc.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.
Industrial Applicability
In the embodiments of the present invention, parameters describing call voice features are used as the identification criterion. Because the tone and wording that a malicious user adopts in malicious calls such as sales promotion and fraud do not change arbitrarily, malicious call events can be identified accurately, and the recognition result is output to warn the user against fraud, which can greatly reduce users' economic losses. In addition, establishing the recognition model requires continuously training the training model and continuously adjusting its model parameters according to the training results, so that the precision and recall of the final training model on the samples are optimized, thereby improving the accuracy of identifying malicious calls.
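The quality measure referred to above (准召率 in the original, usually read as precision and recall) can be computed from the recognition results on labeled samples. The sketch below is an illustration under assumptions: the function name `precision_recall` is hypothetical, and labels are encoded as 1 for a malicious call and 0 otherwise.

```python
from typing import Sequence, Tuple


def precision_recall(predicted: Sequence[int],
                     actual: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels, 1 = malicious call."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p == 1 and a == 1)
    false_pos = sum(1 for p, a in pairs if p == 1 and a == 0)
    false_neg = sum(1 for p, a in pairs if p == 0 and a == 1)
    # Precision: share of calls flagged as malicious that really are.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of truly malicious calls that were flagged.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall
```

Tracking both values matters here because a model that flags every call achieves perfect recall but poor precision, while an overly cautious model does the opposite.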

Claims (23)

  1. A method for identifying a malicious call, the method comprising:
    acquiring feature parameters of a first call event, wherein the first call event is a call event between a first user and a second user, the feature parameters include parameters for describing call voice features, and the parameters describing call voice features include at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords in that text;
    identifying the first call event according to the feature parameters of the first call event and a currently preset recognition model, wherein the recognition model uses the feature parameters as classification parameters;
    acquiring a recognition result of the first call event identified by the recognition model; and
    outputting the recognition result of the first call event.
  2. The method according to claim 1, wherein the method further comprises:
    determining the first call event;
    wherein acquiring the feature parameters of the first call event comprises: extracting the feature parameters of the first call event.
  3. The method according to claim 1, wherein acquiring the feature parameters of the first call event comprises:
    receiving, by a first terminal, the feature parameters of the first call event sent by a server, wherein the first terminal corresponds to the first user.
  4. The method according to claim 1, wherein the parameters for describing call voice features are first feature parameters, and the feature parameters further include second feature parameters for describing call behavior features of the first user.
  5. The method according to claim 4, wherein the recognition model comprises a first sub-recognition model and a second sub-recognition model, and identifying the first call event according to the feature parameters of the first call event and the currently preset recognition model comprises:
    identifying the first call event according to the second feature parameters and the first sub-recognition model, wherein the first sub-recognition model uses the second feature parameters as classification parameters;
    acquiring an initial recognition result of the first call event identified by the first sub-recognition model;
    acquiring the first feature parameters of the first call event when the initial recognition result satisfies a first preset condition; and
    identifying the first call event according to the feature parameters of the first call event and the second sub-recognition model, wherein the second sub-recognition model uses the feature parameters as classification parameters; or identifying the first call event according to the first feature parameters of the first call event and the second sub-recognition model, wherein the second sub-recognition model uses the first feature parameters as classification parameters;
    correspondingly, acquiring the recognition result of the first call event identified by the recognition model comprises: acquiring the recognition result of the first call event identified by the second sub-recognition model.
  6. The method according to claim 2, wherein extracting the feature parameters of the first call event comprises:
    acquiring call voice information of the first call event; and
    extracting the feature parameters from the call voice information of the first call event.
  7. The method according to claim 6, wherein extracting the feature parameters from the call voice information of the first call event comprises:
    extracting a waveform of the call voice from the call voice information of the first call event, wherein the waveform includes a time-domain waveform or a frequency-domain waveform; and
    extracting waveform feature parameters of the waveform, wherein the waveform feature parameters include at least one of: a peak amplitude value, a trough amplitude value, an average waveform amplitude, a peak position, and a trough position.
  8. The method according to claim 6, wherein extracting the feature parameters from the call voice information of the first call event comprises:
    performing speech recognition on the call voice information of the first call event to obtain the text corresponding to the call voice;
    extracting text keywords from the text; and
    comparing the text keywords with preset first keywords to determine the number or probability of the first keywords among the text keywords.
  9. The method according to claim 4, wherein the method further comprises:
    collecting first call behavior of the first user and the second user within a first preset time period; and
    determining, according to the first call behavior of the first user and the second user within the first preset time period, whether the first user is a suspicious user;
    correspondingly, if the first user is not a suspicious user, the second feature parameters include, within a second preset time period, at least one of: the number of calls and the average call duration with users marked as malicious users, the number of calls with unfamiliar users, and the number of calls with overseas users; and if the first user is a suspicious user, the second feature parameters include, within a third preset time period, at least one of: the number of calls and the average call duration with unfamiliar users, and the number of calls with overseas users.
  10. The method according to claim 9, wherein, when the first user is not a suspicious user, before outputting the recognition result of the first call event, the method further comprises:
    determining, by a server, a reminder instruction for the first call event according to the recognition result of the first call event;
    correspondingly, outputting the recognition result comprises:
    outputting, by the server, the recognition result of the first call event to a first terminal according to the reminder instruction for the first call event, wherein the first terminal corresponds to the first user.
  11. The method according to claim 10, wherein determining, by the server, the reminder instruction for the first call event according to the recognition result of the first call event comprises:
    when the recognition result of the first call event satisfies a second preset condition, determining, by the server, that the reminder instruction for the first call event is a first reminder instruction, wherein the first reminder instruction indicates that the recognition result of the first call event is not to be output to the terminal;
    when the recognition result of the first call event satisfies a third preset condition, determining, by the server, that the reminder instruction for the first call event is a second reminder instruction, wherein the second reminder instruction indicates that a short message carrying the recognition result of the first call event is to be sent to the first terminal; and
    when the recognition result of the first call event satisfies a fourth preset condition, determining, by the server, that the reminder instruction for the first call event is a third reminder instruction, wherein the third reminder instruction indicates that a call is to be initiated to the first terminal and that, after the first terminal answers, the first terminal is to be notified of the recognition result of the first call event.
  12. The method according to any one of claims 1 to 11, wherein outputting the recognition result comprises:
    displaying, by a first terminal, the recognition result of the first call event on a display interface of the first terminal, wherein the first terminal corresponds to the first user.
  13. The method according to claim 1 or 2, wherein the method further comprises:
    sending the feature parameters of the first call event to a first device.
  14. The method according to any one of claims 1 to 11, wherein the method further comprises:
    determining the sample type of the sample, wherein the sample type includes positive samples and negative samples, a positive sample being a sample that belongs to malicious calls and a negative sample being a sample that does not belong to malicious calls;
    acquiring feature parameters of the sample;
    obtaining, according to the feature parameters of the sample and a configured training model, a training result output by the training model, wherein the training model uses the feature parameters as classification parameters;
    determining whether the training result conforms to the sample type of the sample; and
    if the training result does not satisfy the sample type of the sample, adjusting the model parameters of the training model until the training result satisfies the sample type of the sample, and outputting the training model whose training result satisfies the sample type of the sample as the preset recognition model.
  15. The method according to claim 14, wherein the parameters for describing call voice features are first feature parameters, and the feature parameters further include second feature parameters for describing call behavior features; the training model comprises a first sub-training model and a second sub-training model, and the method further comprises:
    identifying the sample according to the second feature parameters and the first sub-training model, wherein the first sub-training model uses the second feature parameters as classification parameters; acquiring a first training result of the sample output by the first sub-training model; and, when the first training result does not satisfy the sample type of the sample, adjusting the model parameters of the first training model until the first training result satisfies the sample type of the sample;
    identifying the sample according to third feature parameters and the second sub-training model, wherein the second sub-training model uses the third feature parameters as classification parameters, and the third feature parameters are the second feature parameters or the feature parameters; acquiring a second sub-training result output by the second sub-training model; and, when the second sub-training result does not satisfy the sample type of the sample, adjusting the model parameters of the second sub-training model until the second training result satisfies the sample type of the sample; and
    outputting the first sub-training model whose first training result satisfies the sample type of the sample as a preset first sub-recognition model, and outputting the second sub-training model whose second training result satisfies the sample type of the sample as a preset second sub-recognition model.
  16. A method for establishing a recognition model, wherein the method comprises:
    determining the sample type of the sample, wherein the sample type includes positive samples and negative samples, a positive sample being a sample that belongs to malicious calls and a negative sample being a sample that does not belong to malicious calls;
    acquiring feature parameters of the sample, wherein the feature parameters include parameters for describing call voice features, and the parameters describing call voice features include at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords in that text;
    obtaining, according to the feature parameters of the sample and a configured training model, a training result output by the training model, wherein the training model uses the feature parameters as classification parameters;
    determining whether the training result conforms to the sample type of the sample; and
    if the training result does not satisfy the sample type of the sample, adjusting the model parameters of the training model until the training result satisfies the sample type of the sample, and outputting the training model whose training result satisfies the sample type of the sample as the preset recognition model.
  17. An apparatus for identifying a malicious call, wherein the apparatus comprises a first acquiring unit, an identifying unit, a second acquiring unit, and a first output unit, wherein:
    the first acquiring unit is configured to acquire feature parameters of a first call event, wherein the first call event is a call event between a first user and a second user, the feature parameters include parameters for describing call voice features, and the parameters describing call voice features include at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords in that text;
    the identifying unit is configured to identify the first call event according to the feature parameters of the first call event and a currently preset recognition model, wherein the recognition model uses the feature parameters as classification parameters;
    the second acquiring unit is configured to acquire a recognition result of the first call event identified by the recognition model; and
    the first output unit is configured to output the recognition result of the first call event.
  18. The apparatus according to claim 17, wherein the parameters for describing call voice features are first feature parameters, the feature parameters further include second feature parameters for describing call behavior features of the first user, the recognition model comprises a first sub-recognition model and a second sub-recognition model, and the identifying unit comprises a first identifying module and a second identifying module, wherein:
    the first identifying module is configured to identify the first call event according to the second feature parameters and the first sub-recognition model, and to acquire an initial recognition result of the first call event identified by the first sub-recognition model, wherein the first sub-recognition model uses the second feature parameters as classification parameters;
    the second identifying module is configured to, when the initial recognition result satisfies a first preset condition, identify the first call event according to the feature parameters of the first call event and the second sub-recognition model, wherein the second sub-recognition model uses the feature parameters as classification parameters; or identify the first call event according to the first feature parameters of the first call event and the second sub-recognition model, wherein the second sub-recognition model uses the first feature parameters as classification parameters;
    correspondingly, the second acquiring unit is configured to acquire the recognition result of the first call event identified by the second sub-recognition model.
  19. An apparatus for establishing a malicious model, the apparatus comprising a second determining unit, a third acquiring unit, a training unit, a judging unit, an adjusting unit, and a second output unit, wherein:
    the second determining unit is configured to determine the sample type of a sample, the sample type including positive samples and negative samples, a positive sample being a sample that belongs to malicious calls and a negative sample being a sample that does not belong to malicious calls;
    the third acquiring unit is configured to acquire feature parameters of the sample, wherein the parameters describing call voice features include at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords in that text;
    the training unit is configured to obtain, according to the feature parameters of the sample and a configured training model, a training result output by the training model, the training model taking the feature parameters as classification parameters;
    the judging unit is configured to judge whether the training result matches the sample type of the sample;
    the adjusting unit is configured to, when the training result does not match the sample type of the sample, adjust the model parameters of the training model until the training result matches the sample type of the sample;
    the second output unit is configured to output, as the preset identification model, the training model whose training result matches the sample type of the sample.
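The train, judge, adjust, and output steps recited in claim 19 can be illustrated with a minimal sketch. The claim does not name a model family or an update rule, so the linear classifier, the perceptron-style update, and the names `predict`, `train`, `lr`, and `max_epochs` below are all assumptions chosen only for illustration.

```python
from typing import List, Tuple

# (feature parameters, sample type): 1 = positive (malicious), 0 = negative
Sample = Tuple[List[float], int]


def predict(weights: List[float], bias: float, features: List[float]) -> int:
    """Training result of a hypothetical linear model for one sample."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(samples: List[Sample], lr: float = 0.1,
          max_epochs: int = 1000) -> Tuple[List[float], float]:
    """Adjust model parameters until every training result matches its sample type.

    Assumption: a perceptron update stands in for the unspecified adjustment.
    The loop also stops after max_epochs so it terminates on non-separable data.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mismatches = 0
        for features, sample_type in samples:
            result = predict(weights, bias, features)      # training unit
            if result != sample_type:                      # judging unit
                delta = lr * (sample_type - result)        # adjusting unit
                weights = [w + delta * x for w, x in zip(weights, features)]
                bias += delta
                mismatches += 1
        if mismatches == 0:
            break                                          # results match all sample types
    return weights, bias                                   # output as the identification model
```

The returned parameters play the role of the preset identification model; a real system would more likely use a held-out validation set and a stopping criterion looser than perfect agreement on the training samples.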
  20. A device for identifying a malicious call, the device comprising a first processor and a first external communication interface, or the device comprising a first processor and a display screen, wherein:
    the first processor is configured to: acquire feature parameters of a first call event, the first call event being a call event between a first user and a second user, the feature parameters including parameters for describing call voice features, wherein the parameters describing call voice features include at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords in that text; identify the first call event according to the feature parameters of the first call event and a currently preset identification model, the identification model taking the feature parameters as classification parameters; acquire the identification result of the first call event identified by the identification model; and output the identification result of the first call event through the first external communication interface, or display the identification result of the first call event on the display screen.
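As an illustration of the keyword-based voice features named in claim 20, the sketch below derives a keyword count and a keyword probability from a tokenized transcript. The claim does not define how the probability is computed; treating it as the relative frequency of keyword tokens is an assumption, and the function name `keyword_features` and its parameters are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def keyword_features(transcript_tokens: Sequence[str],
                     first_keywords: Iterable[str]) -> Tuple[int, float]:
    """Return (count, probability) of first keywords in a call transcript.

    Assumption: "probability" is taken to be the fraction of transcript
    tokens that are first keywords; the source does not define it.
    """
    keywords = set(first_keywords)
    count = sum(1 for token in transcript_tokens if token in keywords)
    probability = count / len(transcript_tokens) if transcript_tokens else 0.0
    return count, probability


# Hypothetical transcript and keyword list, for illustration only.
tokens = ["please", "transfer", "money", "to", "the", "safe", "account"]
count, probability = keyword_features(tokens, {"transfer", "account", "password"})
```

Both values could then be passed to the identification model as classification parameters alongside any waveform features.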
  21. The device according to claim 20, wherein the parameter for describing call voice features is a first feature parameter, the feature parameters further include a second feature parameter for describing call behavior features of the first user, and the identification model includes a first sub-identification model and a second sub-identification model, in which case:
    the first processor is configured to: identify the first call event according to the second feature parameter and the first sub-identification model, and acquire an initial identification result of the first call event identified by the first sub-identification model, the first sub-identification model taking the second feature parameter as a classification parameter; when the initial identification result satisfies a first preset condition, identify the first call event according to the feature parameters of the first call event and the second sub-identification model, and acquire the identification result of the first call event identified by the second sub-identification model, the second sub-identification model taking the feature parameters as classification parameters; or identify the first call event according to the first feature parameter of the first call event and the second sub-identification model, and acquire the identification result of the first call event identified by the second sub-identification model, the second sub-identification model taking the first feature parameter as a classification parameter.
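The two-stage identification in claim 21 can be sketched as a cascade: a first sub-model scores the call-behavior features, and the second sub-model runs only when that initial result satisfies the first preset condition. The thresholds, the score-returning callables, and the choice to report "not malicious" when the condition is not met are assumptions; the claim itself leaves them open.

```python
from typing import Callable, Sequence

# A sub-identification model is modeled here as a function from features to a score in [0, 1].
SubModel = Callable[[Sequence[float]], float]


def identify_call(behavior_features: Sequence[float],
                  voice_features: Sequence[float],
                  first_sub_model: SubModel,
                  second_sub_model: SubModel,
                  first_threshold: float = 0.5,
                  second_threshold: float = 0.5) -> bool:
    """Return True if the call event is identified as malicious."""
    # Stage 1: initial identification from the second feature parameter (call behavior).
    initial_result = first_sub_model(behavior_features)

    # Assumed first preset condition: the initial score reaches a threshold.
    if initial_result < first_threshold:
        return False  # assumption: calls that fail the condition are treated as not malicious

    # Stage 2: identification from the first feature parameter (call voice).
    # The claim also allows passing all feature parameters to the second sub-model.
    return second_sub_model(voice_features) >= second_threshold
```

Running the inexpensive behavior model first means the voice model, which needs transcription or waveform analysis, is consulted only for calls that already look suspicious.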
  22. A device for establishing a malicious model, the device comprising a second processor and a second external communication interface, wherein:
    the second processor is configured to: determine the sample type of a sample, the sample type including positive samples and negative samples, a positive sample being a sample that belongs to malicious calls and a negative sample being a sample that does not belong to malicious calls; acquire feature parameters of the sample, the feature parameters including parameters for describing call voice features, wherein the parameters describing call voice features include at least one of: a waveform feature parameter of the call voice, the number of first keywords in the text corresponding to the call voice, and the probability of the first keywords in that text; obtain, according to the feature parameters of the sample and a configured training model, a training result output by the training model, the training model taking the feature parameters as classification parameters; judge whether the training result matches the sample type of the sample; if the training result does not match the sample type of the sample, adjust the model parameters of the training model until the training result matches the sample type of the sample; and output, through the second external communication interface, the training model whose training result matches the sample type of the sample as the preset identification model.
  23. A computer storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the method for identifying a malicious call according to any one of claims 1 to 15 or the method for establishing an identification model according to claim 16.
PCT/CN2017/074169 2016-04-28 2017-02-20 Method, apparatus and device for identifying malicious call and establishing identification model WO2017185862A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610278825.9A CN107343077B (en) 2016-04-28 2016-04-28 Method, device and equipment for identifying malicious phone and establishing identification model
CN201610278825.9 2016-04-28

Publications (1)

Publication Number Publication Date
WO2017185862A1 (en) 2017-11-02

Family

ID=60160705

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/074169 WO2017185862A1 (en) 2016-04-28 2017-02-20 Method, apparatus and device for identifying malicious call and establishing identification model

Country Status (2)

Country Link
CN (1) CN107343077B (en)
WO (1) WO2017185862A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110751553A (en) * 2019-10-24 2020-02-04 深圳前海微众银行股份有限公司 Identification method and device of potential risk object, terminal equipment and storage medium
CN111124698A (en) * 2018-10-30 2020-05-08 北京奇虎科技有限公司 Communication event identification method and device, electronic equipment and readable storage medium
CN111786802A (en) * 2019-04-03 2020-10-16 北京嘀嘀无限科技发展有限公司 Event detection method and device
CN111901554A (en) * 2020-07-27 2020-11-06 中国平安人寿保险股份有限公司 Call channel construction method and device based on semantic clustering and computer equipment
CN112035548A (en) * 2020-08-31 2020-12-04 北京嘀嘀无限科技发展有限公司 Identification model acquisition method, identification method, device, equipment and medium
CN113315876A (en) * 2021-05-27 2021-08-27 中国银行股份有限公司 Telephone bank service control method, device, server and storage medium
CN114125155A (en) * 2021-11-15 2022-03-01 天津市国瑞数码安全系统股份有限公司 Crank call detection method and system based on big data analysis

Families Citing this family (16)

Publication number Priority date Publication date Assignee Title
CN107995370B (en) * 2017-12-21 2020-11-24 Oppo广东移动通信有限公司 Call control method, device, storage medium and mobile terminal
CN107846493B (en) * 2017-12-21 2019-10-25 Oppo广东移动通信有限公司 Call contact person control method, device and storage medium and mobile terminal
CN108091324B (en) * 2017-12-22 2021-08-17 北京百度网讯科技有限公司 Tone recognition method and device, electronic equipment and computer-readable storage medium
CN110351731A (en) * 2018-04-08 2019-10-18 中兴通讯股份有限公司 A kind of method and device of phone number antifraud
CN110401780B (en) * 2018-04-25 2021-05-11 中国移动通信集团广东有限公司 Method and device for recognizing fraud calls
CN108833720B (en) * 2018-05-04 2021-11-30 北京邮电大学 Fraud telephone number identification method and system
CN109599116B (en) * 2018-10-08 2022-11-04 中国平安财产保险股份有限公司 Method and device for supervising insurance claims based on voice recognition and computer equipment
CN109688275A (en) * 2018-12-27 2019-04-26 中国联合网络通信集团有限公司 Harassing call recognition methods, device and storage medium
CN111343330A (en) * 2019-03-29 2020-06-26 阿里巴巴集团控股有限公司 Smart phone
CN110113473A (en) * 2019-05-10 2019-08-09 南京硅基智能科技有限公司 A kind of method and system of the intelligent filtering incoming call based on cloud virtual mobile phone
CN110177179B (en) * 2019-05-16 2020-12-29 国家计算机网络与信息安全管理中心 Fraud number identification method based on graph embedding
CN110519466A (en) * 2019-08-30 2019-11-29 北京泰迪熊移动科技有限公司 A kind of express delivery number identification method, equipment and computer storage medium
CN110602326B (en) * 2019-09-19 2021-06-04 中国联合网络通信集团有限公司 Suspicious incoming call identification method and suspicious incoming call identification system
CN111800546B (en) * 2020-07-07 2021-12-17 中国工商银行股份有限公司 Method, device and system for constructing recognition model and recognizing and electronic equipment
CN113792140A (en) * 2021-08-12 2021-12-14 南京星云数字技术有限公司 Text processing method and device and computer readable storage medium
CN114567697A (en) * 2022-03-01 2022-05-31 恒安嘉新(北京)科技股份公司 Abnormal telephone identification method, device, equipment and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
US20080300877A1 (en) * 2007-05-29 2008-12-04 At&T Corp. System and method for tracking fraudulent electronic transactions using voiceprints
US20100305960A1 (en) * 2005-04-21 2010-12-02 Victrio Method and system for enrolling a voiceprint in a fraudster database
CN103179122A (en) * 2013-03-22 2013-06-26 马博 Telcom phone phishing-resistant method and system based on discrimination and identification content analysis
CN104065836A (en) * 2014-05-30 2014-09-24 小米科技有限责任公司 Method and device for monitoring calls

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN104735671B (en) * 2015-02-27 2018-11-09 腾讯科技(深圳)有限公司 A kind of method and apparatus of identification malicious call

Cited By (10)

Publication number Priority date Publication date Assignee Title
CN111124698A (en) * 2018-10-30 2020-05-08 北京奇虎科技有限公司 Communication event identification method and device, electronic equipment and readable storage medium
CN111786802A (en) * 2019-04-03 2020-10-16 北京嘀嘀无限科技发展有限公司 Event detection method and device
CN111786802B (en) * 2019-04-03 2023-07-04 北京嘀嘀无限科技发展有限公司 Event detection method and device
CN110751553A (en) * 2019-10-24 2020-02-04 深圳前海微众银行股份有限公司 Identification method and device of potential risk object, terminal equipment and storage medium
CN111901554A (en) * 2020-07-27 2020-11-06 中国平安人寿保险股份有限公司 Call channel construction method and device based on semantic clustering and computer equipment
CN111901554B (en) * 2020-07-27 2022-11-11 中国平安人寿保险股份有限公司 Call channel construction method and device based on semantic clustering and computer equipment
CN112035548A (en) * 2020-08-31 2020-12-04 北京嘀嘀无限科技发展有限公司 Identification model acquisition method, identification method, device, equipment and medium
CN113315876A (en) * 2021-05-27 2021-08-27 中国银行股份有限公司 Telephone bank service control method, device, server and storage medium
CN113315876B (en) * 2021-05-27 2023-01-20 中国银行股份有限公司 Telephone bank service control method, device, server and storage medium
CN114125155A (en) * 2021-11-15 2022-03-01 天津市国瑞数码安全系统股份有限公司 Crank call detection method and system based on big data analysis

Also Published As

Publication number Publication date
CN107343077B (en) 2019-12-13
CN107343077A (en) 2017-11-10

Similar Documents

Publication Publication Date Title
WO2017185862A1 (en) Method, apparatus and device for identifying malicious call and establishing identification model
US10447838B2 (en) Telephone fraud management system and method
CN110891124B (en) System for artificial intelligence pick-up call
EP4027631A2 (en) Collaborative phone reputation system
WO2016000636A1 (en) Communications processing method and system
US20150181039A1 (en) Escalation detection and monitoring
US8774369B2 (en) Method and system to provide priority indicating calls
WO2020124453A1 (en) Automatic information reply method and related apparatus
CN105960674A (en) Information processing device
WO2017166464A1 (en) Information interaction method and terminal
US8953471B2 (en) Counteracting spam in voice over internet protocol telephony systems
WO2021184837A1 (en) Fraudulent call identification method and device, storage medium, and terminal
US9906643B2 (en) System, apparatus and method of providing phone call route information
WO2018210131A1 (en) Invitation behavior prediction method and apparatus, and storage medium
WO2017181615A1 (en) Method and device for processing unfamiliar incoming call, and mobile terminal
US9002333B1 (en) Mobile device reputation system
CN106156362A (en) A kind of method and device automatically providing solution for early warning
KR102638566B1 (en) Control of incoming calls based on call settings
US20190035392A1 (en) Real-Time Human Data Collection Using Voice and Messaging Side Channel
CN109831417A (en) Method, apparatus, server and the storage medium of anti-harassment processing account number
CN106161715B (en) A kind of communication control method, network server and electronic equipment
CN113055523A (en) Crank call interception method and device, electronic equipment and storage medium
CN114157763A (en) Information processing method and device in interactive process, terminal and storage medium
WO2015007156A1 (en) Information processing method, apparatus, and system
EP2693429A1 (en) System and method for analyzing voice communications

Legal Events

Date Code Title Description
NENP Non-entry into the national phase (Ref country code: DE)
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17788518; Country of ref document: EP; Kind code of ref document: A1)
122 EP: PCT application non-entry in European phase (Ref document number: 17788518; Country of ref document: EP; Kind code of ref document: A1)