CN115103063A - Internet fraud identification method and device for counterfeit customer service class - Google Patents

Info

Publication number
CN115103063A
Authority
CN
China
Prior art keywords
fraud
behavior
internet
users
call
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210631515.6A
Other languages
Chinese (zh)
Inventor
钟盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Sendi Computer System Co ltd
Original Assignee
Guangzhou Sendi Computer System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Sendi Computer System Co ltd
Priority to CN202210631515.6A
Publication of CN115103063A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Arrangements for supervision, monitoring or testing
    • H04M3/2281 Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/12 Messaging; Mailboxes; Announcements
    • H04W4/14 Short messaging services, e.g. short message services [SMS] or unstructured supplementary service data [USSD]

Abstract

The invention discloses an internet fraud identification method and device for counterfeit customer service fraud. The method comprises: selecting a target set of fraud-related scenarios, and acquiring fraud number samples in the target set together with the sample information corresponding to those samples; mining the DPI data features corresponding to the internet behavior features according to the log information, the first behavior portrait and the second behavior portrait; combining the internet behavior features and the DPI data features with telephone call behavior features and short message sending and receiving behavior features to construct the feature engineering for fraudulent users and deceived users; after the feature engineering is established, training on the acquired fraud number samples with the LightGBM framework, which uses a decision-tree-based learning algorithm, to obtain an identification model; and performing fraud identification on a number to be identified according to the identification model to determine whether it is a fraud telephone number. The invention achieves high identification accuracy and can be widely applied in the technical field of the Internet.

Description

Internet fraud identification method and device for counterfeit customer service classes
Technical Field
The invention relates to the technical field of the Internet, and in particular to an internet fraud identification method and device for counterfeit customer service fraud.
Background
With the development of the mobile internet, more and more people own smartphones, and going online with a smartphone has become an indispensable part of daily life. However, technology also brings convenience and opportunities to fraudsters. According to statistics, the proportion of fraud cases handled by public security authorities that are carried out by internet means has increased year by year in recent years, and traditional telephone fraud is gradually turning into combined "telephone + internet" fraud, which is very hard to guard against.
Statistically, the largest share of this combined internet fraud is counterfeit customer service fraud, in which the fraudster uses a telephone call as the point of contact to establish an initial connection with the victim and then uses internet means, combined with various fraud scripts, to induce the victim to finally transfer money. Such cases account for a large share of fraud and are highly harmful: the amounts involved generally range from tens of thousands to hundreds of thousands, so this fraud needs to be discovered in a targeted manner and countered by technical means.
The prior art does not provide an effective, targeted solution for discovering the numbers involved in counterfeit customer service combined internet fraud.
Disclosure of Invention
In view of this, embodiments of the present invention provide a high-accuracy internet fraud identification method and device for counterfeit customer service fraud.
One aspect of the embodiments of the present invention provides an internet fraud identification method for counterfeit customer service fraud, including:
selecting a target set of fraud-related scenarios, and acquiring fraud number samples in the target set and the sample information corresponding to the fraud number samples; wherein the samples comprise internet behavior features, log information of the fraud numbers, a first behavior portrait of fraudulent users, and a second behavior portrait of deceived users;
mining the DPI data features corresponding to the internet behavior features according to the log information, the first behavior portrait and the second behavior portrait;
combining the internet behavior features and the DPI data features with telephone call behavior features and short message sending and receiving behavior features to construct the feature engineering for fraudulent users and deceived users;
after the feature engineering is established, training on the acquired fraud number samples with the LightGBM framework, which uses a decision-tree-based learning algorithm, to obtain an identification model;
and performing fraud identification on a number to be identified according to the identification model, to determine whether the number to be identified is a fraud telephone number.
Optionally, the internet behavior features comprise at least one of: downloading an online loan APP, downloading an online meeting APP, entering an online meeting, opening online loan software, transferring money online, adding friends, or joining a target group, wherein the target group comprises a financing group, an investment group or an order-brushing group.
Optionally, the telephone call behavior features comprise at least one of: telephone call behavior features of the deceived user, or call behavior features of the fraudster.
Optionally, the telephone call behavior features of the deceived user include:
the deceived user has a fraud-related call before performing the internet behavior, and the calling behavior of the calling number in the fraud-related call is abnormal;
the abnormality of the calling behavior comprises at least one of: the proportion of called numbers from other provinces exceeds a first threshold; the proportion of outgoing calls exceeds a second threshold; the number of active days in the previous 10 days is below a third threshold; the number of cities to which the called numbers belong is above a fourth threshold; the dispersion of the called numbers is above a fifth threshold; the number of calls is above a sixth threshold; the call intensity is above a seventh threshold; multiple numbers under the same base station place calls at the same time; the terminal used has an IMEI (International Mobile Equipment Identity) known to be used by fraudsters; or special service numbers are dialed.
Optionally, the call behavior features of the fraudster comprise:
establishing contact with the victim through mass telephone calling;
defrauding the victim in combination with the internet behavior;
performing the telephone call behavior before performing the internet behavior;
the number that carries out the internet fraud being at the same base station location as the number that carries out the telephone fraud.
Optionally, the short message sending and receiving behavior features include at least one of: short message sending and receiving behavior features of the deceived user, or short message sending and receiving behavior features of the fraudulent user.
Optionally, the short message sending and receiving behavior features of the deceived user include: receiving a short message verification code during the fraud, and sending the verification code to the fraudster by short message;
the short message sending and receiving behavior features of the fraudulent user include: receiving the short message verification code sent by the victim.
In another aspect, an embodiment of the present invention further provides an internet fraud identification device for counterfeit customer service fraud, including:
a first module, used for selecting a target set of fraud-related scenarios and acquiring fraud number samples in the target set and the sample information corresponding to the fraud number samples; wherein the samples comprise internet behavior features, log information of the fraud numbers, a first behavior portrait of fraudulent users, and a second behavior portrait of deceived users;
a second module, used for mining the DPI data features corresponding to the internet behavior features according to the log information, the first behavior portrait and the second behavior portrait;
a third module, used for combining the internet behavior features and the DPI data features with telephone call behavior features and short message sending and receiving behavior features to construct the feature engineering for fraudulent users and deceived users;
a fourth module, used for training, after the feature engineering is established, on the acquired fraud number samples with the LightGBM framework, which uses a decision-tree-based learning algorithm, to obtain an identification model;
and a fifth module, used for performing fraud identification on a number to be identified according to the identification model and determining whether the number to be identified is a fraud telephone number.
Another aspect of the embodiments of the present invention further provides an electronic device, including a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method as described above.
Another aspect of the embodiments of the present invention also provides a computer-readable storage medium, which stores a program, and the program is executed by a processor to implement the method as described above.
The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and the computer instructions executed by the processor cause the computer device to perform the foregoing method.
In the embodiments of the present invention, a target set of fraud-related scenarios is selected, and fraud number samples in the target set and the sample information corresponding to them are acquired, wherein the samples comprise internet behavior features, log information of the fraud numbers, a first behavior portrait of fraudulent users, and a second behavior portrait of deceived users; the DPI data features corresponding to the internet behavior features are mined according to the log information, the first behavior portrait and the second behavior portrait; the internet behavior features and the DPI data features are combined with telephone call behavior features and short message sending and receiving behavior features to construct the feature engineering for fraudulent users and deceived users; after the feature engineering is established, the acquired fraud number samples are used for training with the LightGBM framework, which uses a decision-tree-based learning algorithm, to obtain an identification model; and fraud identification is performed on a number to be identified according to the identification model to determine whether it is a fraud telephone number. The method and the device thereby improve the accuracy of fraud number identification.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of the overall steps provided by an embodiment of the present invention;
FIG. 2 is a flowchart of machine learning modeling according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
To solve the problems in the prior art, one aspect of the embodiments of the present invention provides an internet fraud identification method for counterfeit customer service fraud. As shown in FIG. 1, the overall method includes the following steps:
selecting a target set of fraud-related scenarios, and acquiring fraud number samples in the target set and the sample information corresponding to the fraud number samples; wherein the samples comprise internet behavior features, log information of the fraud numbers, a first behavior portrait of fraudulent users, and a second behavior portrait of deceived users;
mining the DPI data features corresponding to the internet behavior features according to the log information, the first behavior portrait and the second behavior portrait;
combining the internet behavior features and the DPI data features with telephone call behavior features and short message sending and receiving behavior features to construct the feature engineering for fraudulent users and deceived users;
after the feature engineering is established, training on the acquired fraud number samples with the LightGBM framework, which uses a decision-tree-based learning algorithm, to obtain an identification model;
and performing fraud identification on a number to be identified according to the identification model, to determine whether the number to be identified is a fraud telephone number.
The invention is described in detail below with reference to the accompanying drawings.
By combining the DPI data of internet logs, signaling data (including telephone and short message signaling records) and the behavior portrait features of fraudulent users drawn from fraud scripts, the invention can effectively discover counterfeit customer service combined internet fraud in the network, discover and locate fraud numbers, and provide leads that help combat telecommunication network fraud at its source.
First, the embodiment of the invention collects, from a relevant database, samples of numbers that commit fraud by internet means, and selects the fraud-related scenario types with the largest shares, as shown in Table 1 below; most of them belong to counterfeit customer service fraud.
TABLE 1
[Table image BDA0003680149910000061; contents not available in this text]
Accordingly, the embodiment collects relevant data for the fraud samples, including DPI data and signaling data (including telephone and short message signaling records), and, with reference to the various fraud scripts currently in circulation, compiles behavior portraits of typical fraudulent users and deceived users, as shown in Table 2:
TABLE 2
[Table image BDA0003680149910000071; contents not available in this text]
Based on the above analysis, the embodiment of the present invention summarizes three main types of fraud scenarios according to the differences in fraud means, as shown in Table 3:
TABLE 3
[Table images BDA0003680149910000072 and BDA0003680149910000081; contents not available in this text]
According to the summarized user behavior portrait features, the embodiment of the invention mines in depth the DPI data features corresponding to the internet behavior features, based on the fraud number samples and the DPI data of internet logs, as shown in Table 4:
TABLE 4
[Table image BDA0003680149910000082; contents not available in this text]
According to the internet behavior and DPI features, the embodiment of the invention additionally analyzes the telephone call and short message features in the signaling data to construct the feature engineering for fraudulent users and deceived users, which comprises the following multidimensional features:
1. Internet behavior features, which include the following broad categories:
1-1, downloading an online loan APP, a behavior feature of deceived users, specifically including downloading the Jingdong Finance APP, the Du Xiaoman Finance APP, the 360 Jietiao APP and the like.
1-2, downloading an online meeting APP, a behavior feature of deceived users, specifically including downloading the Tencent Meeting APP, the enterprise WeChat APP, the Feishu APP, the Jingdong video APP and the like.
1-3, entering an online meeting, a behavior feature of both fraudulent users and deceived users, specifically including opening the Tencent Meeting APP, the enterprise WeChat APP, the Feishu APP, the Jingdong video APP and the like.
1-4, opening online loan software, a behavior feature of deceived users, specifically including opening a payment APP, Jingdong Baitiao, Jingdong Jintiao, Jingdong Finance and the like.
1-5, transferring money online, a behavior feature of deceived users, specifically including transfers through WeChat, Alipay and bank software.
1-6, adding friends, a behavior feature of both fraudulent users and deceived users, specifically including adding WeChat friends and adding QQ friends.
1-7, joining financing/investment/order-brushing groups, a behavior feature of both fraudulent users and deceived users, specifically including joining WeChat groups and QQ groups.
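As an illustration of how categories 1-1 to 1-7 might become model inputs, the following minimal sketch turns DPI log records into binary per-number features. The record layout `(number, app, action)`, the rule keys such as `loan_app`, and the feature names are all assumptions made for this example; the actual DPI signatures are those of Table 4, which is not reproduced in this text.

```python
from collections import defaultdict

# Hypothetical mapping from (app type, action) pairs in DPI logs to the
# behavior categories 1-1 .. 1-7. Real DPI signatures are not given here.
BEHAVIOR_RULES = {
    ("loan_app", "download"): "download_loan_app",        # 1-1
    ("meeting_app", "download"): "download_meeting_app",  # 1-2
    ("meeting_app", "open"): "enter_online_meeting",      # 1-3
    ("loan_app", "open"): "open_loan_app",                # 1-4
    ("payment_app", "transfer"): "online_transfer",       # 1-5
    ("im_app", "add_friend"): "add_friend",               # 1-6
    ("im_app", "join_group"): "join_target_group",        # 1-7
}
FEATURE_NAMES = sorted(set(BEHAVIOR_RULES.values()))


def internet_behavior_features(dpi_records):
    """Return {number: {feature_name: 0 or 1}} from (number, app, action) records."""
    features = defaultdict(lambda: dict.fromkeys(FEATURE_NAMES, 0))
    for number, app, action in dpi_records:
        name = BEHAVIOR_RULES.get((app, action))
        if name is not None:  # records that match no rule are ignored
            features[number][name] = 1
    return dict(features)


records = [
    ("13800000001", "meeting_app", "download"),
    ("13800000001", "meeting_app", "open"),
    ("13800000001", "payment_app", "transfer"),
    ("13800000002", "im_app", "add_friend"),
    ("13800000002", "video_app", "open"),  # matches no rule
]
feats = internet_behavior_features(records)
print(feats["13800000001"])
```

Binary flags are only one possible encoding; counts or first-occurrence timestamps would serve equally well if the ordering of behaviors matters.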
2. Telephone call behavior features, which include the following categories:
2-1, behavior features of deceived users: in most cases, after receiving the fraud call, the deceived user follows the fraudster's instructions and joins an online meeting, takes out an online loan, or performs other online operations. Therefore, the telephone call behavior of the deceived user before the internet behavior is analyzed, mainly on the following points:
2-1-1, the deceived user has a fraud-related call before the internet behavior;
2-1-2, the calling behavior of the calling number in that call is abnormal, which specifically includes the following fraud-related features:
2-1-2-1, the proportion of called numbers from other provinces (number of called numbers from other provinces / total number of called numbers) is high;
2-1-2-2, the proportion of outgoing calls (number of outgoing calls / total number of outgoing and incoming calls) is high;
2-1-2-3, the number has few active days in the previous 10 days (the number of days in the previous 10 days with call, data traffic or short message usage records);
2-1-2-4, the number of places to which the called numbers belong is high;
2-1-2-5, the dispersion of the called numbers is high;
2-1-2-6, the number of calls is usually high;
2-1-2-7, the call intensity (number of calls per hour) is high;
2-1-2-8, multiple numbers under the same base station place calls at the same time (a feature of group fraud);
2-1-2-9, the terminal used has an IMEI known to be used by fraudsters;
2-1-2-10, special service numbers are often dialed.
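Several of the features in 2-1-2 are simple ratios over call detail records, so a short sketch may make the definitions concrete. The record keys (`caller`, `callee`, `callee_province`, `callee_city`, `hour`) are hypothetical, and two definitions are assumptions because the text does not specify them: dispersion is taken as distinct called numbers divided by outgoing calls, and call intensity as outgoing calls divided by the number of hours in which calls were placed. The thresholds in `anomaly_flags` are likewise placeholders for the unspecified thresholds.

```python
def call_features(number, home_province, cdrs, active_days_last_10):
    """Compute call-behavior features for `number` from call detail records.

    cdrs: list of dicts with hypothetical keys caller, callee,
    callee_province, callee_city and hour (an integer hour bucket).
    """
    outgoing = [c for c in cdrs if c["caller"] == number]
    incoming = [c for c in cdrs if c["callee"] == number]
    total = len(outgoing) + len(incoming)
    callees = {c["callee"] for c in outgoing}
    other_province = {c["callee"] for c in outgoing
                      if c["callee_province"] != home_province}
    hours = {c["hour"] for c in outgoing}
    return {
        # 2-1-2-1: other-province called numbers / all called numbers
        "out_of_province_ratio": len(other_province) / len(callees) if callees else 0.0,
        # 2-1-2-2: outgoing calls / (outgoing + incoming calls)
        "calling_ratio": len(outgoing) / total if total else 0.0,
        # 2-1-2-3: supplied by the caller of this function
        "active_days_last_10": active_days_last_10,
        # 2-1-2-4: distinct home cities of called numbers
        "callee_city_count": len({c["callee_city"] for c in outgoing}),
        # 2-1-2-5: assumed definition of dispersion
        "callee_dispersion": len(callees) / len(outgoing) if outgoing else 0.0,
        # 2-1-2-6 and 2-1-2-7
        "call_count": len(outgoing),
        "calls_per_hour": len(outgoing) / len(hours) if hours else 0.0,
    }


def anomaly_flags(f, ratio_threshold=0.5, min_active_days=3):
    """Compare features with hypothetical thresholds (not from the source)."""
    return {
        "high_out_of_province": f["out_of_province_ratio"] > ratio_threshold,
        "high_calling_ratio": f["calling_ratio"] > ratio_threshold,
        "low_activity": f["active_days_last_10"] < min_active_days,
    }


cdrs = [
    {"caller": "A", "callee": "B", "callee_province": "HN", "callee_city": "Changsha", "hour": 0},
    {"caller": "A", "callee": "C", "callee_province": "GD", "callee_city": "Guangzhou", "hour": 0},
    {"caller": "A", "callee": "D", "callee_province": "SC", "callee_city": "Chengdu", "hour": 1},
    {"caller": "A", "callee": "B", "callee_province": "HN", "callee_city": "Changsha", "hour": 1},
    {"caller": "E", "callee": "A", "callee_province": "GD", "callee_city": "Guangzhou", "hour": 2},
]
f = call_features("A", "GD", cdrs, active_days_last_10=2)
flags = anomaly_flags(f)
print(f, flags)
```

In a tree-based model the raw feature values would normally be passed in directly; the explicit flags are shown only to mirror the threshold wording of the claims.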
For example, regarding feature 2-1-2-10 (dialing special service numbers), the following four typical, currently known fraud patterns involve dialing special numbers such as 10086 and 114:
① Fraud scenario of dialing 114:
Lawbreakers exploit people's trust in the 114 directory service: they register the telephone number of a counterfeit financial institution with 114, fabricate the appearance of a bank remittance, and commit fraud.
In such cases, the lawbreaker typically registers a mobile phone number with 114 in the name of a financial institution that has no branch in the area, exploiting the fact that mobile numbers are not easily distinguished from fixed-line numbers. The victim is then sent counterfeit bank remittance documents and asked to supply goods. When the victim calls the fake financial institution's number obtained from the 114 directory enquiry service, the lawbreaker poses as a member of the institution's staff, confirms the fake remittance information, creates the false impression that the money has already been remitted, and induces the victim to comply.
② Fraud scenario of dialing 10086 or 10010:
Fraudsters typically commit fraud using SIM cards from the three operators, and there are two scenarios:
1) Before committing fraud, the fraudster tests the card or checks the call-charge balance. At this point the fraudster may not know which operator the SIM card belongs to (for example, it may actually be a China Telecom card), but because China Mobile has the largest market share, the fraudster tests the card or checks the balance by dialing 10086. The 10086 automated customer service then announces that "you are calling from a non-China Mobile number …", after which the fraudster dials the China Unicom customer service number 10010 …
2) If the number is intercepted or suspended by the operator's anti-fraud platform during the fraud, the fraudster may dial 10086 or 10010 to complain.
③ Fraud scenario of dialing 1008611:
As above, a fraudster may check the call-charge balance by dialing 1008611.
④ Fraud scenario of dialing a bank short number:
At present, most fraud obtains the victim's money through bank transfers or remittances, and after the fraud succeeds, the fraudster may dial a bank short number to confirm whether the money has arrived.
In addition, based on a study of historical fraud number samples and methods, the special service number library used by the model also includes the service numbers of institutions such as WeChat, Alipay and Jingdong Finance. Because fraud methods change, the library is updated dynamically and is not listed exhaustively here.
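A lookup against a special service number library could be sketched as follows. The library below contains only the four numbers named in the text; the bank short numbers and other institution numbers it is said to include are not listed in the source, so they are left out. The feature names and the `tried_multiple_operators` flag, which reflects the 10086-then-10010 card-testing pattern, are illustrative assumptions.

```python
# Only the numbers explicitly named in the text; a real library would be
# larger and updated dynamically.
SPECIAL_SERVICE_NUMBERS = {"114", "10086", "10010", "1008611"}


def special_service_features(dialed_numbers, library=SPECIAL_SERVICE_NUMBERS):
    """Summarize how often a number dialed entries of the special service library."""
    hits = [n for n in dialed_numbers if n in library]
    return {
        "special_service_calls": len(hits),
        "distinct_special_services": len(set(hits)),
        # Dialing both operator hotlines is consistent with card testing
        # when the fraudster does not know the card's operator.
        "tried_multiple_operators": {"10086", "10010"} <= set(hits),
    }


dialed = ["10086", "10010", "13900000000", "1008611", "10086"]
special = special_service_features(dialed)
print(special)
```

Keeping the library as a plain set passed in as a parameter makes it easy to swap in an updated list without touching the feature code.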
2-2, fraudster behavior characteristics: according to the above analysis, the fraudster usually contacts the victim through mass telephone calling and then carries out further fraud using internet means such as online meetings. The fraudster's telephone call behavior therefore shows relevant characteristics before the internet-based fraud is carried out. The number carrying out the internet fraud and the number carrying out the telephone fraud are not necessarily the same number, but they may be at the same location (base station):
2-2-1, the fraudster has a connected call before the above-mentioned internet behavior features occur;
2-2-2, the number carrying out the internet fraud and the number carrying out the telephone fraud are not necessarily the same number but may be at the same location, so other numbers under the same base station that show abnormal calling behavior at around the same time are analyzed as well:
2-2-2-1, the ratio of called out-of-province numbers is high (number of out-of-province numbers among the called numbers / total number of called numbers);
2-2-2-2, the outgoing-call ratio is high (number of outgoing calls / total number of outgoing and incoming calls);
2-2-2-3, the number of active days in the preceding 10 days is low (number of days in the preceding 10 days with call, data traffic or short message usage records);
2-2-2-4, the number of home locations of the called numbers is high;
2-2-2-5, the dispersion of the called numbers is high;
2-2-2-6, the number of calls is often large;
2-2-2-7, the call intensity is high (number of calls per hour);
2-2-2-8, special service numbers are often dialed.
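The call behavior features listed above can be sketched in code roughly as follows. This is only an illustrative sketch: the `CallRecord` fields, the function name `call_features`, the placeholder numbers, and the definition of dispersion as distinct called numbers divided by outgoing calls are all assumptions, since the description does not fix a record format or an exact formula. The active-days feature is left out because it also needs data traffic and short message records.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical call detail record; the field names are assumptions for illustration.
CallRecord = namedtuple("CallRecord", "caller callee start callee_province callee_city")


def call_features(number, records, home_province, special_numbers):
    """Compute call behavior features for `number` from a list of CallRecord."""
    outgoing = [r for r in records if r.caller == number]
    incoming = [r for r in records if r.callee == number]
    n_out = len(outgoing)
    total = n_out + len(incoming)
    # Province of each distinct called number.
    province_of = {r.callee: r.callee_province for r in outgoing}
    n_called = len(province_of)
    # Distinct clock hours in which at least one outgoing call started.
    hours = {r.start.replace(minute=0, second=0, microsecond=0) for r in outgoing}
    out_of_province = sum(1 for p in province_of.values() if p != home_province)
    return {
        "out_of_province_ratio": out_of_province / n_called if n_called else 0.0,
        "outgoing_ratio": n_out / total if total else 0.0,
        "callee_city_count": len({r.callee_city for r in outgoing}),
        # Assumed definition: distinct called numbers per outgoing call.
        "callee_dispersion": n_called / n_out if n_out else 0.0,
        "call_count": n_out,
        "calls_per_active_hour": n_out / len(hours) if hours else 0.0,
        "dialed_special_number": any(c in special_numbers for c in province_of),
    }


# Placeholder numbers and locations, not real data.
records = [
    CallRecord("A", "B", datetime(2022, 6, 1, 9, 5), "Hunan", "Changsha"),
    CallRecord("A", "C", datetime(2022, 6, 1, 9, 20), "Sichuan", "Chengdu"),
    CallRecord("A", "D", datetime(2022, 6, 1, 10, 2), "Guangdong", "Guangzhou"),
    CallRecord("A", "SVC001", datetime(2022, 6, 1, 10, 30), "Zhejiang", "Hangzhou"),
    CallRecord("E", "A", datetime(2022, 6, 1, 11, 0), "Guangdong", "Shenzhen"),
]
features = call_features("A", records, home_province="Guangdong",
                         special_numbers={"SVC001"})
print(features)
```

In practice each value would be compared against a threshold or passed directly to the model as a feature column.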
3. The short message receiving and sending behavior characteristics comprise:
3-1, behavior characteristics of deceived users:
3-1-1, in some cases the deceived user receives a short message verification code during the fraud;
3-1-2, the deceived user sends the verification code to the fraudster's mobile phone by short message.
3-2, fraud user behavior characteristics:
3-2-1, the fraud user receives the short message verification code sent by the victim.
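The short message pattern above (a verification code is received and then relayed to another number) might be detected along the following lines. This is a sketch under stated assumptions: the `SmsRecord` fields, the `is_code` flag marking a verification-code message, and the 5-minute window are hypothetical, as the description does not say how such messages are recognised or how close in time the two events must be.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical short message metadata record. `is_code` is an assumed flag
# marking a verification-code message.
SmsRecord = namedtuple("SmsRecord", "sender receiver time is_code")


def find_code_relays(records, window=timedelta(minutes=5)):
    """Return (candidate victim, candidate fraud number) pairs where a number
    receives a verification code and then sends a short message within `window`."""
    relays = []
    for code in (r for r in records if r.is_code):
        for r in records:
            elapsed = r.time - code.time
            if (r.sender == code.receiver
                    and not r.is_code
                    and timedelta(0) <= elapsed <= window):
                relays.append((r.sender, r.receiver))
    return relays


# Placeholder identifiers, not real numbers.
sms_records = [
    SmsRecord("BANK", "V1", datetime(2022, 6, 1, 14, 0), True),
    SmsRecord("V1", "F1", datetime(2022, 6, 1, 14, 2), False),
    SmsRecord("V2", "X", datetime(2022, 6, 1, 15, 0), False),
]
relays = find_code_relays(sms_records)
print(relays)
```

A pair found this way is only a candidate signal; it would be combined with the call and internet behavior features rather than used alone.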
4. After the feature engineering is initially established, the embodiment of the invention selects an algorithm for training and operation. LightGBM is a gradient boosting framework that uses decision-tree-based learning algorithms. Its advantages include higher training efficiency, low memory usage, higher accuracy, support for parallel learning, and the ability to process large-scale data.
In this embodiment, modeling is performed with a machine learning algorithm on massive signaling data, taking multidimensional data features into account, and a large number of historically captured fraud numbers are used as training samples; a schematic diagram is shown in fig. 2. The model selected in the embodiment of the invention is LightGBM, a gradient boosting framework using a decision-tree-based learning algorithm. For sample selection, the historically accumulated fraud numbers are used as samples for training and validation. For feature selection, the fraud features of three dimensions of data are considered together: internet behavior, phone call behavior and short message behavior.
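To illustrate the general idea of gradient boosting over decision trees, the following toy sketch trains depth-1 trees (stumps) on the gradient of the log loss. It is not LightGBM and does not reproduce the embodiment's model: LightGBM itself uses histogram-based split finding and leaf-wise tree growth, and the two features, the sample values and the hyperparameters here are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the single split that minimises squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            err = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, thr, lmean, rmean)
    if best is None:
        raise ValueError("no feature has two distinct values to split on")
    return best[1:]  # (feature index, threshold, left value, right value)


def stump_predict(stump, row):
    j, thr, lval, rval = stump
    return lval if row[j] <= thr else rval


def train_boosted_stumps(X, y, n_rounds=20, learning_rate=0.5):
    """Fit stumps one after another to the negative gradient of the log loss."""
    scores = [0.0] * len(X)
    model = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, lval, rval = fit_stump(X, residuals)
        stump = (j, thr, learning_rate * lval, learning_rate * rval)
        model.append(stump)
        scores = [s + stump_predict(stump, row) for s, row in zip(scores, X)]
    return model


def predict_proba(model, row):
    return sigmoid(sum(stump_predict(stump, row) for stump in model))


# Invented samples: [out-of-province ratio, calls per hour]; label 1 = fraud number.
X = [[0.9, 40], [0.8, 35], [0.85, 50], [0.1, 3], [0.2, 5], [0.15, 2]]
y = [1, 1, 1, 0, 0, 0]
model = train_boosted_stumps(X, y)
print(predict_proba(model, [0.9, 45]), predict_proba(model, [0.1, 4]))
```

Each round adds a small tree that corrects the errors of the trees before it, which is the mechanism a gradient boosting framework builds on at much larger scale.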
Finally, the invention obtains by training a model for discovering internet fraud numbers and victims that balances precision and recall.
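Precision and recall on a labelled validation set can be computed as in the short sketch below. The labels and predictions are invented for illustration and are not results from the embodiment.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 marks a fraud number."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented validation labels and model outputs.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Precision is the share of flagged numbers that are truly fraud numbers, and recall is the share of true fraud numbers that are flagged; raising the decision threshold typically trades recall for precision.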
In conclusion, the invention provides an effective, targeted solution for discovering counterfeit-customer-service internet fraud numbers. It can effectively and accurately discover fraud numbers and potential victim numbers, so that fraudsters can be cracked down on and arrested and victims can be warned and dissuaded early.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be understood that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention or a part thereof which substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example, as a sequential list of executable instructions that may be thought of as implementing logical functions, may be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that may fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Further, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An internet fraud identification method for counterfeit customer service classes, comprising:
selecting a target fraud-related scene set, and acquiring fraud number samples in the target fraud-related scene set and sample information corresponding to the fraud number samples; wherein the sample information comprises internet behavior features, log information of fraud numbers, a first behavior portrait of a fraud user, and a second behavior portrait of a deceived user;
mining, according to the log information, the first behavior portrait and the second behavior portrait, DPI data features corresponding to the internet behavior features;
constructing feature engineering for fraud users and deceived users according to the internet behavior features and the DPI data features, in combination with telephone call behavior features and short message transceiving behavior features;
after the feature engineering is established, training on the obtained fraud number samples with the LightGBM framework, which uses a decision-tree-based learning algorithm, to obtain an identification model;
and carrying out fraud identification on the number to be identified according to the identification model, and determining whether the number to be identified is a fraud telephone number.
2. The method of claim 1, wherein the type of the internet behavior features comprises at least one of: downloading an online loan app, downloading an online meeting app, entering an online meeting, opening online loan software, transferring money online, adding friends, or joining a target group, wherein the target group comprises a financing group, an investment group or a billing group.
3. The method of claim 1, wherein the type of the phone call behavior features comprises at least one of: phone call behavior features of a deceived user or call behavior features of a fraudster.
4. The method of claim 3, wherein the phone call behavior features of the deceived user comprise:
the deceived user has a fraud-related call before performing the internet behavior, and the calling number in the fraud-related call shows abnormal call behavior;
the abnormal call behavior comprises at least one of the following: the ratio of called out-of-province numbers exceeds a first threshold, the outgoing-call ratio exceeds a second threshold, the number of active days in the last 10 days is lower than a third threshold, the number of home cities of the called numbers is higher than a fourth threshold, the dispersion of the called numbers is higher than a fifth threshold, the number of calls is higher than a sixth threshold, the call intensity is higher than a seventh threshold, calls are placed at the same time as multiple other numbers under the same base station, the terminal used has an IMEI (International Mobile Equipment Identity) known to have been used by a fraudster, or a special service number is dialed.
5. The method of claim 3, wherein the call behavior features of the fraudster comprise:
establishing communication with the victim via mass telephone calling;
defrauding the victim in combination with the internet behavior features;
performing the phone call behavior before performing the internet behavior;
the number that conducts the internet fraud is at the same base station location as the number that conducts the telephone fraud.
6. The method of claim 1, wherein the short message transceiving behavior features comprise at least one of: short message sending and receiving behavior features of deceived users and short message sending and receiving behavior features of fraud users.
7. The method of claim 6, wherein:
the short message sending and receiving behavior features of the deceived users comprise: receiving a short message verification code during the fraud, and sending the verification code to the fraudster by short message;
the short message sending and receiving behavior features of the fraud users comprise: receiving the short message verification code sent by the victim.
8. An internet fraud recognition apparatus for counterfeit customer services, comprising:
the first module is used for selecting a target fraud-related scene set and acquiring fraud number samples in the target fraud-related scene set and sample information corresponding to the fraud number samples; wherein the sample information comprises internet behavior features, log information of fraud numbers, a first behavior portrait of a fraud user, and a second behavior portrait of a deceived user;
the second module is used for mining DPI data features corresponding to the internet behavior features according to the log information, the first behavior portrait and the second behavior portrait;
the third module is used for constructing feature engineering for fraud users and deceived users according to the internet behavior features and the DPI data features, in combination with telephone call behavior features and short message transceiving behavior features;
the fourth module is used for, after the feature engineering is established, training on the obtained fraud number samples with the LightGBM framework, which uses a decision-tree-based learning algorithm, to obtain an identification model;
and the fifth module is used for carrying out fraud identification on the number to be identified according to the identification model and determining whether the number to be identified is a fraud telephone number.
9. An electronic device comprising a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium stores a program which is executed by a processor to implement the method of any one of claims 1 to 7.
CN202210631515.6A 2022-06-06 2022-06-06 Internet fraud identification method and device for counterfeit customer service class Pending CN115103063A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210631515.6A CN115103063A (en) 2022-06-06 2022-06-06 Internet fraud identification method and device for counterfeit customer service class

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210631515.6A CN115103063A (en) 2022-06-06 2022-06-06 Internet fraud identification method and device for counterfeit customer service class

Publications (1)

Publication Number Publication Date
CN115103063A true CN115103063A (en) 2022-09-23

Family

ID=83289548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210631515.6A Pending CN115103063A (en) 2022-06-06 2022-06-06 Internet fraud identification method and device for counterfeit customer service class

Country Status (1)

Country Link
CN (1) CN115103063A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115659340A (en) * 2022-12-09 2023-01-31 支付宝(杭州)信息技术有限公司 Counterfeit applet identification method and device, storage medium and electronic equipment
CN115659340B (en) * 2022-12-09 2023-03-14 支付宝(杭州)信息技术有限公司 Counterfeit applet identification method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination