WO2022021977A1 - Method and apparatus for detecting underground-industry accounts, computer device, and medium - Google Patents

Method and apparatus for detecting underground-industry accounts, computer device, and medium

Info

Publication number
WO2022021977A1
Authority
WO
WIPO (PCT)
Prior art keywords
account
field data
data document
word
weight
Prior art date
Application number
PCT/CN2021/090947
Other languages
English (en)
Chinese (zh)
Inventor
孙家棣
马宁
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022021977A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2323 Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a method, device, computer equipment, and computer-readable storage medium for detecting a fraudulent account based on artificial intelligence.
  • the inventor realizes that business risk control currently needs to crack down on illegal activities, which includes identifying and blocking fake, illegitimate accounts.
  • the industry currently mainly uses expert rules of thumb to identify and combat fake accounts.
  • expert rules cover a narrow range of cases and are aimed at precise, targeted identification; because their logic is relatively simple, they are easy for illegal actors to recognize and bypass.
  • the purpose of this application is to provide a method, device, computer equipment and computer-readable storage medium for detecting a fraudulent account based on artificial intelligence.
  • an artificial intelligence-based black production account detection method including:
  • the account attribute data set of the account is obtained, and the mobile phone number is derived from the account database of the target subject;
  • a cluster of black production accounts is determined, and a group of black production accounts associated with the target subject is obtained.
  • an artificial intelligence-based black production account detection device including:
  • an acquisition module configured to acquire an account attribute data set of the account when it is determined that the number of accounts bound to the mobile phone number exceeds a predetermined number, and the mobile phone number is derived from the account database of the target subject;
  • a building module used to use the attribute field data in the account attribute data set as a connection edge, and use the mobile phone number as a vertex to construct an account detection graph of the target subject;
  • a clustering module configured to perform graph clustering on the accounts in the account detection diagram based on the attribute field data of the connection edges in the account detection diagram, to obtain a plurality of account clusters
  • a generating module configured to use the attribute field data of each of the account clusters to generate the first field data document of each of the account clusters, and to obtain the second field data document of the whitelist account corresponding to the target subject;
  • a calculation module configured to calculate the weight of each word in the first field data document according to the first field data document and the second field data document, the weight indicating the importance of each of the words in the first field data document relative to the second field data document;
  • a determining module configured to determine the black production account clusters based on the weight of each of the words, and obtain the black production account groups associated with the target subject.
  • a computer device including a memory and a processor, where the memory is configured to store an artificial intelligence-based black production account detection program executable by the processor, and the processor is configured to execute that program
  • when executed, the artificial intelligence-based black production account detection program performs the following processing: when it is determined that the number of accounts bound to the mobile phone number exceeds a predetermined number, the account attribute data set of the account is obtained, the mobile phone number being derived from the account database of the target subject; the attribute field data in the account attribute data set is used as a connection edge, and the mobile phone number is used as a vertex, to construct an account detection graph of the target subject; graph clustering is performed on the accounts in the account detection graph based on the attribute field data of the connection edges, to obtain a plurality of account clusters; the attribute field data of each of the account clusters is used to generate the first field data document of each of the account clusters.
  • a computer-readable storage medium storing computer-readable instructions, on which is stored an artificial intelligence-based black production account detection program
  • when the program is executed by a processor, the following processing is implemented: when it is determined that the number of accounts bound to the mobile phone number exceeds a predetermined number, the account attribute data set of the account is obtained, the mobile phone number being derived from the account database of the target subject;
  • the attribute field data in the account attribute data set is used as the connection edge, and the mobile phone number is used as the vertex, to construct the account detection graph of the target subject; graph clustering is performed on the accounts in the account detection graph based on the attribute field data of the connection edges, to obtain a plurality of account clusters; the attribute field data of each of the account clusters is used to generate the first field data document of each of the account clusters, and the second field data document of the whitelist account corresponding to the target subject is obtained;
  • the weight of each word in the first field data document is calculated according to the first field data document and the second field data document, the weight indicating the importance of each word in the first field data document relative to the second field data document; the black production account cluster is determined based on the weight of each of the words, and the black production account group associated with the target subject is obtained.
  • in the above artificial intelligence-based black production account detection method, device, computer equipment and computer-readable storage medium, first, when it is determined that the number of accounts bound to a mobile phone number derived from the account database of the target subject exceeds a predetermined number, the account attribute data set of those accounts is obtained.
  • this preliminarily screens the accounts of the target subject: accounts whose mobile phone number is bound to fewer accounts than the predetermined number are excluded, leaving the account attribute data set to be detected, which narrows the detection range and improves detection reliability.
  • then, the attribute field data in the account attribute data set is used as a connection edge, and the mobile phone number as a vertex, to construct an account detection graph of the target subject, and graph clustering is performed on the accounts in the graph based on the attribute field data of the connection edges to obtain multiple account clusters; building the graph with the mobile phone numbers of associated accounts as vertices and then clustering it on the attribute field data yields account gangs reliably. Next, the attribute field data of each account cluster is used to generate the first field data document of that cluster, and the second field data document of the whitelist account corresponding to the target subject is obtained; this makes data analysis on the data documents convenient, with the normal second field data document serving as a comparison.
  • FIG. 1 schematically shows a flow chart of a method for detecting a fraudulent account based on artificial intelligence.
  • FIG. 2 schematically shows an example diagram of an application scenario of an artificial intelligence-based black production account detection method.
  • FIG. 3 schematically shows a flow chart of a method for acquiring an account attribute data set of an account.
  • FIG. 4 schematically shows a block diagram of an artificial intelligence-based black product account detection device.
  • FIG. 5 schematically shows an example block diagram of a computer device for implementing the above-mentioned artificial intelligence-based black production account detection method.
  • FIG. 6 schematically shows a computer-readable storage medium for implementing the above-mentioned artificial intelligence-based black product account detection method.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this application will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
  • the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
  • numerous specific details are provided in order to give a thorough understanding of the embodiments of the present application.
  • those skilled in the art will appreciate that the technical solutions of the present application may be practiced without one or more of the specific details, or other methods, components, devices, steps, etc. may be employed.
  • well-known solutions have not been shown or described in detail to avoid obscuring aspects of the application.
  • the artificial intelligence-based black production account detection method can be run on a server, a server cluster or a cloud server, etc.
  • the method of the present application can also be executed on other platforms according to requirements, which is not particularly limited in this exemplary embodiment.
  • the artificial intelligence-based black production account detection method may include the following steps:
  • Step S110 when it is determined that the number of accounts bound to the mobile phone number exceeds a predetermined number, obtain the account attribute data set of the account, and the mobile phone number is derived from the account database of the target subject;
  • Step S120 using the attribute field data in the account attribute data set as a connection edge, and using the mobile phone number as a vertex to construct an account detection graph of the target subject;
  • Step S130 performing graph clustering on the accounts in the account detection diagram based on the attribute field data of the connected edges in the account detection diagram, to obtain a plurality of account clusters;
  • Step S140 using the attribute field data of each of the account clusters, generate the first field data document of each of the account clusters, and obtain the second field data document of the whitelist account corresponding to the target subject ;
  • Step S150 according to the first field data document and the second field data document, calculate the weight of each word in the first field data document, the weight indicating the importance of each word in the first field data document relative to the second field data document;
  • Step S160 Determine a cluster of black product accounts based on the weight of each of the words, and obtain a black product account group associated with the target subject.
  • first, when it is determined that the number of accounts bound to the mobile phone number exceeds the predetermined number, the account attribute data set of the account is obtained; this preliminary screening excludes accounts whose mobile phone number is bound to fewer accounts than the predetermined number, yields the account attribute data set to be detected, narrows the detection range and improves detection reliability.
  • then, the attribute field data in the account attribute data set is used as the connection edge, and the mobile phone number as the vertex, to construct the account detection graph of the target subject, and graph clustering is performed on the accounts in the graph to obtain multiple account clusters.
  • building the graph with the mobile phone numbers of the associated accounts as vertices and then clustering it on the attribute field data yields account gangs reliably.
  • finally, the weight of each word in the first field data document is calculated, so that black production account clusters can be determined from those weights; the weight indicates the importance of each word in the first field data document relative to the second field data document, and this relative importance allows a reliable decision on whether an account cluster is a black production account gang.
  • step S110 when it is determined that the number of accounts bound to the mobile phone number exceeds a predetermined number, an account attribute data set of the account is obtained, and the mobile phone number is derived from the account database of the target subject.
  • the server 210 may obtain the account attribute data sets of the accounts associated with the target subject from the server 220; then, when the server 210 determines that the number of accounts bound to a user's mobile phone number exceeds the predetermined number, it acquires the account attribute data sets of all accounts bound to that mobile phone number.
  • the server 210 and the server 220 may be various terminal devices with an instruction processing function and a data storage function, such as a computer and a mobile phone, which are not specially limited herein.
  • the server 210 and the server 220 are node servers in the blockchain; because data in the blockchain is immutable and secure, the server 210 can safely and reliably obtain the account attribute data sets of the accounts associated with the target subject from the server 220.
  • the account attribute data set of each account includes field data of account-related attribute fields, which may include field data of related attribute fields such as mobile phone number, device, network environment, and login password, such as account password, mobile phone number, and login device id.
  • the target subject can be any enterprise or platform.
  • the predetermined number can be set according to the actual situation; it is a preset standard for the number of accounts associated with one mobile phone number, and a mobile phone number associated with more accounts than this threshold is suspected of black production; for example, the threshold can be 5.
  • account attribute data such as the network environment, device parameters, and registration passwords collected by the application can be used.
  • network black production operators may disguise dimensions such as the network environment, device parameters and registration passwords, but they cannot bypass the target subject's limit on how many accounts may be registered and bound to the same user's mobile phone number.
  • for example, if the target subject's limit on the accounts registered and bound to a user's mobile phone number takes effect at the end of each month, the account attribute data sets can first be obtained by applying the predetermined number set by the target subject to the accounts bound to the same mobile phone number within one month.
  • obtaining the account attribute data set only when the number of accounts bound to the mobile phone number exceeds the predetermined number preliminarily screens all accounts associated with the target subject: accounts whose mobile phone number is bound to fewer accounts than the predetermined number are excluded, and the remaining account attribute data sets are the ones to be detected, which reduces the detection range and improves detection accuracy.
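  • The preliminary screening described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the record layout (`account_id`, `phone`) and the function name `screen_accounts` are assumptions made for the example, and the threshold of 5 is simply the example value mentioned in the text.

```python
from collections import defaultdict

def screen_accounts(accounts, threshold=5):
    """Group accounts by phone number and keep only the phone numbers
    bound to more than `threshold` accounts (hypothetical record layout)."""
    by_phone = defaultdict(list)
    for acct in accounts:
        by_phone[acct["phone"]].append(acct)
    # Phone numbers at or below the threshold are excluded from detection.
    return {phone: accts for phone, accts in by_phone.items()
            if len(accts) > threshold}

# Hypothetical data: one phone number with six accounts, one with a single account.
accounts = [{"account_id": f"a{i}", "phone": "13800000001"} for i in range(6)]
accounts.append({"account_id": "b0", "phone": "13800000002"})
suspects = screen_accounts(accounts, threshold=5)
```

Only the accounts returned in `suspects` would then be passed on to graph construction.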
  • acquiring the account attribute data set of the account includes:
  • Step S310 obtain the business association condition between the target entity and the mobile phone number, the business association condition indicates the threshold of the number of accounts that can be bound to the mobile phone number in the target business, and the target business originates from the target entity ;
  • Step S320 when the account number bound with the mobile phone number exceeds the number threshold, acquire the account attribute data set of the account number.
  • the business association condition indicates the threshold of the number of accounts that can be bound to the user's mobile phone number in the target business, that is, the threshold set for a particular business activity held by the target entity; this allows suspected black production accounts to be monitored accurately on a per-business basis.
  • step S120 the attribute field data in the account attribute data set is used as a connection edge, and the mobile phone number is used as a vertex to construct an account detection graph of the target subject.
  • the detection graph is constructed with the mobile phone numbers associated with the accounts as vertices and the attribute field data in the account attribute data set as connection edges: accounts whose field data are related are connected through the corresponding field, so the resulting detection graph can capture the various associations among the acquired accounts.
  • using the attribute field data in the account attribute data set as a connection edge and the mobile phone number as a vertex to construct an account detection graph of the target subject includes:
  • the fingerprint-type fields at least include the login device ID, the login password and the boot time of the login device;
  • the category-type fields at least include the login device model, the system version, the total storage space of the device, the login network address, and the physical address of the wireless network card;
  • combinations of a first predetermined number of the fingerprint-type fields and combinations of a second predetermined number of the category-type fields in the account attribute data set are used as connection edges, and the mobile phone numbers corresponding to each field data combination are used as vertices, to construct the account detection graph.
  • for fingerprint-type fields, the field data of any first predetermined number of fields can serve as a connection edge of the detection graph, whereas category-type fields require the field data of a second predetermined number of fields, taken together, to serve as a connection edge.
  • the first predetermined number is 2, and the second predetermined number is greater than or equal to 3 and less than or equal to 5.
  • a single fingerprint-type field can be used as a connection edge for filtering, or two fields can be combined into one edge, which effectively avoids accidental collisions.
  • for example, the value of a fingerprint-type field altered by a black production operator may happen to equal that of a normal account; requiring two fields to match together as a connection edge reduces the probability of such collisions and of wrongly flagging normal accounts.
  • multiple category-type fields are combined to filter the data more accurately.
  • on the iOS system, the fingerprint-type variables are the login device identifier (id), the login password, and the boot time (boottime) of the login device.
  • the correspondence between single field data and the number of mobile phone numbers is as follows (a1-a3):
  • category-type variables include the login device model, the system version, the total device storage space, the login network address (ip), the physical address of the wireless network card (wifimac), etc.
  • the correspondence between single field data and the number of mobile phone numbers is as follows (b1-b2): (b1) the ratio of the number of models to the number of mobile phone numbers is 1:28470.36, and the total number of models is usually 70; (b2) the ratio of total device storage space values to the number of mobile phone numbers is 1:134.34. Combining fields effectively reduces the number of mobile phone numbers that correspond to one edge value.
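  • One way to read the edge-construction rule above is sketched below: two mobile phone numbers are linked when their accounts carry identical values on any combination of 2 fingerprint-type fields, or on any combination of 3 category-type fields. The field names, the record layout and the choice of 3 (from the stated range of 3 to 5) are assumptions for illustration; the text does not specify how matching is implemented.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical field names standing in for the fields listed in the text.
FINGERPRINT_FIELDS = ("device_id", "password", "boot_time")
CATEGORY_FIELDS = ("model", "os_version", "storage", "ip", "wifimac")

def build_edges(records, n_fingerprint=2, n_category=3):
    """Return undirected edges (phone_a, phone_b) between phone numbers whose
    records share identical values on a qualifying field combination."""
    field_combos = list(combinations(FINGERPRINT_FIELDS, n_fingerprint))
    field_combos += list(combinations(CATEGORY_FIELDS, n_category))
    buckets = defaultdict(set)  # (fields, values) -> phone numbers
    for rec in records:
        for combo in field_combos:
            values = tuple(rec.get(f) for f in combo)
            if None in values:  # skip combinations with missing data
                continue
            buckets[(combo, values)].add(rec["phone"])
    edges = set()
    for phones in buckets.values():
        for a, b in combinations(sorted(phones), 2):
            edges.add((a, b))
    return edges

records = [
    {"phone": "p1", "device_id": "d1", "password": "pw", "boot_time": 100,
     "model": "m1", "os_version": "14", "storage": 64, "ip": "10.0.0.1", "wifimac": "aa"},
    {"phone": "p2", "device_id": "d1", "password": "pw", "boot_time": 200,
     "model": "m1", "os_version": "15", "storage": 128, "ip": "10.0.0.2", "wifimac": "bb"},
    {"phone": "p3", "device_id": "d3", "password": "pw", "boot_time": 300,
     "model": "m1", "os_version": "14", "storage": 256, "ip": "10.0.0.3", "wifimac": "cc"},
]
edges = build_edges(records)
```

In this toy data only `p1` and `p2` are linked (same device ID and password); `p3` shares a single fingerprint field and only two category fields with `p1`, so no edge is created, which mirrors the collision-avoidance argument in the text.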
  • Step S130 Perform graph clustering on the accounts in the account detection diagram based on the attribute field data of the connection edges in the account detection diagram to obtain a plurality of account clusters.
  • the account detection graph can be graph-clustered using the existing graph clustering method to obtain account clusters.
  • the relationship network of accounts can be constructed based on the construction of the account detection graph, and the accounts can be clustered based on the attribute field data to obtain similar account clusters.
  • graph clustering is performed on the accounts in the account detection diagram based on the attribute field data of the connection edges in the account detection diagram to obtain a plurality of account clusters, including:
  • the account detection graph is subjected to graph clustering processing using the Connected Component algorithm to obtain multiple account groups;
  • from the multiple account groups, account groups that include a number of mobile phone numbers greater than or equal to a predetermined number and are associated with the same login network address are obtained, giving a first account group combination;
  • from the multiple account groups, account groups that include a number of mobile phone numbers greater than or equal to a predetermined number and are associated with the physical address of the same wireless network card are obtained, giving a second account group combination;
  • the first account group combination and the second account group combination are determined as the account cluster.
  • with the mobile phone numbers as vertices and the connection edges defined in the above steps, graph clustering with the Connected Component algorithm yields multiple node clusters.
  • the Connected Components algorithm labels each connected component (each account group) in the graph with an ID, using the ID of the vertex with the smallest sequence number in the component as the ID of the component. If there is a path between any two vertices (mobile phone numbers) of a graph G, then G is called a connected graph; otherwise it is a non-connected graph, and each maximal connected subgraph is called a connected component.
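  • The labelling rule just described (each component takes the ID of its smallest vertex) can be sketched with a small union-find structure. This is an illustrative single-machine sketch; the function name and the string vertex IDs are assumptions, and a production system would more likely rely on a distributed graph library.

```python
def connected_components(vertices, edges):
    """Label every vertex with the smallest vertex ID in its connected component."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Always keep the smaller ID as root, so the root is the component minimum.
            if ra < rb:
                parent[rb] = ra
            else:
                parent[ra] = rb
    return {v: find(v) for v in vertices}

vertices = ["p1", "p2", "p3", "p4", "p5"]
edges = [("p2", "p3"), ("p1", "p3"), ("p4", "p5")]
labels = connected_components(vertices, edges)
```

Here `p1`, `p2` and `p3` end up in the group labelled `p1`, while `p4` and `p5` form the group labelled `p4`.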
  • a second graph clustering is then performed, taking the group number (identification id) of the first clustering result as the vertex: first, from the multiple account groups, account groups that include a number of mobile phone numbers greater than or equal to a predetermined number (for example, 3) and are associated with the same login network address are obtained, giving the first account group combination.
  • then, account groups that include a number of mobile phone numbers greater than or equal to the predetermined number (for example, 3) and are associated with the physical address of the same wireless network card are obtained, giving the second account group combination.
  • the second graph clustering is mainly aimed at "second dial" dynamic ip (changing login network addresses and wireless network card physical addresses): it merges small groups that actually belong to the same gang.
  • black production operators disguise the ip of several mobile phone numbers at a time, changing the ip or wifimac.
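  • The second clustering pass can be sketched as below, under one possible reading of the text: groups from the first pass that contain at least 3 mobile phone numbers are merged when they share a login network address or a wireless network card physical address. The group layout (`phones`, `ips`, `wifimacs`) and the function name are hypothetical, and the text does not state whether the two combinations are merged with each other, so that part is an assumption.

```python
from collections import defaultdict

def merge_groups(groups, min_phones=3):
    """Map each first-pass group ID to a merged group ID.
    groups: {group_id: {"phones": set, "ips": set, "wifimacs": set}} (hypothetical layout)."""
    eligible = sorted(g for g, info in groups.items()
                      if len(info["phones"]) >= min_phones)
    merged = {g: g for g in groups}

    def find(g):
        while merged[g] != g:
            g = merged[g]
        return g

    for key in ("ips", "wifimacs"):
        by_value = defaultdict(list)  # shared address -> eligible group IDs
        for gid in eligible:
            for value in groups[gid][key]:
                by_value[value].append(gid)
        for gids in by_value.values():
            root = find(gids[0])
            for other in gids[1:]:
                other_root = find(other)
                if other_root != root:
                    low, high = min(root, other_root), max(root, other_root)
                    merged[high] = low  # keep the smallest group ID as label
                    root = low
    return {g: find(g) for g in groups}

groups = {
    "g1": {"phones": {"a", "b", "c"}, "ips": {"1.1.1.1"}, "wifimacs": {"aa"}},
    "g2": {"phones": {"d", "e", "f"}, "ips": {"1.1.1.1"}, "wifimacs": {"bb"}},
    "g3": {"phones": {"g", "h", "i"}, "ips": {"2.2.2.2"}, "wifimacs": {"bb"}},
    "g4": {"phones": {"j"}, "ips": {"1.1.1.1"}, "wifimacs": {"aa"}},
}
merged = merge_groups(groups)
```

In this toy data `g1` and `g2` merge through the shared IP, `g3` joins through the shared wifimac, and `g4` stays separate because it has fewer than 3 mobile phone numbers.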
  • Step S140 using the attribute field data of each of the account clusters, generate the first field data document of each of the account clusters, and obtain the second field data document of the whitelist account corresponding to the target subject .
  • the first field data document and the second field data document may be text documents or tables.
  • the whitelist account data corresponding to the target subject can be the account attribute data set of internal users of the target subject; for example, the account-related data of an employee of an organization can be regarded as non-black-production data.
  • the second field data document of the whitelist account corresponding to the target subject can be generated from the attribute field data of the whitelist account.
  • the second field data document, built from normal accounts, serves as the comparison for the first field data document, which ensures the accuracy of black production account detection.
  • Step S150 according to the first field data document and the second field data document, calculate the weight of each word in the first field data document, the weight indicating the importance of each word in the first field data document relative to the second field data document.
  • calculating, for each account cluster, a weight that indicates the importance of each word in the first field data document relative to the second field data document reveals the words (that is, attribute field data) that are "unique" to that cluster; if an account gang has attribute field data unique to the group, there is a high probability that these are simulator parameters modified by black production operators.
  • calculating the weight of each word in the first field data document according to the first field data document and the second field data document including:
  • the product of the first frequency and the second frequency is used as the weight of each of the words.
  • calculating the first frequency with which each word appears in the first field data document gives the importance of each word within the first field data document to be detected; calculating the second frequency with which each word appears simultaneously in the first field data document and the second field data document gives the global importance of each word.
  • the product of the first frequency and the second frequency is used as the weight of each word; viewed from the perspective of the global data set, this weight indicates the importance of each word in the first field data document relative to the second field data document.
  • calculating the weight of each word in the first field data document according to the first field data document and the second field data document including:
  • the TF-IDF algorithm can accurately and efficiently identify the accounts of black gangs: words with large TF-IDF weights in a gang's accounts indicate that the gang has words that are "unique" to it, which with high probability are simulator parameters.
  • detection resources can be saved.
  • sorting by TF-IDF weight fishes out more black gang accounts: experiments have shown that, with this ordering, more black gang accounts are found for the same number of detections.
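  • The TF-IDF weighting mentioned above can be sketched as follows. The excerpt does not give the exact formula, so this uses the textbook form (term frequency times the log of the inverse document frequency) as an assumption; treating each attribute value as a "word" such as `model=emu` and the function name `tfidf_weights` are likewise illustrative choices.

```python
import math
from collections import Counter

def tfidf_weights(cluster_doc, corpus):
    """Textbook TF-IDF of each word in `cluster_doc`.
    `corpus` is a list of token lists that contains `cluster_doc`
    and the whitelist document (and any other cluster documents)."""
    counts = Counter(cluster_doc)
    total = len(cluster_doc)
    n_docs = len(corpus)
    weights = {}
    for word, count in counts.items():
        # Number of documents containing the word (at least 1: cluster_doc itself).
        df = sum(1 for doc in corpus if word in doc)
        weights[word] = (count / total) * math.log(n_docs / df)
    return weights

# Hypothetical first field data document (one cluster) and whitelist document.
cluster_doc = ["model=emu", "model=emu", "os=14", "model=emu"]
whitelist_doc = ["model=iPhone12", "os=14", "os=15"]
weights = tfidf_weights(cluster_doc, [cluster_doc, whitelist_doc])
top_word = max(weights, key=weights.get)
```

A value that also appears among whitelist accounts (`os=14`) receives weight zero, while a value seen only in the cluster (`model=emu`) receives the highest weight, which is the "unique word" effect the text relies on.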
  • Step S160 Determine a cluster of black product accounts based on the weight of each of the words, and obtain a black product account group associated with the target subject.
  • the weight of each described word is used to determine the clusters of black production accounts, including:
  • the account cluster corresponding to the black product data document is determined as a black product account group.
  • the predetermined weight can be set according to the actual situation. The presence of words whose weight is higher than the predetermined weight indicates that the data in the first field data document containing those words is abnormal, so that document is determined to be a black production data document, and the account cluster corresponding to the black production data document is determined to be a black production account gang.
  • the weight of each described word is used to determine the clusters of black production accounts, including:
  • the account cluster corresponding to the black product data document is determined as a black product account group.
  • the weights of all words can be considered together, so that the abnormality of an account cluster is assessed globally over its first field data document. A first field data document whose average weight is higher than the predetermined average value is determined to be a black production data document, so black production account gangs can be detected reliably from a global view.
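  • The two decision rules described above (any word weight above a predetermined weight, or the average weight above a predetermined average) can be sketched as a small helper. The function names, parameter names and threshold values are assumptions for illustration; the text leaves the thresholds to be set according to the actual situation.

```python
def is_black_document(weights, max_threshold=None, mean_threshold=None):
    """Decide whether a first field data document looks abnormal.
    weights: {word: weight} for one account cluster."""
    if not weights:
        return False
    values = list(weights.values())
    # Rule 1: at least one word weight exceeds the predetermined weight.
    if max_threshold is not None and max(values) > max_threshold:
        return True
    # Rule 2: the average word weight exceeds the predetermined average.
    if mean_threshold is not None and sum(values) / len(values) > mean_threshold:
        return True
    return False

def flag_clusters(cluster_weights, **thresholds):
    """Return the IDs of clusters whose documents are flagged."""
    return [cid for cid, w in cluster_weights.items()
            if is_black_document(w, **thresholds)]

# Hypothetical weights for two account clusters.
cluster_weights = {
    "c1": {"model=emu": 0.52, "os=14": 0.0},
    "c2": {"model=iPhone12": 0.05, "os=15": 0.02},
}
by_max = flag_clusters(cluster_weights, max_threshold=0.3)
by_mean = flag_clusters(cluster_weights, mean_threshold=0.2)
```

Both rules flag only `c1` here; on real data the two rules can disagree, which is why the text presents them as alternative embodiments.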
  • the present application also provides an artificial intelligence-based black production account detection device.
  • the artificial intelligence-based black production account detection device may include an acquisition module 410, a construction module 420, a clustering module 430, a generation module 440, a calculation module 450 and a determination module 460, where:
  • the acquisition module 410 can be used to determine that when the number of accounts bound to the user's mobile phone number exceeds a predetermined number, acquire the account attribute data set of the account, and the user is associated with the target subject;
  • the building module 420 can be configured to use the attribute field data in the account attribute data set as a connection edge, and use the mobile phone number as a vertex to construct an account detection graph of the target subject;
  • the clustering module 430 may be configured to perform graph clustering on the accounts in the account detection diagram based on the attribute field data of the connection edges in the account detection diagram, to obtain a plurality of account clusters;
  • the generating module 440 can be configured to use the attribute field data of each of the account clusters to generate the first field data document of each of the account clusters, and obtain the second data of the whitelist account corresponding to the target subject. field data document;
  • the calculation module 450 may be configured to calculate the weight of each word in the first field data document according to the first field data document and the second field data document, the weight indicating that each word is in the importance in the first field data document relative to the second field data document;
  • the determining module 460 may be configured to determine a cluster of black product accounts based on the weight of each of the words, and obtain a black product account group associated with the target subject.
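The graph construction performed by the construction module (mobile phone numbers as vertices, attribute field data as connecting edges) could be sketched as follows. This is a minimal illustration rather than the application's implementation: the record layout `(phone, field, value)`, the function name `build_account_graph`, the field names, and the choice to link every pair of phone numbers that share a field value are all assumptions.

```python
from collections import defaultdict
from itertools import combinations

def build_account_graph(records):
    """Build an account detection graph from (phone, field, value) records.

    Vertices are phone numbers. Two phone numbers are joined by an edge
    labelled with every attribute (field, value) pair they have in common.
    """
    vertices = set()
    phones_by_attr = defaultdict(set)
    for phone, field, value in records:
        vertices.add(phone)
        phones_by_attr[(field, value)].add(phone)

    edges = defaultdict(set)  # (phone_a, phone_b) -> shared (field, value) pairs
    for attr, phones in phones_by_attr.items():
        for a, b in combinations(sorted(phones), 2):
            edges[(a, b)].add(attr)
    return vertices, dict(edges)

# Hypothetical attribute records for three phone numbers.
records = [
    ("p1", "login_ip", "1.1.1.1"),
    ("p2", "login_ip", "1.1.1.1"),
    ("p3", "wifi_mac", "aa:bb:cc:dd:ee:ff"),
]
vertices, edges = build_account_graph(records)
```

Here `p1` and `p2` end up connected because they share a login address, while `p3` remains an isolated vertex.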
  • the obtaining module is further configured to:
  • the business association condition indicates the threshold of the number of accounts that can be bound to the mobile phone number in the target business, and the target business originates from the target entity;
  • the account attribute data set of the account number is acquired.
  • the clustering module is further configured to:
  • the account detection graph is subjected to graph clustering processing using the Connected Component algorithm to obtain multiple account groups;
  • account groups that include a number of mobile phone numbers greater than or equal to a predetermined number and are associated with the same login network address are selected, to obtain a first account group combination;
  • account groups that include a number of mobile phone numbers greater than or equal to a predetermined number and are associated with the physical address of the same wireless network card are selected, to obtain a second account group combination;
  • the first account group combination and the second account group combination are determined as the account cluster.
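The clustering steps above can be sketched as a connected-components pass followed by two filters. This is a hedged sketch under assumptions: the breadth-first search stands in for whatever Connected Component implementation is actually used, "associated with the same login network address" is interpreted as all phone numbers in a group sharing at least one common value, and the names `select_groups`, `account_clusters`, `login_ip` and `wifi_mac` are hypothetical.

```python
from collections import defaultdict, deque

def connected_components(vertices, edges):
    """Return the connected components of an undirected graph as sets."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), set()
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        groups.append(group)
    return groups

def select_groups(groups, phone_attrs, field, min_phones):
    """Keep groups with enough phones that all share a value of `field`."""
    selected = []
    for group in groups:
        if len(group) < min_phones:
            continue
        values = (phone_attrs.get(p, {}).get(field, set()) for p in group)
        if set.intersection(*values):
            selected.append(group)
    return selected

def account_clusters(groups, phone_attrs, min_phones):
    """Union of the first (login address) and second (MAC) combinations."""
    first = select_groups(groups, phone_attrs, "login_ip", min_phones)
    second = select_groups(groups, phone_attrs, "wifi_mac", min_phones)
    return first + [g for g in second if g not in first]

# Hypothetical graph and per-phone attribute values.
vertices = {"p1", "p2", "p3", "p4", "p5", "p6"}
edges = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
phone_attrs = {
    "p1": {"login_ip": {"1.1.1.1"}},
    "p2": {"login_ip": {"1.1.1.1"}},
    "p3": {"login_ip": {"1.1.1.1"}},
    "p4": {"wifi_mac": {"aa:bb"}},
    "p5": {"wifi_mac": {"aa:bb"}},
}
groups = connected_components(vertices, edges)
clusters = account_clusters(groups, phone_attrs, min_phones=2)
```

Groups below the size threshold, such as the isolated `p6`, are dropped before any address comparison is made.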
  • the computing module is further configured to:
  • the product of the first frequency and the second frequency is used as the weight of each of the words.
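A weight formed as the product of two frequencies matches the usual TF-IDF construction, which can be sketched as below. The exact definitions of the first and second frequency are not given in this passage, so the sketch assumes the classic form: term frequency within the cluster's field data document, multiplied by the unsmoothed inverse document frequency `log(N / df)` over the cluster document and the comparison documents (such as the whitelist document). The function name and sample words are hypothetical.

```python
import math
from collections import Counter

def tfidf_weights(cluster_doc, other_docs):
    """Weight each word of cluster_doc against the comparison documents.

    cluster_doc: list of words from one cluster's field data document.
    other_docs:  list of word lists, e.g. the whitelist field data document.
    """
    corpus = [cluster_doc] + list(other_docs)
    n_docs = len(corpus)
    counts = Counter(cluster_doc)
    total = len(cluster_doc)
    weights = {}
    for word, count in counts.items():
        tf = count / total                         # assumed "first frequency"
        df = sum(1 for doc in corpus if word in doc)
        idf = math.log(n_docs / df)                # assumed "second frequency"
        weights[word] = tf * idf
    return weights

# Hypothetical field values used as words.
cluster_doc = ["1.1.1.1", "1.1.1.1", "aa:bb", "chrome"]
whitelist_doc = ["chrome", "2.2.2.2"]
weights = tfidf_weights(cluster_doc, [whitelist_doc])
```

Under these assumptions a value that also appears in the whitelist document (here `chrome`) receives zero weight, while a value repeated only inside the cluster receives the highest weight.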
  • the computing module is further configured to:
  • the determining module is further configured to:
  • the account cluster corresponding to the black product data document is determined as a black product account group.
  • the determining module is further configured to:
  • the account cluster corresponding to the black product data document is determined as a black product account group.
  • a computer device that performs all or part of the steps of any of the above-mentioned artificial intelligence-based black production account detection methods.
  • the computer equipment includes:
  • the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to execute the artificial intelligence-based black production account detection method illustrated in any of the above-described exemplary embodiments.
  • aspects of the present application may be implemented as a system, method or program product. Therefore, various aspects of the present application can be embodied in the following forms: an entirely hardware implementation, an entirely software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module" or "system".
  • a computer device 500 according to this embodiment of the present application is described below with reference to FIG. 5 .
  • the computer device 500 shown in FIG. 5 is only an example, and should not impose any limitations on the functions and scope of use of the embodiments of the present application.
  • computer device 500 takes the form of a general-purpose computing device.
  • Components of the computer device 500 may include, but are not limited to, the above-mentioned at least one processing unit 510 , the above-mentioned at least one storage unit 520 , and a bus 530 connecting different system components (including the storage unit 520 and the processing unit 510 ).
  • the storage unit stores program code that can be executed by the processing unit 510, so that the processing unit 510 performs the steps of the various exemplary embodiments of the present application described in the above-mentioned “Methods of Embodiments” section of this specification.
  • the processing unit 510 may execute step S110 as shown in FIG.
  • Step S120: use the attribute field data in the account attribute data set as connecting edges, and use the mobile phone number as a vertex, to construct an account detection graph of the target subject;
  • Step S130: perform graph clustering on the accounts in the account detection graph, based on the attribute field data of the connecting edges in the account detection graph, to obtain a plurality of account clusters;
  • Step S140: use the attribute field data of each of the account clusters to generate the first field data document of each of the account clusters, and obtain the second field data document of the whitelist account corresponding to the target subject;
  • Step S150: according to the first field data document and the second field data document, calculate the weight of each word in the first field data document, the weight indicating the importance of each word in the first field data document relative to the second field data document;
  • Step S160: based on the weight of each of the words, determine black production account clusters, to obtain the black production account group associated with the target subject.
  • the storage unit 520 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 5201 and/or a cache storage unit 5202 , and may further include a read only storage unit (ROM) 5203 .
  • the storage unit 520 may also include a program/utility 5204 having a set (at least one) of program modules 5205, including, but not limited to, an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
  • the bus 530 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
  • Computer device 500 may also communicate with one or more external devices 700 (e.g., keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the computer device 500, and/or with any device (e.g., router, modem, etc.) that enables the computer device 500 to communicate with one or more other computer devices. Such communication may take place through an input/output (I/O) interface 550, and a display unit 540 may also be coupled to the input/output (I/O) interface 550. Also, the computer device 500 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 560.
  • network adapter 560 communicates with other modules of computer device 500 via bus 530 .
  • other hardware and/or software modules may be used in conjunction with computer device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems.
  • the exemplary embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) or on a network, and which includes several instructions to cause a computer device (such as a personal computer, a server, a terminal device, or a network device) to execute the method according to the embodiments of the present application.
  • a computer device which may be a personal computer, a server, a terminal device, or a network device, etc.
  • a computer-readable storage medium on which a program product capable of implementing the above-mentioned method of this specification is stored; the computer-readable storage medium may be non-volatile or volatile.
  • various aspects of the present application can also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present application described in the above-mentioned "Example Method" section of this specification.
  • a program product 600 for implementing the above method according to an embodiment of the present application is described; it may take the form of a portable compact disk read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer.
  • the program product of the present application is not limited thereto, and in this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for carrying out the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer device, partly on the user's computer device, as a stand-alone software package, partly on the user's computer device and partly on a remote computer device, or entirely on the remote computer device or server.
  • the remote computer device may be connected to the user's computer device via any kind of network, including a local area network (LAN) or wide area network (WAN), or may be connected to an external computer device (e.g., via the Internet using an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Discrete Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an artificial intelligence-based method for detecting black production (underground industry) accounts and to a related apparatus. The method comprises the following steps: when it is determined that the number of accounts bound to a user's mobile phone number exceeds a predetermined number, acquiring the account attribute data set of the accounts, the user being associated with a target subject (S110); using attribute field data in the account attribute data set as connecting edges and the mobile phone number as a vertex to construct an account detection graph of the target subject (S120); performing graph clustering on the accounts in the account detection graph to obtain a plurality of account clusters (S130); using the attribute field data of each account cluster to generate a first field data document, and obtaining a second field data document of a whitelist account corresponding to the target subject (S140); calculating the weight of each word in the first field data document (S150); and determining a black production account group based on the weight of each word (S160). The solution also relates to the field of blockchain: the account attribute data set may be stored in a blockchain. The solution can effectively improve the accuracy of black production account detection.
PCT/CN2021/090947 2020-07-31 2021-04-29 Method and apparatus for detecting underground industry accounts, computer device, and medium WO2022021977A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010763020.XA CN111931048B (zh) 2020-07-31 2020-07-31 基于人工智能的黑产账号检测方法及相关装置
CN202010763020.X 2020-07-31

Publications (1)

Publication Number Publication Date
WO2022021977A1 true WO2022021977A1 (fr) 2022-02-03

Family

ID=73315956

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/090947 WO2022021977A1 (fr) 2020-07-31 2021-04-29 Method and apparatus for detecting underground industry accounts, computer device, and medium

Country Status (2)

Country Link
CN (1) CN111931048B (fr)
WO (1) WO2022021977A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114785546A (zh) * 2022-03-15 2022-07-22 上海聚水潭网络科技有限公司 一种基于业务日志和ip情报的ip溯源方法及系统
CN116846596A (zh) * 2023-05-31 2023-10-03 北京数美时代科技有限公司 一种恶意账号的识别方法、系统、介质及设备

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931048B (zh) * 2020-07-31 2022-07-08 平安科技(深圳)有限公司 基于人工智能的黑产账号检测方法及相关装置
CN113312560B (zh) * 2021-06-16 2023-07-25 百度在线网络技术(北京)有限公司 群组检测方法、装置及电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108920947A (zh) * 2018-05-08 2018-11-30 北京奇艺世纪科技有限公司 一种基于日志图建模的异常检测方法和装置
CN109660513A (zh) * 2018-11-13 2019-04-19 微梦创科网络科技(中国)有限公司 一种基于Storm集群识别问题账号的方法及装置
CN109948641A (zh) * 2019-01-17 2019-06-28 阿里巴巴集团控股有限公司 异常群体识别方法及装置
US20190318359A1 (en) * 2018-04-17 2019-10-17 Mastercard International Incorporated Method and system for fraud prevention via blockchain
CN110620770A (zh) * 2019-09-19 2019-12-27 微梦创科网络科技(中国)有限公司 一种分析网络黑产账号的方法及装置
CN111931048A (zh) * 2020-07-31 2020-11-13 平安科技(深圳)有限公司 基于人工智能的黑产账号检测方法及相关装置

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2305912A1 (fr) * 2000-04-17 2001-10-17 Oxford Properties Group Inc. Service de repartition par teleappel sur internet
CN106372977B (zh) * 2015-07-23 2019-06-07 阿里巴巴集团控股有限公司 一种虚拟账户的处理方法和设备
RU2635275C1 (ru) * 2016-07-29 2017-11-09 Акционерное общество "Лаборатория Касперского" Система и способ выявления подозрительной активности пользователя при взаимодействии пользователя с различными банковскими сервисами
CN107798541B (zh) * 2016-08-31 2021-12-07 南京星云数字技术有限公司 一种用于在线业务的监控方法及系统
CN107657062A (zh) * 2017-10-25 2018-02-02 医渡云(北京)技术有限公司 相似病例检索方法及装置、存储介质、电子设备
CN109102301A (zh) * 2018-08-20 2018-12-28 阿里巴巴集团控股有限公司 一种支付风控方法及系统
CN109525595B (zh) * 2018-12-25 2021-04-16 广州方硅信息技术有限公司 一种基于时间流特征的黑产账号识别方法及设备

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114785546A (zh) * 2022-03-15 2022-07-22 上海聚水潭网络科技有限公司 一种基于业务日志和ip情报的ip溯源方法及系统
CN114785546B (zh) * 2022-03-15 2024-04-26 上海聚水潭网络科技有限公司 一种基于业务日志和ip情报的ip溯源方法及系统
CN116846596A (zh) * 2023-05-31 2023-10-03 北京数美时代科技有限公司 一种恶意账号的识别方法、系统、介质及设备
CN116846596B (zh) * 2023-05-31 2024-01-30 北京数美时代科技有限公司 一种恶意账号的识别方法、系统、介质及设备

Also Published As

Publication number Publication date
CN111931048B (zh) 2022-07-08
CN111931048A (zh) 2020-11-13

Similar Documents

Publication Publication Date Title
WO2022021977A1 (fr) Method and apparatus for detecting underground industry accounts, computer device, and medium
CN110958220B (zh) 一种基于异构图嵌入的网络空间安全威胁检测方法及系统
US20200389495A1 (en) Secure policy-controlled processing and auditing on regulated data sets
US10079842B1 (en) Transparent volume based intrusion detection
US7631362B2 (en) Method and system for adaptive identity analysis, behavioral comparison, compliance, and application protection using usage information
US9323928B2 (en) System and method for non-signature based detection of malicious processes
US9237161B2 (en) Malware detection and identification
US10728264B2 (en) Characterizing behavior anomaly analysis performance based on threat intelligence
US10114960B1 (en) Identifying sensitive data writes to data stores
US20210112101A1 (en) Data set and algorithm validation, bias characterization, and valuation
US10805327B1 (en) Spatial cosine similarity based anomaly detection
CN108932426A (zh) 越权漏洞检测方法和装置
US20210136120A1 (en) Universal computing asset registry
CN111931047B (zh) 基于人工智能的黑产账号检测方法及相关装置
US11019494B2 (en) System and method for determining dangerousness of devices for a banking service
CN112784281A (zh) 一种工业互联网的安全评估方法、装置、设备及存储介质
US20230104176A1 (en) Using a Machine Learning System to Process a Corpus of Documents Associated With a User to Determine a User-Specific and/or Process-Specific Consequence Index
CN114760106A (zh) 网络攻击的确定方法、系统、电子设备及存储介质
US20210294850A1 (en) Monitoring information processing systems utilizing co-clustering of strings in different sets of data records
CN110955890B (zh) 恶意批量访问行为的检测方法、装置和计算机存储介质
US8402545B1 (en) Systems and methods for identifying unique malware variants
CN113037689A (zh) 基于日志的病毒发现方法、装置、计算设备及存储介质
CN115643044A (zh) 数据处理方法、装置、服务器及存储介质
CN113408579A (zh) 一种基于用户画像的内部威胁预警方法
CN111800409A (zh) 接口攻击检测方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21849086

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21849086

Country of ref document: EP

Kind code of ref document: A1