CN112199573B - Illegal transaction active detection method and system - Google Patents

Illegal transaction active detection method and system

Info

Publication number
CN112199573B
Authority
CN
China
Prior art keywords
illegal
website
transaction
template
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010776643.0A
Other languages
Chinese (zh)
Other versions
CN112199573A (en)
Inventor
卢子航
王峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baofu Network Technology Shanghai Co ltd
Original Assignee
Baofu Network Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baofu Network Technology Shanghai Co ltd
Priority to CN202010776643.0A
Publication of CN112199573A
Application granted
Publication of CN112199573B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/955Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/547Remote procedure calls [RPC]; Web services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network

Abstract

The application provides an illegal transaction active detection method and system. The method comprises the following steps: screening out illegal websites; matching templates by calculating the similarity between the text information of an illegal website and pre-marked templates; selecting the program script of the matched template to perform simulated registration, login and transaction channel detection on the illegal website; and extracting relevant information of the transaction order returned by the illegal website through text analysis and mining and/or image recognition analysis, to serve as a basis for judging whether the transaction behavior is legal or illegal. The application can effectively identify transaction risks and give early warning of them in daily monitoring, enables rapid early warning of illegal and criminal platforms so that they are discovered early and handled in time, prevents losses from growing further, and improves the overall level of financial risk prevention and control.

Description

Illegal transaction active detection method and system
Technical Field
The application relates to the field of financial risk control, in particular to an illegal transaction active detection method and system.
Background
At present, among common payment risks, payment channels are often exploited by illegal websites and applications to lure users into illegal transactions and reap exorbitant profits, causing economic losses for users.
Therefore, how to actively detect illegal transaction platforms, effectively identify transaction risks and give early warning of them in daily monitoring, achieve early identification, early warning and early handling of risks, and improve the overall level of financial risk prevention and control is a problem to be solved urgently.
Disclosure of Invention
The application aims to provide an illegal transaction active detection method and system to solve the problems described in the Background.
In order to achieve the above purpose, the present application adopts the following technical scheme:
A first aspect of the application provides an illegal transaction active detection method, which comprises the following steps:
screening out illegal websites from manually entered website URLs and keywords or from search engine query results, and storing the website information of the illegal websites in a database;
calculating the similarity between the text information of an illegal website and pre-marked templates, wherein the pre-marked templates are different preset models generated by classifying historically recorded illegal websites, and each template is marked with a template code;
if the calculated similarity is greater than a predetermined threshold, classifying the illegal website, marking it with a new template code, and storing it in the database as a newly added template;
if the calculated similarity is less than or equal to the predetermined threshold, using the active detection software developed for the pre-marked template to perform simulated registration, login and transaction channel detection on the illegal website, extracting relevant information of the transaction order returned by the illegal website through text analysis and mining and/or image recognition analysis, and storing it in the database as a basis for judging whether the transaction behavior is legal or illegal.
Preferably, screening out illegal websites from search engine query results includes:
searching by keyword through a search engine to obtain the URLs of corresponding suspected websites;
performing a keyword audit on the source code content of each web page;
treating a website whose URL passes the audit as an illegal website and recording it in the database.
Preferably, the illegal transaction active detection method further comprises the following step: after simulated registration on the illegal website, storing the corresponding website URL and registered user information (virtual data) in the database for backup.
Preferably, the website information of the illegal websites screened out from manually entered website URLs and keywords or from search engine query results comprises at least one of the following: website name, website URL, website availability (whether it can be opened), text information of the website, website snapshot picture URL, website marking template code, website creation time, website update time.
Preferably, the similarity is the Hamming distance between the SimHash value of the web page source code of the illegal website and the SimHash value of a pre-marked template, so that a smaller value indicates that the two are more alike.
Preferably, the predetermined threshold is an empirical value, preferably 15.
Preferably, the relevant information of the transaction order includes at least one of the following: order number, order screenshot, transaction bank, transaction time, website URL, transaction amount, payee account information.
Preferably, the illegal transaction active detection method further comprises the following step: supporting distributed task distribution processing by adopting a browser Docker cluster and using Selenium Grid to implement page rendering and simulated operation, wherein the Selenium Hub is called uniformly to distribute tasks to at least one Node proxy node registered on the Selenium Hub, and the Node proxy nodes request the illegal websites, complete the simulated registration, login and transaction actions, and receive the transaction orders returned by the illegal websites.
Preferably, the illegal transaction active detection method further comprises the following step: dynamically configuring the IP address of the simulated user used for simulated registration on the illegal website.
Preferably, when simulated registration, login and transaction channel detection are performed on the illegal website, the illegal transaction active detection method further comprises the following step: generating monitoring log information and storing it in the database.
A second aspect of the present application provides an active detection system for illegal transactions, comprising:
a database storing a website basic data table, an order monitoring result table, and a simulated registration login information table; wherein,
the website basic data table is used for storing the website information of illegal websites screened out from manually entered website URLs and keywords or from search engine query results;
the order monitoring result table is used for storing the relevant information of transaction orders obtained by actively detecting the illegal websites;
the simulated registration login information table is used for storing the website URLs and registered user information (virtual data) corresponding to simulated registration and login on the illegal websites;
the illegal transaction platform positioning module is used for screening out illegal websites from manually entered website URLs and keywords or from search engine query results;
the marking template module is used for calculating the similarity between the text information of an illegal website and pre-marked templates, wherein the pre-marked templates are different preset models generated by classifying historically recorded illegal websites, and each template is marked with a template code; if the calculated similarity is greater than a predetermined threshold, the illegal website is classified, marked with a new template code, and stored in the website basic data table as a newly added template;
the distributed task distribution module is used for performing, with the active detection software developed for the pre-marked template, simulated registration, login and transaction channel detection on illegal websites whose similarity as calculated by the marking template module is less than or equal to the predetermined threshold, and for receiving the transaction orders returned by the illegal websites;
the text analysis module is used for performing text analysis and mining on the text information in the transaction orders received by the distributed task distribution module and storing the results in the order monitoring result table;
the image recognition analysis module is used for performing image recognition analysis on the image information in the transaction orders received by the distributed task distribution module and storing the results in the order monitoring result table.
Preferably, the website information stored in the website basic data table includes at least one of the following: website name, website URL, website availability (whether it can be opened), text information of the website, website snapshot picture URL, website marking template code, website creation time, website update time.
Preferably, the relevant information of the transaction order stored in the order monitoring result table includes at least one of the following: order number, order screenshot, transaction bank, transaction time, website URL, transaction amount, payee account information.
Preferably, the marking template module includes:
an extraction unit for extracting the style fingerprint of the illegal website;
a calculation unit for calculating the similarity between the style fingerprint of the illegal website extracted by the extraction unit and the pre-marked template;
a determining unit for determining, when the similarity calculated by the calculation unit is less than or equal to the predetermined threshold, that the active detection software developed for the pre-marked template can be used to perform simulated registration, login and transaction channel detection on the illegal website.
More preferably, the similarity calculated by the calculation unit is the Hamming distance between the SimHash value of the web page source code of the illegal website and the SimHash value of a pre-marked template; that is, the style fingerprint extracted by the extraction unit is the SimHash value of the web page source code of the illegal website.
More preferably, the predetermined threshold is an empirical value, preferably 15.
Preferably, the distributed task distribution module supports distributed task distribution processing, adopts a browser Docker cluster, and uses Selenium Grid to implement page rendering and simulated operation; the Selenium Hub is called uniformly to distribute tasks to at least one Node proxy node registered on the Selenium Hub, and the Node proxy nodes request the illegal websites, complete the simulated registration, login and transaction actions, and receive the transaction orders returned by the illegal websites.
Preferably, the database further comprises a monitoring log information table for storing the monitoring log information generated when simulated registration, login and transaction channel detection are performed on the illegal websites.
Compared with the prior art, the technical scheme of the application has the following beneficial effects:
The application provides an illegal transaction active detection method and system. The system takes the website address of an illegal transaction platform as input and outputs the characteristic information of the platform and its illegal transaction order information. By actively detecting illegal websites, illegal transaction risks can be effectively identified and flagged early in daily monitoring, illegal and criminal platforms can be warned of rapidly, discovered as early as possible and handled in time, losses are prevented from growing further, and the overall level of financial risk prevention and control is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 is a block diagram of an active detection system for illegal transactions of the preferred embodiment;
FIG. 2 is a schematic diagram of a website information list of illegal websites screened out by the illegal transaction platform positioning module according to the preferred embodiment;
FIG. 3 is a schematic flow chart of the SimHash algorithm;
FIG. 4 is an architectural diagram of a distributed task distribution process;
FIG. 5 is a schematic diagram of a Selenium Grid distributed task node;
FIG. 6 is a schematic diagram of the page shown when an illegal website blocks an IP address in the preferred embodiment;
FIG. 7 is a flow chart of the illegal transaction active detection system.
Detailed Description
In order to make the objects, technical solutions and effects of the present application clearer and more obvious, the present application will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It is noted that the terms "first," "second," and the like in the description and claims of the present application and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order, and it is to be understood that the data so used may be interchanged where appropriate. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Fig. 1 is a block diagram of an active detection system for illegal transactions. As shown in fig. 1, the illegal transaction active detection system comprises a database 1, an illegal transaction platform positioning module 2, a marking template module 3, a distributed task distribution module 4, a text analysis module 5 and an image recognition analysis module 6.
1. Database
The database 1 stores a website basic data table 101, an order monitoring result table 102, and a simulated registration login information table 103.
The website basic data table 101 is used for storing the website information of illegal websites screened out from manually entered website URLs and keywords or from search engine query results. The website information stored in the website basic data table 101 includes at least one of the following: website name, website URL, website availability (whether it can be opened), text information of the website, website snapshot picture URL, website marking template code, website creation time, website update time, as shown in FIG. 2.
The order monitoring result table 102 is used for storing the relevant information of transaction orders obtained by actively detecting illegal websites, for example the order number, order screenshot, transaction bank, transaction time, website URL, transaction amount, payee account information, and the like.
The simulated registration login information table 103 is used for storing the website URLs and registered user information (virtual data) corresponding to simulated registration and login on illegal websites, so that this information can be read and reused the next time the website is accessed.
Preferably, the database 1 further stores a monitoring log information table 104 for storing the monitoring log information generated when simulated registration, login and transaction channel detection are performed on the illegal websites.
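The four tables described above could be laid out as in the following minimal sketch, which uses Python's built-in sqlite3 module. The table names, column names and column types are illustrative assumptions derived from the field lists in the text; the application does not specify a database engine or a schema, and the monitoring log columns in particular are guesses.

```python
import sqlite3

# Hypothetical table and column names; the text only lists the fields.
SCHEMA = """
CREATE TABLE website_base (
    id INTEGER PRIMARY KEY,
    name TEXT,
    url TEXT NOT NULL,
    is_available INTEGER,      -- whether the site can be opened
    text_info TEXT,
    snapshot_url TEXT,
    template_code TEXT,
    created_at TEXT,
    updated_at TEXT
);
CREATE TABLE order_monitoring_result (
    id INTEGER PRIMARY KEY,
    order_number TEXT,
    order_screenshot TEXT,
    bank TEXT,
    transaction_time TEXT,
    website_url TEXT,
    amount REAL,
    payee_account TEXT
);
CREATE TABLE simulated_registration_login (
    id INTEGER PRIMARY KEY,
    website_url TEXT NOT NULL,
    user_info TEXT             -- virtual (fabricated) registration data
);
CREATE TABLE monitoring_log (
    id INTEGER PRIMARY KEY,
    website_url TEXT,
    message TEXT,
    logged_at TEXT
);
"""


def create_database(path=":memory:"):
    """Create the four tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


# Usage: record one screened website together with its template code.
conn = create_database()
conn.execute(
    "INSERT INTO website_base (name, url, is_available, template_code) "
    "VALUES (?, ?, ?, ?)",
    ("example site", "http://example.invalid", 1, "A"),
)
rows = conn.execute("SELECT url, template_code FROM website_base").fetchall()
```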
2. Illegal transaction platform positioning module
The illegal transaction platform positioning module 2 is used for screening out illegal websites from manually entered website URLs and keywords or from search engine query results.
Taking the search for and location of a certain type of illegal website as an example, there are two main approaches: manual entry and search engine query.
Manual entry: the URL and keywords of the illegal website are entered manually. The keywords are preset manually.
Search engine query: illegal websites have certain characteristics, for example their URLs change easily, multiple domain names resolve to the same website, and the accessibility of the website itself is unstable. The manpower available to supply suspected websites is generally limited, so the illegal transaction platform positioning module is needed to discover suspected websites automatically and monitor them in real time, that is, to query through a search engine. Search engine queries require keywords, so an administrator of the system configures the keywords and their variants. The search engine is then queried with these keywords to obtain the URLs of suspected illegal websites, a keyword audit is performed on the source code content of each web page, and a website whose URL passes the audit is treated as an illegal website and entered into the website basic data table 101 of the database 1.
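The keyword audit step can be sketched as follows. This is only an illustration under stated assumptions: the function names, the case-insensitive substring match and the min_hits parameter are not from the application, which does not say how the audit decides. The sketch assumes the page source has already been downloaded, and the keywords and URLs are placeholders.

```python
def audit_page_source(page_source, keywords):
    """Return the configured keywords (or variants) found in the page source."""
    lowered = page_source.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


def screen_candidates(candidates, keywords, min_hits=1):
    """Keep the URLs whose page source contains at least min_hits keywords.

    candidates maps a URL to its already-downloaded page source.
    """
    flagged = []
    for url, source in candidates.items():
        hits = audit_page_source(source, keywords)
        if len(hits) >= min_hits:
            flagged.append((url, hits))
    return flagged


# Placeholder keywords and pages, for illustration only.
keywords = ["keyword-a", "keyword_a"]
candidates = {
    "http://suspect.invalid": "<html><title>KEYWORD-A portal</title></html>",
    "http://harmless.invalid": "<html><title>weather</title></html>",
}
flagged = screen_candidates(candidates, keywords)
```

Only the first placeholder URL is flagged here, because its source contains one of the configured keywords.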
3. Marking template module
The number of illegal transaction platforms is huge, possibly hundreds or thousands, and the pace at which the system can be developed and maintained lags far behind the rate at which illegal websites appear and change, which increases the burden on the developers of the system. To resolve this contradiction, a distributed network system based on marking templates needs to be designed, one that can automatically select an action instance according to the web page information of the illegal transaction platform being accessed in order to complete the monitoring task.
The marking template module 3 is used for calculating the similarity between the text information of an illegal website and pre-marked templates, wherein the pre-marked templates are different preset models generated by classifying historically recorded illegal websites, and each template is marked with a template code; if the calculated similarity is greater than a predetermined threshold, the illegal website is classified, marked with a new template code, and stored in the website basic data table 101 as a newly added template.
Common algorithms for calculating text similarity include the SimHash algorithm, machine learning clustering algorithms, reverse construction of XPath from the DOM tree, and an improved K-means clustering method based on SimHash. Of course, the method for classifying websites by calculating similarity is not limited to these, and any algorithm that can classify websites by calculating similarity falls within the protection scope of the present application.
The similarity calculation is described below, taking the SimHash algorithm as an example.
SimHash is the most commonly used hash method for web page deduplication. It is fast, and it compares the similarity between documents according to the Hamming distance. The SimHash algorithm flow is shown in FIG. 3, and the algorithm proceeds as follows:
Keywords are extracted from the document Doc (including word segmentation and weight calculation), yielding n (keyword, weight) pairs, which are the (feature, weight) pairs in the figure. They are denoted feature_weight_pairs = [fw1, fw2, ..., fwn], where fwn = (feature_n, weight_n) and n is a natural number greater than 1.
hash_weight_pairs = [(hash(feature), weight) for feature, weight in feature_weight_pairs] generates the (hash, weight) pairs in the figure; the number of bits produced by the hash is assumed to be bits_count = 6 (see FIG. 3).
The hash_weight_pairs are then accumulated bit by bit: for each bit position, the weight is added if the bit is 1 and subtracted if the bit is 0. This finally produces bits_count numbers, [13, 108, -22, -5, -32, 55] in the figure; the values produced depend on the algorithm used by the hash function.
Positive numbers are represented by 1 and negative numbers by 0, so [13, 108, -22, -5, -32, 55] is converted into the binary string 110001, which is the SimHash value of the document Doc.
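The steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the application: the choice of MD5 as the hash function, the default of 64 bits and all function names are assumptions, and keyword extraction and weighting are taken as already done. A total of exactly zero is mapped to 0, a case the text does not cover.

```python
import hashlib


def hash_bits(feature, bits_count):
    """Hash a feature to an integer of bits_count bits (MD5 is an assumption)."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & ((1 << bits_count) - 1)


def accumulate(feature_weight_pairs, bits_count):
    """Add the weight where a hash bit is 1 and subtract it where the bit is 0."""
    totals = [0] * bits_count
    for feature, weight in feature_weight_pairs:
        h = hash_bits(feature, bits_count)
        for i in range(bits_count):
            bit = (h >> (bits_count - 1 - i)) & 1  # i = 0 is the leftmost bit
            totals[i] += weight if bit else -weight
    return totals


def fingerprint_from_totals(totals):
    """Map positive totals to 1 and all other totals to 0."""
    return "".join("1" if t > 0 else "0" for t in totals)


def simhash(feature_weight_pairs, bits_count=64):
    """Return the SimHash value of a weighted feature list as an integer."""
    totals = accumulate(feature_weight_pairs, bits_count)
    return int(fingerprint_from_totals(totals), 2)


# The per-bit totals from FIG. 3 give the binary string quoted in the text.
example = fingerprint_from_totals([13, 108, -22, -5, -32, 55])  # "110001"
```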
To calculate the similarity between two documents, the SimHash value of each document is calculated, and then the Hamming distance between the two SimHash values is calculated.
For example, the SimHash value of document A is: A = 100111;
the SimHash value of document B is: B = 101010;
the Hamming distance between the two SimHash values is the number of 1 bits in the binary result of A XOR B: hamming_distance(A, B) = count_1(A XOR B) = count_1(001101) = 3.
After the SimHash values of all documents have been calculated, whether document A and document B are similar is determined by whether the Hamming distance between A and B is less than or equal to n, where n is chosen empirically.
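The worked example can be checked with a few lines of Python. The function names are illustrative, and the SimHash values are treated as non-negative integers.

```python
def hamming_distance(a, b):
    """Count the bit positions in which two SimHash values differ."""
    return bin(a ^ b).count("1")


def is_similar(a, b, n):
    """Two documents count as similar when their distance is at most n."""
    return hamming_distance(a, b) <= n


A = 0b100111
B = 0b101010
distance = hamming_distance(A, B)  # A XOR B = 0b001101, which has three 1 bits
```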
Specifically, in a preferred embodiment, multiple illegal websites use the same set of H5 front-end interfaces, so the websites can be classified by the SimHash algorithm and the corresponding action instance selected for monitoring.
Assume that the active detection code for a certain illegal website has been developed and that the website's marking template code is template A. After the SimHash value of the page source code docA of the current illegal website is calculated, whether docA and template A are similar is determined by whether the Hamming distance between the SimHash values of docA and template A is less than or equal to n, where n is typically set to 15 based on experience.
If the Hamming distance is less than or equal to 15, it is determined that the current illegal website can be monitored with the active detection code developed for template A. Marking templates and classifying the different illegal websites in this way reduces the development workload and achieves twice the result with half the effort.
In a preferred embodiment, the procedure for classifying illegal websites using SimHash algorithm is as follows:
1) Marking the SimHash values of common websites: the websites are accessed automatically via search engine results or manually entered URLs, their front-end characteristics are observed manually, websites sharing the same frequently occurring front-end characteristics are grouped into one type, an automation script is developed for that type, and the SimHash values of the websites for which scripts have been developed are recorded for comparison;
2) The SimHash value of each website is calculated by a SimHash processing program and compared with those of the marked illegal websites; if a similar website is found, it is marked with the corresponding classification label and stored in the database, and the corresponding script program is selected to detect the payment channel.
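Step 2) can be sketched as a lookup against the recorded template fingerprints. The threshold of 15 is the empirical value given in the text. The dictionary layout, the function names and the choice of the closest template when several fall within the threshold are assumptions made for illustration.

```python
THRESHOLD = 15  # empirical Hamming distance threshold given in the text


def hamming_distance(a, b):
    """Count the bit positions in which two SimHash values differ."""
    return bin(a ^ b).count("1")


def classify_site(site_simhash, templates, threshold=THRESHOLD):
    """Match a site against recorded template fingerprints.

    templates maps a template code to the SimHash value recorded for it.
    Returns (template_code, distance) for the closest template within the
    threshold, or (None, distance) when the site needs a new template.
    """
    best_code, best_distance = None, None
    for code, template_hash in templates.items():
        d = hamming_distance(site_simhash, template_hash)
        if best_distance is None or d < best_distance:
            best_code, best_distance = code, d
    if best_distance is not None and best_distance <= threshold:
        return best_code, best_distance
    return None, best_distance


# Made-up 64-bit fingerprints for two hypothetical templates.
templates = {"A": 0, "B": (1 << 64) - 1}
match = classify_site(0b111, templates)            # within 15 bits of "A"
no_match = classify_site((1 << 32) - 1, templates)  # 32 bits from both
```

A site that returns a template code would be handled by the script developed for that template; a site that returns None would be marked as a new template.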
4. Distributed task distribution module
The illegal transaction active detection system is deployed on a plurality of servers in a distributed environment, so a load balancing scheme is needed to keep task distribution balanced across the environment, improve processing efficiency, and avoid single points of failure, as shown in FIG. 4.
Distributed task queues: a distributed system is one in which multiple programs on multiple machines process multiple URLs simultaneously, which can greatly improve the efficiency of the program. A distributed task queue consists of the following components. Broker: the container that holds the message queue, typically provided by a third-party message queue such as RabbitMQ or Redis. Task: typically written as a script, acts as the producer and generates messages. Worker: the consumer, which obtains messages from the Broker and processes them.
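The producer, Broker and Worker roles can be illustrated on a single machine with Python's standard library. In this sketch an in-process queue.Queue stands in for the Broker; in the system described, that role would be played by an external message queue such as RabbitMQ or Redis. The function names and the handler are hypothetical.

```python
import queue
import threading


def run_tasks(urls, handler, worker_count=3):
    """Push URLs onto a queue (producer) and process them with workers (consumers)."""
    broker = queue.Queue()  # stand-in for an external message broker
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            url = broker.get()
            try:
                if url is None:  # sentinel: no more work for this worker
                    return
                try:
                    outcome = handler(url)
                except Exception as exc:  # keep the worker alive on failures
                    outcome = exc
                with lock:
                    results.append((url, outcome))
            finally:
                broker.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for url in urls:  # the producer side: one message per URL
        broker.put(url)
    for _ in threads:  # one sentinel per worker
        broker.put(None)
    broker.join()
    for t in threads:
        t.join()
    return results


# Placeholder URLs; len() stands in for the real detection routine.
urls = ["http://a.invalid", "http://bb.invalid", "http://ccc.invalid"]
results = sorted(run_tasks(urls, handler=len))
```

The order in which workers finish is not deterministic, which is why the usage example sorts the results.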
Distributed task nodes: taking a certain type of illegal website as an example, because monitoring illegal websites involves a plurality of tasks, which are classified by marking templates, the distributed task distribution module preferably adopts a browser Docker cluster and uses Selenium Grid to implement page rendering and simulated operation. The monitoring system calls the Selenium Hub uniformly; a plurality of Node proxy nodes are registered on the Selenium Hub, and a dispatch mechanism is established between the Selenium Hub and the Node proxy nodes. The Selenium Hub distributes the tasks, and the Node proxy nodes request the websites, complete the simulated registration, login and transaction actions, and return the source code of the corresponding web pages, which is then passed to the text analysis module and the image recognition analysis module for processing, as shown in FIG. 5.
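From the caller's side, handing a page-rendering task to a Selenium Hub might look like the sketch below. It assumes the third-party selenium package (version 4) and a Grid that is already running with Chrome nodes registered. The host name selenium-hub is a hypothetical placeholder, 4444 is the default Grid port, and the registration, login and transaction scripts themselves are not shown.

```python
def build_hub_url(host, port=4444):
    """Build the WebDriver endpoint of a Selenium Hub (host is a placeholder)."""
    return f"http://{host}:{port}/wd/hub"


def fetch_page_source(hub_url, page_url):
    """Ask the Hub for a browser session on some Node and return the rendered page."""
    # Imported lazily so that the helper above works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    driver = webdriver.Remote(command_executor=hub_url, options=options)
    try:
        driver.get(page_url)
        return driver.page_source
    finally:
        driver.quit()  # release the Node for the next task


hub_url = build_hub_url("selenium-hub")
# fetch_page_source(hub_url, "http://example.invalid") needs a running Grid.
```

The Hub decides which registered Node serves each session, so the caller only needs the Hub address.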
In addition, in actual operation the system's IP address is frequently blocked. The illegal transaction active detection system therefore needs a large pool of IP addresses so that it can keep switching its own IP address and continue monitoring normally.
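One simple way to keep switching addresses is to rotate through a pool and skip entries that have been blocked. The sketch below is an assumption about how such a pool might look; the class name, the addresses and the blocking signal are hypothetical and are not taken from the application.

```python
class ProxyPool:
    """Rotate through proxy addresses, skipping ones marked as blocked."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        self._blocked = set()
        self._index = 0

    def mark_blocked(self, proxy):
        """Record that a target site has started rejecting this address."""
        self._blocked.add(proxy)

    def next_proxy(self):
        """Return the next usable proxy, or raise if every proxy is blocked."""
        for _ in range(len(self._proxies)):
            proxy = self._proxies[self._index]
            self._index = (self._index + 1) % len(self._proxies)
            if proxy not in self._blocked:
                return proxy
        raise RuntimeError("all proxies are blocked")
```

Each detection task would ask the pool for an address before requesting a site and call `mark_blocked` when the site refuses the connection.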
5. Text analysis module and image recognition analysis module
The text analysis module 5 is configured to identify and extract text information in the transaction order received by the distributed task distribution module 4, and store the text information in the order monitoring result table 101.
The image recognition analysis module 6 is configured to recognize and extract image information in the transaction order received by the distributed task distribution module 4, and store the image information in the order monitoring result table 101.
The transaction order information extracted by the text analysis module 5 and the image recognition analysis module 6 may include the order number, order screenshot, bank of the transaction, transaction time, website URL, transaction amount, payee account information, and so on. This information can serve as the basis for judging whether a transaction is legal or illegal, so that illegal transaction orders on illegal websites are discovered promptly and effectively. The results are shown promptly on the back-end interactive interface and reported to the risk control business department for subsequent processing. If necessary, early warning measures can be applied automatically according to early warning rules, which cover data such as the time, place and website of the warning, its frequency of occurrence, and the amount involved.
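As an illustration of how the extracted fields and an early warning rule might fit together, the sketch below defines a record with the fields listed above and one hypothetical rule based on order frequency and total amount per website. The field names, the rule shape and the thresholds are assumptions; the application does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class OrderRecord:
    """One row of the order monitoring result table (field names assumed)."""
    order_number: str
    bank: str
    transaction_time: datetime
    website_url: str
    amount: float
    payee_account: str
    screenshot_path: str = ""


@dataclass
class WarningRule:
    """Hypothetical rule: warn when a site exceeds an order count or total amount."""
    max_orders: int
    max_total_amount: float

    def triggered_sites(self, orders):
        totals = {}
        for order in orders:
            count, total = totals.get(order.website_url, (0, 0.0))
            totals[order.website_url] = (count + 1, total + order.amount)
        # Return the sites that break either limit, in a stable order.
        return sorted(
            url for url, (count, total) in totals.items()
            if count > self.max_orders or total > self.max_total_amount
        )
```

A rule of this kind could be evaluated each time new rows are written to the order monitoring result table, with the triggered sites forwarded to the risk control business department.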
Fig. 7 is a flow chart of an illegal transaction active detection system.
As shown in fig. 7, the main flow of the illegal transaction active detection system of the present application is:
the illegal transaction platform positioning module screens out illegal websites, either from manually entered website URLs and keywords or from search engine query results, and stores the website information of the illegal websites in the website basic data table of the database;
the marking template module calculates the similarity between the text information of each illegal website stored in the website basic data table and the pre-marked templates, where the pre-marked templates are different preset models generated by classifying the illegal websites in the historical records, and each template is marked with a template code number;
if the calculated similarity is larger than a preset threshold, the illegal website is classified and marked with a new template code number, and is stored as a newly added template in the website basic data table of the database;
if the calculated similarity is smaller than or equal to the preset threshold, the active detection software developed for the pre-marked template is run to perform simulated registration, login and transaction channel detection on the illegal website; the monitoring log information generated during active detection is stored in the monitoring log information table of the database; and the website URL and the registered user information (virtual data) used for the simulated registration and login are stored in the simulated registration login information table of the database for backup;
the transaction orders returned by the illegal website during active detection are extracted by the text analysis module and/or the image recognition analysis module and stored in the order monitoring result table of the database, where they serve as the basis for judging whether the transaction behavior is legal or illegal. For abnormal transactions and high-risk merchants found during monitoring, measures such as accountability improvement, risk level adjustment, transaction limits, settlement closure, and reporting to regulatory authorities can be taken.
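The threshold decision in this flow can be sketched as a small routing function. Claim 4 defines the similarity as a Hamming distance between SimHash values, so a larger value means the site is less like the template; that is why a value above the threshold leads to a new template. Taking the minimum distance over all templates, the function name and the return values are assumptions made for the example.

```python
def route_website(distance_to_templates: dict, threshold: int):
    """Decide what to do with a site given its distance to each marked template.

    `distance_to_templates` maps a template code to a Hamming distance.
    Returns ("new_template", None) or ("probe", template_code).
    """
    if not distance_to_templates:
        # No templates exist yet, so the site starts a new one.
        return ("new_template", None)
    code, distance = min(distance_to_templates.items(), key=lambda kv: kv[1])
    if distance > threshold:
        # No marked template is close enough: classify and mark a new template.
        return ("new_template", None)
    # Close enough to an existing template: reuse its detection script.
    return ("probe", code)
```

A `"probe"` result corresponds to running the active detection software for the matched template, and a `"new_template"` result corresponds to storing the site as a newly added template.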
In summary, the application provides an illegal transaction active detection method and system. The system takes the website address of an illegal transaction platform as input and outputs the characteristic information of that platform and its illegal transaction order information. By actively detecting illegal websites, it can effectively identify and warn of transaction risks in daily monitoring, enabling rapid early warning, early discovery and timely disposal of illegal and criminal platforms, preventing losses from growing further, and improving the overall level of financial risk prevention and control.
The above description of the specific embodiments of the present application has been given by way of example only, and the present application is not limited to the above described specific embodiments. Any equivalent modifications and substitutions for the present application will occur to those skilled in the art, and are also within the scope of the present application. Accordingly, equivalent changes and modifications are intended to be included within the scope of the present application without departing from the spirit and scope thereof.

Claims (9)

1. An active detection method for illegal transactions, which is characterized by comprising the following steps:
screening out illegal websites by manually entering the URLs and keywords of websites or from search engine query results, and storing the website information of the illegal websites in a database;
calculating the similarity between the text information of the illegal website and a pre-marked template, wherein the pre-marked templates are different preset models generated by classifying the illegal websites in the historical records, and each template is marked with a template code number;
if the calculated similarity is larger than a preset threshold, classifying the illegal website, marking it with a new template code number, and storing it as a newly added template in the database;
if the calculated similarity is smaller than or equal to the preset threshold, using the active detection software developed for the pre-marked template to perform simulated registration, login and transaction channel detection on the illegal website, extracting the related information of the transaction order returned by the illegal website through text analysis and mining and/or image recognition analysis, and storing it in the database as the basis for judging whether the transaction behavior is legal or illegal;
wherein detecting the illegal website using the active detection software developed for the pre-marked template comprises:
supporting distributed task distribution processing, adopting a browser Docker cluster, using Selenium Grid to implement page rendering and simulated operation, and calling the Selenium Hub in a unified way to distribute tasks to at least one Node proxy node registered on the Selenium Hub, the Node proxy nodes requesting the illegal website, completing the simulated registration, login and transaction actions, and receiving the transaction order returned by the illegal website.
2. The method for actively detecting illegal transactions according to claim 1, wherein: after the simulated registration on the illegal website, the corresponding website URL and registered user information are stored in a database for backup.
3. The method for actively detecting illegal transactions according to claim 1, wherein: website information of illegal websites screened out by manually entering URLs and keywords of websites or by search engine query results comprises at least one of the following: website name, website URL, website validity, text information of website, website snapshot picture URL, website mark template code number, website creation time, website update time.
4. The method for actively detecting illegal transactions according to claim 1, wherein: the similarity is a Hamming distance between the SimHash value of the webpage source code of the illegal website and the SimHash value of the pre-marked template.
5. The method of claim 1, wherein the related information of the transaction order includes at least one of: order number, order screenshot, bank of transaction, transaction time, website URL, transaction amount, payee account information.
6. The method for actively detecting illegal transactions according to claim 1, wherein: monitoring log information is generated when the simulated registration, login and transaction channel detection are performed on the illegal website, and the monitoring log information is stored in a database.
7. An active detection system for illegal transactions, comprising:
a database storing a website basic data table, an order monitoring result table, and a simulated registration login information table; wherein,
the website basic data table is used for storing the website information of illegal websites screened out by manually entering the URLs and keywords of websites or from search engine query results;
the order monitoring result table is used for storing the related information of the transaction orders obtained by actively detecting the illegal websites;
the simulated registration login information table is used for storing the website URLs and registered user information corresponding to the simulated registration and login on the illegal websites;
the illegal transaction platform positioning module is used for screening out illegal websites by manually entering the URLs and keywords of websites or from search engine query results;
the marking template module is used for calculating the similarity between the text information of the illegal website and a pre-marked template, wherein the pre-marked templates are different preset models generated by classifying the illegal websites in the historical records, and each template is marked with a template code number; if the calculated similarity is larger than a preset threshold, the illegal website is classified, marked with a new template code number, and stored as a newly added template in the website basic data table;
the distributed task distribution module is used for performing, with the active detection software developed for the pre-marked template, simulated registration, login and transaction channel detection on the illegal websites whose similarity calculated by the marking template module is smaller than or equal to the preset threshold, and for receiving the transaction orders returned by the illegal websites;
the text analysis module is used for performing text analysis and mining on the text information in the transaction orders received by the distributed task distribution module and storing the text information in the order monitoring result table;
the image recognition analysis module is used for performing image recognition analysis on the image information in the transaction orders received by the distributed task distribution module and storing the image information in the order monitoring result table.
8. The system of claim 7, wherein the marking template module comprises:
an extraction unit for extracting the pattern fingerprint of the illegal website;
a calculation unit for calculating the similarity between the pattern fingerprint of the illegal website extracted by the extraction unit and the pre-marked template;
and a determining unit for determining, when the similarity calculated by the calculation unit is smaller than or equal to a preset threshold, that active detection software developed for a pre-marked template can be used to perform simulated registration, login and transaction channel detection on the illegal website.
9. An illegal transaction active detection system according to claim 7, characterized in that: the database further comprises a monitoring log information table for storing the monitoring log information generated when simulated registration, login and transaction channel detection are performed on the illegal website.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010776643.0A CN112199573B (en) 2020-08-05 2020-08-05 Illegal transaction active detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010776643.0A CN112199573B (en) 2020-08-05 2020-08-05 Illegal transaction active detection method and system

Publications (2)

Publication Number Publication Date
CN112199573A CN112199573A (en) 2021-01-08
CN112199573B true CN112199573B (en) 2023-12-08

Family

ID=74006145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010776643.0A Active CN112199573B (en) 2020-08-05 2020-08-05 Illegal transaction active detection method and system

Country Status (1)

Country Link
CN (1) CN112199573B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966263A (en) * 2021-02-25 2021-06-15 中国银联股份有限公司 Target information acquisition method and device and computer readable storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996041488A1 (en) * 1995-06-07 1996-12-19 The Dice Company Fraud detection system for electronic networks using geographical location coordinates
CN101383820A (en) * 2008-07-07 2009-03-11 上海安融信息系统有限公司 Design and implementing method for SSL connection and data monitoring
KR20090090641A (en) * 2008-02-21 2009-08-26 주식회사 조은시큐리티 System for active security surveillance
CN103685575A (en) * 2014-01-06 2014-03-26 洪高颖 Website security monitoring method based on cloud architecture
CN106302438A (en) * 2016-08-11 2017-01-04 国家计算机网络与信息安全管理中心 A kind of method of actively monitoring fishing website of Behavior-based control feature by all kinds of means
CN107733969A (en) * 2017-07-25 2018-02-23 上海壹账通金融科技有限公司 Website simulation login method, device, service end and readable storage medium storing program for executing
CN107786537A (en) * 2017-09-19 2018-03-09 杭州安恒信息技术有限公司 A kind of lonely page implantation attack detection method based on internet intersection search
US10108968B1 (en) * 2014-03-05 2018-10-23 Plentyoffish Media Ulc Apparatus, method and article to facilitate automatic detection and removal of fraudulent advertising accounts in a network environment
CN110020075A (en) * 2017-10-20 2019-07-16 南京烽火软件科技有限公司 Device is excavated in illegal website automatically
CN110119469A (en) * 2019-05-22 2019-08-13 北京计算机技术及应用研究所 A kind of data collection and transmission and method towards darknet
CN110413908A (en) * 2018-04-26 2019-11-05 维布络有限公司 The method and apparatus classified based on web site contents to uniform resource locator

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
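Illegal website identification method based on URL feature detection (基于URL特征检测的违法网站识别方法); 凡友荣; 杨涛; 王永剑; 姜国庆; Computer Engineering (《计算机工程》), No. 3; 176-182 *
Design and implementation of a counterfeit website detection system based on active detection (基于主动探测的仿冒网站检测系统设计与实现); 魏玉良; China Excellent Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库信息科技辑》), No. 2; I138-1244 *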

Also Published As

Publication number Publication date
CN112199573A (en) 2021-01-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant