CN112199573A - Active detection method and system for illegal transaction - Google Patents

Active detection method and system for illegal transaction

Publication number
CN112199573A
Authority
CN
China
Prior art keywords
illegal
website
transaction
template
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010776643.0A
Other languages
Chinese (zh)
Other versions
CN112199573B (en)
Inventor
卢子航
王峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baofu Network Technology Shanghai Co ltd
Original Assignee
Baofu Network Technology Shanghai Co ltd
Application filed by Baofu Network Technology Shanghai Co ltd
Priority to CN202010776643.0A
Publication of CN112199573A
Application granted
Publication of CN112199573B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/955Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/547Remote procedure calls [RPC]; Web services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network

Abstract

The application provides an active detection method and system for illegal transactions. The method comprises: screening out illegal websites; matching templates by calculating the similarity between the text information of each illegal website and pre-marked templates; running the program script of the matched template to perform simulated registration, login and transaction channel detection on the illegal website; and extracting the relevant information of the transaction orders returned by the illegal website through text analysis mining and/or image recognition analysis, as a basis for judging whether the transaction behavior is legal or illegal. The application can effectively identify and give early warning of transaction risks in daily monitoring, enables rapid early warning of illegal and criminal platforms so that they are discovered early and handled promptly, prevents losses from growing further, and raises the overall level of financial risk prevention and control.

Description

Active detection method and system for illegal transaction
Technical Field
The invention relates to the field of financial risk control, in particular to an illegal transaction active detection method and system.
Background
At present, among common payment risks, payment channels are often exploited by illegal websites and applications, which trick users into carrying out illegal transactions in order to obtain illicit profits, causing economic losses to the users.
Therefore, how to actively detect the illegal transaction platform, effectively identify and early warn the transaction risk in daily monitoring, realize early identification of risk, early warning, early disposal and improve the overall prevention and control level of financial risk is a problem to be solved urgently.
Disclosure of Invention
The present invention is directed to provide an active illegal transaction detection method and system, so as to solve the problems set forth in the above background.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, the present application provides an active detection method for illegal transactions, including:
screening out illegal websites through manually entered website URLs (uniform resource locators) and keywords or through the results of search engine queries, and storing the website information of the illegal websites in a database;
calculating the similarity between the text information of the illegal websites and pre-marked templates, wherein the pre-marked templates are different preset models generated by classifying historical illegal websites, and each template is marked with a template code;
if the calculated similarity is larger than a preset threshold value, classifying the illegal website, marking a new template code number, and storing the template code number as a newly added template in a database;
if the calculated similarity is smaller than or equal to a preset threshold value, using active detection software developed by a pre-marked template to perform simulated registration, login and detection of a transaction channel on the illegal website, extracting relevant information of a transaction order returned by the illegal website through text analysis mining and/or image recognition analysis, and storing the relevant information in a database as a basis for judging whether the transaction behavior is legal or illegal.
Preferably, screening out illegitimate websites through the results of search engine queries includes:
querying a search engine with the keywords to obtain the URLs of the corresponding suspected websites;
performing a keyword review of the source code content of each web page;
regarding a website that passes the review as an illegal website and recording it in the database.
Preferably, the illegal transaction active detection method further comprises: after simulated registration on the illegal website, storing the corresponding website URL and the registered user information (virtual data) in the database for backup.
Preferably, the website information of the illegal website screened out by manually entering the URL, the keyword of the website, or the result of the search engine query includes at least one of the following: website name, website URL, website validity (whether the website can be opened or not), text information of the website, website snapshot picture URL, website mark template code number, website creation time and website update time.
Preferably, the similarity is a hamming distance between a SimHash value of a web page source code of the illegal website and a SimHash value of a pre-marked template.
Preferably, the predetermined threshold is an empirical value, preferably 15.
Preferably, the information relating to the trade order comprises at least one of: order number, order screenshot, bank of transaction, transaction time, website URL, transaction amount, payee account information.
Preferably, the illegal transaction active detection method further comprises: supporting distributed task distribution processing, in which a browser Docker cluster is adopted and Selenium Grid is used to realize page rendering and simulated operation; the Selenium Hub is called uniformly to distribute tasks to at least one Node agent node registered on the Selenium Hub, and the Node agent nodes send requests to the illegal website to complete the simulated registration, login and transaction actions and receive the transaction order returned by the illegal website.
Preferably, the illegal transaction active detection method further comprises: dynamically configuring the IP address of the simulated user that performs simulated registration on the illegal website.
Preferably, when simulated registration, login and transaction channel detection are performed on the illegal website, the illegal transaction active detection method further comprises: generating monitoring log information and storing the monitoring log information in the database.
A second aspect of the present application provides an active illegal transaction detection system, comprising:
a database storing a website basic data table, an order monitoring result table, and a simulation registration login information table; wherein,
the website basic data table is used for storing website information of illegal websites screened out by manually inputting URLs and key words of the websites or by results of search engine query;
the order monitoring result table is used for storing the related information of the transaction order obtained by actively detecting the illegal website;
the simulation registration login information table is used for storing the website URLs and the registered user information (virtual data) used during simulated registration and login on illegal websites;
the illegal trading platform positioning module is used for screening out illegal websites through manually entered website URLs and keywords or through the results of search engine queries;
the template marking module is used for calculating the similarity between the text information of the illegal website and a template marked in advance, wherein the template marked in advance is different preset models generated by classifying the illegal websites recorded in history, and each template is marked with a template code; if the calculated similarity is larger than a preset threshold value, classifying the illegal website, marking a new template code number, and storing the template code number as a newly added template into a website basic data table;
the distributed task distribution module is used for performing simulated registration, login and transaction channel detection on the illegal website with the similarity calculated in the marking template module being less than or equal to a preset threshold value by using active detection software developed by a pre-marked template, and receiving a transaction order returned by the illegal website;
the text analysis module is used for performing text analysis mining on the text information in the transaction order received by the distributed task distribution module and storing the text information into the order monitoring result table;
the image recognition analysis module is used for carrying out image recognition analysis on the image information in the transaction order received by the distributed task distribution module and storing the image information in the order monitoring result table.
Preferably, the website information stored in the website basic data table includes at least one of: website name, website URL, website validity (whether the website can be opened or not), text information of the website, website snapshot picture URL, website mark template code number, website creation time and website update time.
Preferably, the information related to the trade orders stored in the order monitoring result table includes at least one of the following: order number, order screenshot, bank of transaction, transaction time, website URL, transaction amount, payee account information.
Preferably, the marking template module includes:
an extraction unit for extracting pattern fingerprints in an illegal website;
the calculating unit is used for calculating the similarity between the pattern fingerprint in the illegal website extracted by the extracting unit and the template marked in advance;
and the determining unit is used for determining that the illegal website can use active detection software developed by a pre-marked template to perform simulated registration, login and detection of a transaction channel on the illegal website when the similarity calculated by the calculating unit is less than or equal to a preset threshold.
More preferably, the similarity calculated by the calculating unit is a hamming distance between a SimHash value of the web page source code of the illegal website and a SimHash value of a pre-marked template, that is, the extracting unit extracts the pattern fingerprint in the illegal website as the SimHash value of the web page source code of the illegal website.
More preferably, the predetermined threshold is an empirical value, preferably 15.
Preferably, the distributed task distribution module supports distributed task distribution processing: a browser Docker cluster is adopted and Selenium Grid is used to realize page rendering and simulated operation; the Selenium Hub is called uniformly to distribute tasks to at least one Node agent node registered on the Selenium Hub, and the Node agent nodes send requests to the illegal website to complete the simulated registration, login and transaction actions and receive the transaction order returned by the illegal website.
Preferably, the database further includes a monitoring log information table for storing monitoring log information generated when the illegal website is subjected to simulated registration, login, and detection of a transaction channel.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
the system takes the website address of an illegal transaction platform as input and outputs the characteristic information and the illegal transaction order information of that platform. Through active detection of illegal websites, it can effectively identify and give early warning of transaction risks in daily monitoring, enables rapid early warning of illegal and criminal platforms so that illegal transaction risks are discovered early and handled promptly, prevents losses from growing further, and raises the overall level of financial risk prevention and control.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a block diagram of an illegal transaction active detection system of the preferred embodiment;
FIG. 2 is a schematic diagram of a website information list of illegal websites screened by the illegal transaction platform positioning module according to the preferred embodiment;
FIG. 3 is a schematic diagram of the SimHash algorithm flow;
FIG. 4 is an architectural diagram of a distributed task distribution process;
FIG. 5 is a schematic diagram of a Selenium Grid distributed task node;
FIG. 6 is a schematic diagram of a page of the illegal website block IP of the preferred embodiment;
FIG. 7 is a schematic flow diagram of an illegal transaction active probing system.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer and clearer, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order, it being understood that the data so used may be interchanged under appropriate circumstances. Furthermore, the terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Fig. 1 is a block diagram of an illegal transaction active detection system. As shown in fig. 1, the illegal transaction active detection system includes a database 1, an illegal transaction platform positioning module 2, a marking template module 3, a distributed task distribution module 4, a text analysis module 5, and an image recognition analysis module 6.
1. Database
The database 1 stores a website basic data table 101, an order monitoring result table 102, and a simulation registration login information table 103.
The website basic data table 101 is used for storing website information of illegal websites screened out by manually inputting URLs and keywords of the websites or by results of search engine queries. The website information stored in the website basic data table 101 includes at least one of the following: website name, website URL, website validity (whether it can be opened), text information of website, website snapshot picture URL, website markup template code number, website creation time, and website update time, as shown in fig. 2.
The order monitoring result table 102 is used for storing relevant information of the transaction order obtained by actively detecting the illegal website, such as an order number, an order screenshot, a transaction bank, transaction time, a website URL, a transaction amount, payee account information, and the like.
The simulated registration login information table 103 is used to store the website URLs and the registered user information (virtual data) used for simulated registration and login on illegal websites, so that they can be read and reused the next time.
Preferably, the database 1 further stores a monitoring log information table 104, which is used for storing monitoring log information generated when simulated registration, login and transaction channel detection are performed on the illegal website.
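The four tables described above can be sketched as a relational schema. The following is only an illustrative sketch using Python's built-in sqlite3 module; the table names, column names and column types are assumptions derived from the fields listed in the text, not a schema given in the application.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions
# based on the fields the description lists for tables 101 to 104.
SCHEMA = """
CREATE TABLE website_basic (              -- website basic data table 101
    id            INTEGER PRIMARY KEY,
    name          TEXT,
    url           TEXT NOT NULL UNIQUE,
    is_valid      INTEGER,                -- 1 if the website can be opened
    text_info     TEXT,
    snapshot_url  TEXT,
    template_code TEXT,
    created_at    TEXT,
    updated_at    TEXT
);
CREATE TABLE order_monitoring_result (    -- order monitoring result table 102
    id               INTEGER PRIMARY KEY,
    order_number     TEXT,
    screenshot_path  TEXT,
    bank             TEXT,
    transaction_time TEXT,
    website_url      TEXT,
    amount           REAL,
    payee_account    TEXT
);
CREATE TABLE simulated_login (            -- simulated registration login table 103
    id          INTEGER PRIMARY KEY,
    website_url TEXT,
    username    TEXT,                     -- virtual data only
    password    TEXT
);
CREATE TABLE monitoring_log (             -- monitoring log information table 104
    id          INTEGER PRIMARY KEY,
    website_url TEXT,
    logged_at   TEXT,
    message     TEXT
);
"""


def create_database(path=":memory:"):
    """Create the four tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
conn.execute(
    "INSERT INTO website_basic (name, url, is_valid) VALUES (?, ?, ?)",
    ("example site", "http://suspect.example", 1),
)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
)
```

An in-memory database is used here only to keep the sketch self-contained; a production deployment would presumably use a shared database server.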
2. Illegal trading platform positioning module
The illegal trading platform positioning module 2 is used for screening out illegal websites through manually entered website URLs and keywords or through the results of search engine queries.
Taking the search and the location of a certain kind of illegal websites as an example, the method mainly includes two ways: manual entry and search engine queries.
Manual entry: illegal website URLs and keywords are entered manually. The keywords are preset manually.
Search engine query: illegal websites have some typical characteristics, such as URLs that change easily, multiple domain names that resolve to the same website, and unstable accessibility of the website itself. The human resources available for supplying suspected websites are generally limited, so the illegal transaction platform positioning module is needed to find suspected websites automatically and monitor them in real time, namely by querying a search engine. Search engine queries require keywords, so a system administrator configures the keywords and their variant spellings. The module then queries the search engine with these keywords to obtain the URLs of suspected illegal websites, performs a keyword review of the source code content of each web page, regards a website that passes the review as an illegal website, and records the website in the website basic data table 101 of the database 1.
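The keyword review step described above can be illustrated with a minimal sketch. The application does not specify the review rule, so the rule used here (at least min_hits distinct keywords or variants appear in the page text) and the names expand_keywords, passes_keyword_review and min_hits are assumptions made purely for illustration.

```python
import re


def expand_keywords(keywords, variants):
    """Return the configured keywords together with their variant spellings."""
    expanded = set(keywords)
    for keyword in keywords:
        expanded.update(variants.get(keyword, []))
    return expanded


def passes_keyword_review(page_source, keywords, min_hits=2):
    """Assumed rule: the page passes if at least min_hits distinct keywords occur."""
    # Strip tags crudely so that only the visible text is searched.
    text = re.sub(r"<[^>]+>", " ", page_source).lower()
    hits = {kw for kw in keywords if kw.lower() in text}
    return len(hits) >= min_hits


# Hypothetical configuration and pages.
configured = expand_keywords(["lottery", "recharge"], {"lottery": ["l0ttery"]})
suspect_page = "<html><body><h1>L0ttery hall</h1><a>Recharge now</a></body></html>"
benign_page = "<html><body><p>Weather report</p></body></html>"
suspect_ok = passes_keyword_review(suspect_page, configured)
benign_ok = passes_keyword_review(benign_page, configured)
```

A website whose page passes this check would then be recorded in the website basic data table.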
3. Marking template module
The number of illegal trading platforms is huge (there may be hundreds), and the speed at which the system can be developed and maintained lags far behind the speed at which new illegal websites appear, which increases the burden on the developers of the system. To resolve this contradiction, a distributed network system based on marked templates is needed, which automatically selects an action instance according to the web page information of the illegal transaction platform being accessed in order to complete the monitoring task.
The marking template module 3 is used for calculating the similarity between the text information of the illegal website and a pre-marked template, wherein the pre-marked templates are different preset models generated by classifying historical illegal websites, and each template is marked with a template code; if the calculated similarity is greater than a predetermined threshold, the illegal website is classified, marked with a new template code, and stored as a newly added template in the website basic data table 101.
Common algorithms for calculating text similarity include the SimHash algorithm, machine learning clustering algorithms, methods that reversely construct XPath from the DOM tree, SimHash-based improved K-means clustering methods, and the like. Of course, the method for classifying websites through similarity calculation mentioned in the present application is not limited thereto, and all algorithms capable of implementing website classification through similarity calculation should be covered in the protection scope of the present application.
The similarity calculation is described below taking the SimHash algorithm as an example.
SimHash is the most commonly used hash method for web page deduplication; it is fast and compares the similarity between documents according to their Hamming distance. The SimHash algorithm flow is shown in FIG. 3, and the algorithm proceeds as follows:
First, keywords are extracted from the document Doc (including word segmentation and weight calculation), yielding n (keyword, weight) pairs, shown as (feature, weight) in the figure. These are denoted feature_weight_pairs = [fw1, fw2, ..., fwn], where fwn = (feature_n, weight_n) and n is a natural number greater than 1.
Next, hash_weight_pairs = [(hash(feature), weight) for feature, weight in feature_weight_pairs] generates the (hash, weight) pairs in the figure; here the number of bits produced by the hash, bits_count, is assumed to be 6 (see FIG. 3).
Then hash_weight_pairs is accumulated bit position by bit position: a bit of 1 contributes +weight and a bit of 0 contributes -weight, finally producing bits_count numbers, shown in the figure as [13, 108, -22, -5, -32, 55]. The resulting values depend on the algorithm used for the hash function.
Finally, each positive number is mapped to 1 and each negative number to 0, so [13, 108, -22, -5, -32, 55] is converted into the binary string 110001, which is the SimHash value of the document Doc.
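The four steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the application: the hash function is not specified in the text, so MD5 truncated to bits_count bits is used here as an assumed stand-in (which limits bits_count to 128), and keyword extraction and weighting are assumed to have been done already.

```python
import hashlib


def sign_bits(totals):
    """Map each positive total to '1' and every other total to '0'."""
    return "".join("1" if total > 0 else "0" for total in totals)


def simhash(feature_weight_pairs, bits_count=64):
    """Compute a SimHash value from (feature, weight) pairs.

    Assumption: MD5 truncated to bits_count bits stands in for the
    unspecified hash function, so bits_count must not exceed 128.
    """
    totals = [0] * bits_count
    mask = (1 << bits_count) - 1
    for feature, weight in feature_weight_pairs:
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        h = int.from_bytes(digest, "big") & mask
        for i in range(bits_count):
            # Position 0 of totals is the most significant bit of the hash.
            bit = (h >> (bits_count - 1 - i)) & 1
            totals[i] += weight if bit else -weight
    return int(sign_bits(totals), 2)


# The accumulated totals from FIG. 3 reduce to the binary string 110001.
figure_bits = sign_bits([13, 108, -22, -5, -32, 55])

# Hypothetical (keyword, weight) pairs for one page.
example_value = simhash([("recharge", 3), ("lottery", 5), ("withdraw", 2)])
```

Because only the signs of the accumulated totals are kept, features with large weights dominate the fingerprint, which is why pages with mostly the same content tend to differ in only a few bits.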
To calculate the similarity between two documents, the SimHash values of the two documents are calculated separately, and then the Hamming distance between the two SimHash values is calculated.
For example, the SimHash value of document A is: A = 100111;
the SimHash value of document B is: B = 101010;
the Hamming distance between the two SimHash values is the number of 1 bits in the binary result of A XOR B: hamming_distance(A, B) = count_1(A XOR B) = count_1(001101) = 3.
After the SimHash values of all documents have been calculated, document A and document B are considered similar if the Hamming distance between A and B is less than or equal to n, where n is chosen empirically.
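The worked example above can be checked with a few lines of Python. The function names hamming_distance and is_similar are illustrative; the values of A and B are the ones given in the text.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def is_similar(a: int, b: int, n: int) -> bool:
    """Two SimHash values are treated as similar if their distance is at most n."""
    return hamming_distance(a, b) <= n


A = 0b100111
B = 0b101010
xor_bits = format(A ^ B, "06b")   # the 6-bit XOR pattern from the example
distance = hamming_distance(A, B)
```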
Specifically, in a preferred embodiment, multiple illegal websites that use the same H5 front-end interface can be grouped into one category by the SimHash algorithm, and the corresponding instance action is selected to monitor them.
Suppose active detection code has already been developed for a certain illegal website and the template code marked for that website is template A. After the SimHash value of the web page source code doc A of the current illegal website is calculated, doc A and template A are considered similar if the Hamming distance between their SimHash values is less than or equal to n; empirically, n is generally taken as 15.
If the Hamming distance is less than or equal to 15, the current illegal website is judged to be monitorable with the active detection code developed for template A. Marking templates in this way classifies different illegal websites and reduces the development workload, achieving twice the result with half the effort.
In a preferred embodiment, the process of classifying illegitimate web sites using the SimHash algorithm is as follows:
1) Mark the SimHash values of common websites: the websites are accessed automatically through a search engine or through manually entered URLs, their front-end characteristics are observed manually, several websites with the same front-end characteristics and a high frequency of occurrence are selected as one class, an automation script is developed for that class, and the SimHash values of the websites covered by the script are recorded for later comparison;
2) Calculate the SimHash value of each website with a SimHash processing program and compare it with those of the marked illegal websites; if a similar website is found, mark the corresponding classification label, store the website in the database, and select the corresponding script program to detect the payment channel.
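The classification flow above can be sketched as follows, assuming the SimHash values have already been computed. The threshold of 15 comes from the text; choosing the closest template when several are within the threshold, and the way new template codes are generated, are assumptions made for this sketch.

```python
def hamming_distance(a, b):
    """Number of differing bits between two SimHash values."""
    return bin(a ^ b).count("1")


def match_template(site_hash, templates, threshold=15):
    """Return the code of the closest template within the threshold, else None."""
    best_code, best_distance = None, None
    for code, template_hash in templates.items():
        distance = hamming_distance(site_hash, template_hash)
        if best_distance is None or distance < best_distance:
            best_code, best_distance = code, distance
    if best_distance is not None and best_distance <= threshold:
        return best_code
    return None


def classify_site(site_hash, templates, threshold=15):
    """Return (template_code, is_new).

    If no marked template is close enough, the site is stored as a new
    template under a hypothetical generated code such as T002.
    """
    code = match_template(site_hash, templates, threshold)
    if code is not None:
        return code, False
    new_code = f"T{len(templates) + 1:03d}"
    templates[new_code] = site_hash
    return new_code, True


# Hypothetical template store: template code -> SimHash of the template site.
templates = {"T001": 0}
near = classify_site(0b111, templates)          # distance 3, reuses T001
far = classify_site((1 << 20) - 1, templates)   # distance 20, becomes T002
```

A site that maps to an existing code can be probed with the script already developed for that template, whereas a newly added template still needs its own detection script.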
4. Distributed task distribution module
The illegal transaction active detection system is deployed on a plurality of servers in a distributed environment, so a load-balancing scheme is needed to keep task distribution balanced across the distributed environment, improve processing efficiency, and avoid single points of failure, as shown in fig. 4.
Distributed task queue: in a distributed system, multiple machines and multiple programs process multiple URLs at the same time, which can greatly improve the efficiency of the program. The distributed task queue is composed of:
Broker: a container that holds the message queues, typically provided by a third-party message queue such as RabbitMQ or Redis.
Tasks: generally written in a script; they act as the producer that generates messages.
Worker: the consumer, which obtains messages from the Broker and processes them.
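The producer, Broker and Worker roles can be illustrated in a single process with Python's standard library. This is only a sketch of the pattern: an in-memory queue.Queue stands in for an external broker such as RabbitMQ or Redis, threads stand in for worker machines, and run_workers and handler are hypothetical names.

```python
import queue
import threading


def run_workers(urls, handler, worker_count=3):
    """Distribute URLs to worker threads through an in-memory broker."""
    broker = queue.Queue()      # stand-in for RabbitMQ or Redis
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            url = broker.get()
            try:
                if url is None:             # sentinel: no more tasks
                    return
                try:
                    outcome = handler(url)
                except Exception as exc:    # record failures instead of crashing
                    outcome = exc
                with lock:
                    results.append((url, outcome))
            finally:
                broker.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for url in urls:                        # the producer side
        broker.put(url)
    for _ in threads:                       # one sentinel per worker
        broker.put(None)
    broker.join()
    for thread in threads:
        thread.join()
    return results


# Hypothetical handler: a real one would run the probing script for the URL.
demo_results = sorted(run_workers(["a.example", "b.example", "c.example"], str.upper))
```

With an external broker the same structure applies, except that producers and workers run on different machines and the queue survives process restarts.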
Distributed task nodes: taking a certain illegal website as an example, because the monitoring tasks for illegal websites are numerous and marked templates are used to classify the tasks, a browser Docker cluster is preferably adopted in the distributed task distribution module, and Selenium Grid is used here to realize page rendering and simulated operation. The monitoring system calls the Selenium Hub uniformly; a plurality of Node agent nodes are registered on the Selenium Hub, and an issuing mechanism is established between the Selenium Hub and the Node agent nodes. The Selenium Hub distributes the tasks, and the Node agent nodes send requests to the websites to complete the simulated registration, login and transaction actions and return the source code of the corresponding web pages, which is passed to the text analysis module and the image recognition analysis module for processing, as shown in FIG. 5.
In addition, in practice the system's IP address is often blocked by the illegal websites. The illegal transaction active detection system therefore needs a large pool of IP addresses so that it can keep switching its own IP address and continue monitoring normally.
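One simple way to switch IP addresses is to rotate through a pool of proxies and skip those that have been blocked. The text does not describe the switching mechanism, so the ProxyPool class below, its round-robin policy, and the documentation-range addresses are all assumptions for illustration.

```python
import itertools


class ProxyPool:
    """Round-robin pool of proxy addresses that skips banned entries."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        if not self._proxies:
            raise ValueError("at least one proxy is required")
        self._banned = set()
        self._cycle = itertools.cycle(self._proxies)

    def ban(self, proxy):
        """Mark a proxy as blocked by the target website."""
        self._banned.add(proxy)

    def next_proxy(self):
        """Return the next usable proxy, or raise if every proxy is banned."""
        for _ in range(len(self._proxies)):
            proxy = next(self._cycle)
            if proxy not in self._banned:
                return proxy
        raise RuntimeError("all proxies are banned")


# Hypothetical proxies in the 192.0.2.0/24 documentation address range.
pool = ProxyPool(["192.0.2.1:8080", "192.0.2.2:8080", "192.0.2.3:8080"])
first = pool.next_proxy()
pool.ban("192.0.2.2:8080")
second = pool.next_proxy()
third = pool.next_proxy()
```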
5. Text analysis module and image recognition analysis module
The text analysis module 5 is configured to identify and extract text information in the transaction order received by the distributed task distribution module 4, and store the text information in the order monitoring result table 102.
The image recognition and analysis module 6 is configured to recognize and extract image information in the transaction order received by the distributed task distribution module 4, and store the image information in the order monitoring result table 102.
The relevant information of the transaction order extracted by the text analysis module 5 and the image recognition analysis module 6 may include an order number, an order screenshot, the transaction bank, the transaction time, the website URL, the transaction amount, payee account information, and the like. This information can be used as a basis for determining whether the transaction behavior is legal or illegal, so that illegal transaction orders of illegal websites are found promptly and effectively, displayed in time on the interactive interface of the back-office console, and reported to the risk control business department for subsequent processing.
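As an illustration of the text analysis step, order fields can be pulled out of returned page text with regular expressions. The application does not specify the order format, so the field labels, the patterns in ORDER_PATTERNS and the sample text are hypothetical; real pages would need patterns tailored to each template, and image content would need a separate recognition step.

```python
import re

# Hypothetical label formats; each template would need its own patterns.
ORDER_PATTERNS = {
    "order_number": re.compile(r"Order\s*No\.?\s*[:：]\s*([A-Za-z0-9]+)"),
    "amount": re.compile(r"Amount\s*[:：]\s*([0-9]+(?:\.[0-9]{1,2})?)"),
    "payee_account": re.compile(r"Payee\s*Account\s*[:：]\s*([0-9]+)"),
}


def extract_order_fields(text):
    """Return the order fields whose patterns match the given text."""
    fields = {}
    for name, pattern in ORDER_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
    return fields


# Made-up sample of order text returned by a probed website.
sample_text = (
    "Order No: A20200801001\n"
    "Amount: 100.00\n"
    "Payee Account: 6222000011112222\n"
    "Bank: Example Bank"
)
order_fields = extract_order_fields(sample_text)
```

The extracted dictionary corresponds to one row of the order monitoring result table.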
Fig. 7 is a schematic flow diagram of an illegal transaction active probing system.
As shown in fig. 7, the main processes of the illegal transaction active detection system of the present application are:
the illegal trading platform positioning module screens out illegal websites from manually entered website URLs and keywords or from search engine query results, and stores the website information of the illegal websites in the website basic data table of the database;
the template marking module calculates the similarity between the text information of each illegal website stored in the website basic data table and the pre-marked templates, wherein the pre-marked templates are different preset models generated by classifying historical illegal websites, and each template is marked with a template code;
if the calculated similarity is greater than a preset threshold, the illegal website is classified, marked with a new template code, and stored as a newly added template in the website basic data table of the database;
if the calculated similarity is less than or equal to the preset threshold, the active detection software developed for the pre-marked template can be run to perform simulated registration, login, and transaction channel detection on the illegal website; the monitoring log information generated during active detection is stored in the monitoring log information table of the database; during active detection, the website URLs used for simulated registration and login and the registered user information (virtual data) are stored in the simulation registration login information table of the database for backup;
finally, the transaction orders returned by the illegal websites during active detection are processed by the text analysis module and/or the image recognition analysis module, and the extracted information is stored in the order monitoring result table of the database as the basis for judging whether the transaction behavior is legal or illegal. For abnormal transactions and high-risk merchants discovered during monitoring, measures such as order adjustment, risk level adjustment, transaction limits, closing settlement, and reporting to the supervisory authority can be taken.
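The threshold decision in the flow above can be sketched in code. Claim 4 states that the similarity is the Hamming distance between SimHash values, so a larger value means the website differs more from the known templates, which is why a value above the threshold leads to a new template. The sketch below is one common way to compute a 64-bit SimHash; the word tokenization, the use of MD5 as the token hash, the threshold of 3, and the function names are assumptions, not details taken from the patent.

```python
import hashlib
import re

BITS = 64  # fingerprint width (assumption)


def simhash(text):
    """Compute a 64-bit SimHash fingerprint over the word tokens of text."""
    weights = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            weights[i] += 1 if (token_hash >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def match_template(page_source, templates, threshold=3):
    """Return the code of the nearest template whose distance is within
    the threshold, or None if the site should get a new template code.
    templates maps template code -> SimHash fingerprint."""
    fingerprint = simhash(page_source)
    nearest = min(
        ((hamming_distance(fingerprint, fp), code) for code, fp in templates.items()),
        default=None,
    )
    if nearest is not None and nearest[0] <= threshold:
        return nearest[1]
    return None
```

When `match_template` returns a code, the detection software for that template can be reused; when it returns `None`, the site would be stored as a newly added template.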
In summary, the present application provides an illegal transaction active detection method and system. The system takes the website address of an illegal transaction platform as input and outputs the characteristic information of the platform and its illegal transaction order information. By actively detecting illegal websites, transaction risks can be effectively identified and flagged in daily monitoring, giving rapid early warning of platforms involved in illegal and criminal activity so that they are found and dealt with as early as possible, further losses are avoided, and the overall level of financial risk prevention and control is improved.
The embodiments of the present invention have been described in detail above, but they are merely examples, and the present invention is not limited to the embodiments described. Any equivalent modifications and substitutions apparent to those skilled in the art also fall within the scope of the present invention. Accordingly, equivalent changes and modifications made without departing from the spirit and scope of the present invention are covered by the present invention.

Claims (10)

1. An active detection method for illegal transactions, comprising:
screening out illegal websites from manually entered website URLs (uniform resource locators) and keywords or from search engine query results, and storing website information of the illegal websites in a database;
calculating the similarity between the text information of each illegal website and pre-marked templates, wherein the pre-marked templates are different preset models generated by classifying historical illegal websites, and each template is marked with a template code;
if the calculated similarity is greater than a preset threshold, classifying the illegal website, marking it with a new template code, and storing it in the database as a newly added template;
if the calculated similarity is less than or equal to the preset threshold, using active detection software developed for the pre-marked template to perform simulated registration, login, and transaction channel detection on the illegal website, extracting relevant information of the transaction order returned by the illegal website through text analysis mining and/or image recognition analysis, and storing the relevant information in the database as a basis for judging whether the transaction behavior is legal or illegal.
2. The active detection method of illegal transactions according to claim 1, characterized in that: after simulated registration on the illegal website, the corresponding website URL and the registered user information are stored in a database for backup.
3. The active detection method of illegal transactions according to claim 1, characterized in that: the website information of the illegal website screened out from manually entered website URLs and keywords or from search engine query results comprises at least one of the following: website name, website URL, website validity, website text information, website snapshot picture URL, marked template code of the website, website creation time, and website update time.
4. The active detection method of illegal transactions according to claim 1, characterized in that: the similarity is the Hamming distance between the SimHash value of the webpage source code of the illegal website and the SimHash value of the pre-marked template.
5. The active illegal transaction detection method according to claim 1, wherein the information related to the transaction order comprises at least one of the following: order number, order screenshot, bank of transaction, transaction time, website URL, transaction amount, payee account information.
6. The active illegal transaction detection method according to claim 1, wherein distributed task distribution processing is supported: a browser Docker cluster is adopted, Selenium Grid is used to realize page rendering and simulated operation, the Selenium Hub is called as a single entry point to distribute tasks to at least one Node agent node registered on the Selenium Hub, and the Node agent nodes send requests to the illegal website to complete simulated registration, login, and transaction actions and receive the transaction order returned by the illegal website.
7. The active detection method of illegal transactions according to claim 1, characterized in that: when simulated registration, login, and transaction channel detection are performed on the illegal website, monitoring log information is generated and stored in a database.
8. An active illegal transaction detection system comprising:
a database storing a website basic data table, an order monitoring result table, and a simulation registration login information table, wherein
the website basic data table is used for storing website information of illegal websites screened out from manually entered website URLs and keywords or from search engine query results;
the order monitoring result table is used for storing the related information of the transaction order obtained by actively detecting the illegal website;
the simulation registration login information table is used for storing the website URLs and registered user information used during simulated registration and login on illegal websites;
the illegal trading platform positioning module is used for screening out illegal websites from manually entered website URLs and keywords or from search engine query results;
the template marking module is used for calculating the similarity between the text information of the illegal website and pre-marked templates, wherein the pre-marked templates are different preset models generated by classifying historical illegal websites, and each template is marked with a template code; if the calculated similarity is greater than a preset threshold, the illegal website is classified, marked with a new template code, and stored as a newly added template in the website basic data table;
the distributed task distribution module is used for performing simulated registration, login, and transaction channel detection, using active detection software developed for a pre-marked template, on illegal websites whose similarity calculated by the template marking module is less than or equal to the preset threshold, and for receiving the transaction order returned by the illegal website;
the text analysis module is used for performing text analysis mining on the text information in the transaction order received by the distributed task distribution module and storing the text information into the order monitoring result table;
the image recognition analysis module is used for carrying out image recognition analysis on the image information in the transaction order received by the distributed task distribution module and storing the image information in the order monitoring result table.
9. The active illegal transaction detection system of claim 8, wherein said template marking module comprises:
an extraction unit for extracting the pattern fingerprint of an illegal website;
a calculating unit for calculating the similarity between the pattern fingerprint of the illegal website extracted by the extraction unit and the pre-marked templates;
and a determining unit for determining, when the similarity calculated by the calculating unit is less than or equal to a preset threshold, that active detection software developed for a pre-marked template can be used to perform simulated registration, login, and transaction channel detection on the illegal website.
10. The active illegal transaction detection system of claim 8 wherein: the database also comprises a monitoring log information table which is used for storing monitoring log information generated when the illegal website is subjected to simulation registration, login and detection of a transaction channel.
CN202010776643.0A 2020-08-05 2020-08-05 Illegal transaction active detection method and system Active CN112199573B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010776643.0A CN112199573B (en) 2020-08-05 2020-08-05 Illegal transaction active detection method and system

Publications (2)

Publication Number Publication Date
CN112199573A true CN112199573A (en) 2021-01-08
CN112199573B CN112199573B (en) 2023-12-08

Family

ID=74006145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010776643.0A Active CN112199573B (en) 2020-08-05 2020-08-05 Illegal transaction active detection method and system

Country Status (1)

Country Link
CN (1) CN112199573B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966263A (en) * 2021-02-25 2021-06-15 中国银联股份有限公司 Target information acquisition method and device and computer readable storage medium

Also Published As

Publication number Publication date
CN112199573B (en) 2023-12-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant