CN115688107A - Fraud-related APP detection system and method - Google Patents

Fraud-related APP detection system and method

Publication number
CN115688107A
Authority
CN
China
Prior art keywords
app
fraud
module
information
idf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211692329.XA
Other languages
Chinese (zh)
Other versions
CN115688107B (en
Inventor
周宇飞
马洪晓
胡铁
熊瑛
叶蕴芳
潘淼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aspire Technologies Shenzhen Ltd
Original Assignee
Aspire Technologies Shenzhen Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aspire Technologies Shenzhen Ltd
Priority to CN202211692329.XA
Publication of CN115688107A
Application granted
Publication of CN115688107B
Legal status: Active
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A system and a method for detecting a fraud-related APP (application), used to detect whether an APP running on a smart device is fraud-related. The system comprises an anti-fraud monitoring module, which includes a characteristic data information monitoring module, a screen information monitoring module and a result output module. The characteristic data information monitoring module finds first-level suspected fraud-related APPs according to the AndroidManifest information and the application name, and determines second-level suspected fraud-related APPs by comparing the first-level suspected fraud-related APPs against the whitelist of genuine APP signature certificates. The screen information monitoring module performs image recognition on the interface image, extracts text information, and analyzes the text information to obtain a fraud possibility value for the APP. The result output module outputs a list of APPs with a high possibility of being fraud-related. Suspected fraud-related APP samples are screened by "AndroidManifest feature matching + application name similarity comparison + whitelist filtering on genuine APP signature certificate information"; text information is then extracted from webpage screenshots using OCR, and an algorithm judges whether the webpage text is fraud-related.

Description

Fraud-related APP detection system and method
Technical Field
The application belongs to the technical field of computer security, and particularly relates to a fraud-related APP detection system and method.
Background
AndroidManifest.xml is a required file in every Android program. It is located in the root directory of the project and describes the components exposed by the package (activities, services, etc.), their implementation classes, the data they can handle, and where they are launched. In addition to declaring the Activities, ContentProviders, Services and Intent Receivers of a program, it can also specify permissions and instrumentation.
TF-IDF (Term Frequency-Inverse Document Frequency) is mainly used to estimate the importance of a word in a document.
In recent years, fraud carried out through APPs has become one of the main criminal methods in telecom and online fraud cases. Fraudulent APPs for online part-time order-brushing and quick loans are especially common, and APPs that imitate banks and financial platforms are particularly confusing and deceptive.
Such fraud-related APPs are usually built from "third-party mobile application rapid development platform framework code + an integrated H5 website domain name", so their development cost is extremely low. Because they defraud victims mainly through the integrated H5 web pages, they contain almost no malicious static code, request no sensitive permissions, and show no malicious behaviors such as sending short messages or reading the address book. Common mobile malware detection techniques based on static code and dynamic behavior analysis therefore cannot identify them effectively.
At present, common methods for detecting malicious mobile applications include static code analysis (for example, Chinese patent application No. 202011536663.7), dynamic behavior analysis (for example, Chinese patent application No. 201310309568.7), and analysis combining static code and dynamic behavior (for example, Chinese patent application No. 201910968202.8).
Malicious mobile application detection based on static code analysis has the following defect: when it examines a fraud-related APP built from "third-party mobile application rapid development platform framework code + an integrated H5 website domain name", it can only scan the code of the third-party rapid development platform. The same code may also appear in legitimate applications built on that platform, so no malicious static code features can be extracted and the fraud-related APP cannot be identified.
Malicious mobile application detection based on dynamic behavior analysis has the following defect: fraud-related APPs built from "third-party mobile application rapid development platform framework code + an integrated H5 website domain name" typically defraud victims through H5 web pages. For example, a fake-loan fraud APP induces the victim to upload sensitive personal data through an integrated fake-loan H5 web page, then communicates with the victim through an integrated chat web page and induces the victim to transfer money on pretexts such as "the loan requires a guarantee fee". In this case the fraud-related APP shows no malicious behaviors such as sending short messages or stealing the address book, so detection based on dynamic behavior analysis cannot detect it effectively.
Malicious mobile application detection based on a combination of static code and dynamic behavior has the following defect: when it examines a fraud-related APP built from "third-party mobile application rapid development platform framework code + an integrated H5 website domain name", neither static code features nor dynamic behavior features can be extracted, so the fraud-related APP cannot be detected effectively.
Disclosure of Invention
To solve the above problems, the present application screens suspected fraud-related APP samples by "AndroidManifest feature matching + application name similarity comparison + whitelist filtering on genuine APP signature certificate information", then takes webpage screenshots, extracts text information from the screenshots by OCR, and uses an algorithm to judge whether the webpage text is fraud-related, thereby achieving automatic determination of fraud-related APPs.
The technical solution adopted by the present application to solve the above technical problem is a fraud-related APP detection system for detecting whether an APP running on a smart device is fraud-related, comprising an anti-fraud monitoring module. The anti-fraud monitoring module includes a characteristic data information monitoring module, a screen information monitoring module and a result output module. The characteristic data information monitoring module finds first-level suspected fraud-related APPs according to the AndroidManifest information and/or the application name, then compares and filters the first-level suspected fraud-related APPs against the whitelist of genuine APP signature certificates to determine second-level suspected fraud-related APPs. The screen information monitoring module captures the screen of a second-level suspected fraud-related APP to obtain an interface image of the running APP, performs image recognition on the interface image, extracts text information, and analyzes the text information to obtain a fraud possibility value for the APP. The result output module outputs a list of APPs with a high possibility of being fraud-related.
The technical solution may further include an APP automated testing framework, in which the APPs run; the anti-fraud monitoring module tests more than 2 APPs according to an input test list. First-level suspected fraud-related APPs are found by screening application names against configured keywords. The first-level suspected fraud-related APPs are compared and filtered against the whitelist of genuine APP signature certificates to determine second-level suspected fraud-related APPs. The text information analysis algorithm includes TF-IDF, WORD2VEC and/or BERT.
In the technical solution, the screen information monitoring module may include a screen capture module and an image recognition analysis module. The screen capture module records or captures the interface of the running APP, and the image recognition analysis module performs image recognition on the obtained APP interface image. The screen capture module outputs prompt information, which can be a pop-up window, a floating window or a fixed operation button, so that the user can manually capture the APP interface image.
In the technical solution, the image recognition analysis module may include a text information extraction module, a word segmentation module, a fraud-related webpage TF-IDF feature dictionary module, a TF-IDF vector calculation module and a classification machine learning module. The text information extraction module processes the image recognition output to obtain text information. The word segmentation module processes the text information to obtain phrases. The TF-IDF vector calculation module computes the TF-IDF vector of the phrases according to the fraud-related webpage TF-IDF feature dictionary. The classification machine learning module processes the phrase TF-IDF vector to obtain a fraud possibility value for the APP.
In the technical solution, the fraud-related webpage TF-IDF feature dictionary module may update the TF-IDF feature dictionary through a network server.
In the technical solution, the characteristic data information monitoring module may include a to-be-detected sample information extraction module and a whitelist genuine APP signature certificate feature comparison module.
In the technical solution, the whitelist genuine APP signature certificate feature comparison module may update the whitelist digital certificate features through a network server.
The technical solution of the present application may also be a fraud-related APP detection method for detecting whether an APP running on a smart device is fraud-related, including:
Step 100: finding first-level suspected fraud-related APPs according to the AndroidManifest information, the application name and/or the signature certificate, and comparing the first-level suspected fraud-related APPs against the whitelist of genuine APP signature certificates to determine second-level suspected fraud-related APPs;
Step 200: running a second-level suspected fraud-related APP, capturing the screen to obtain an interface image of the running APP, performing image recognition on the interface image, extracting text information, analyzing the text information to obtain a fraud possibility value for the APP, and outputting a list of APPs with a high possibility of being fraud-related.
For centralized testing, a list of APPs with a high possibility of being fraud-related may be output.
In the technical solution of the present application, step 100 may further include:
Step 110: acquiring the AndroidManifest information and/or the application name of the sample to be detected;
Step 120: acquiring the signature certificate of the sample to be detected, wherein the signature certificate information includes the owner, the validity start time, the validity end time and/or the serial number;
Step 130: determining first-level suspected fraud-related APPs based on the AndroidManifest matching rule feature library and the application name matching rule feature library;
Step 140: comparing and filtering against the whitelist of genuine APP signature certificates, eliminating whitelist samples, and determining second-level suspected fraud-related APPs.
In the technical solution of the present application, step 200 may further include:
Step 210: capturing the screen of the running APP to obtain an interface image of the running APP;
Step 220: performing image recognition on the interface image and extracting text information;
Step 230: segmenting the text information to obtain phrases, and analyzing the phrases to obtain a fraud possibility value for the APP, wherein the analysis algorithm includes TF-IDF, WORD2VEC and/or BERT.
Step 230 may also include:
Step 231: computing the TF-IDF vector of the phrases according to the fraud-related webpage TF-IDF feature dictionary;
Step 232: processing the phrase TF-IDF vector with a classification machine learning model to obtain a fraud possibility value for the APP;
Step 233: outputting a list of APPs with a high possibility of being fraud-related.
A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described fraud-related APP detection method.
The first technical effect of the technical solution is as follows: first-level suspected fraud-related APPs are preliminarily screened by AndroidManifest information and application names, and legitimate software is then eliminated using the whitelist of genuine APP signature certificates to obtain second-level suspected fraud-related APPs. This greatly reduces the workload of running APPs to obtain interface images and speeds up detection.
The second technical effect of the technical solution is as follows: the anti-fraud monitoring module can be installed directly on a user's mobile phone or smart device with the user's permission or by the user's manual operation, which avoids any suspicion of illegally capturing the user's phone interface in the background.
The third technical effect of the technical solution is as follows: the image recognition analysis module includes a text information extraction module and a word segmentation module, so the information on H5 web pages is extracted and H5-webpage-type fraud APPs can be identified and detected.
The fourth technical effect of the technical solution is as follows: the automated testing framework can test APPs in batches.
The fifth technical effect of the technical solution is as follows: legitimate software is eliminated using the whitelist of genuine APP signature certificates to obtain second-level suspected fraud-related APPs, which greatly reduces the workload of obtaining interface images of running APPs and speeds up detection.
The sixth technical effect of the technical solution is as follows: the TF-IDF feature dictionary is updated through the network server, so the latest feature dictionary is always available and the anti-fraud monitoring module can track the latest key vocabulary in real time.
The seventh technical effect of the technical solution is as follows: the whitelist genuine APP signature certificate features are updated through the network server, so the APPs of legitimate financial institutions can be excluded in a timely manner.
Drawings
FIG. 1 is a schematic block diagram of a fraud-related APP detection system;
FIG. 2 is a schematic block diagram including an Appium automated testing framework;
FIG. 3 is a schematic block diagram of the anti-fraud monitoring module internal modules;
FIG. 4 is a schematic diagram of the internal modules of the screen information monitoring module;
FIG. 5 is a schematic diagram of the internal modules of the image recognition analysis module;
FIG. 6 is a schematic diagram of the internal modules of the characteristic data information monitoring module;
FIG. 7 is a schematic flow chart of a method for detecting a fraud-related APP;
FIG. 8 is a schematic diagram illustrating a process of determining suspected fraud-related APPs of the first and second levels;
FIG. 9 is a schematic flow diagram of screen shot information monitoring analysis;
FIG. 10 is a flow chart diagram of the TF-IDF algorithm.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings.
It should be noted that the following description is made of preferred embodiments of the present invention and is not intended to limit the present invention in any way. The description of the preferred embodiments of the present invention is made merely for the purpose of illustrating the general principles of the invention. The embodiments described in this application are only some embodiments of the invention and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present application, it is to be understood that the terms "center," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," and the like are used in an orientation or positional relationship indicated in the drawings for convenience in describing the invention and simplicity in description, but do not indicate or imply that the device or element so referred to must have a particular orientation, be constructed in a particular orientation, and be operated in a particular orientation, and thus are not to be construed as limiting the invention. Furthermore, the terms "first", "second", and technical features numbered with Arabic numerals 1, 2, 3, etc., and such numbers as "A" and "B" are used for descriptive purposes only and are not intended to represent temporal or spatial order; are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first," "second," and numbered with an Arabic numeral 1, 2, 3, etc., may explicitly or implicitly include one or more of the features. In the description of the present invention, "a number" means two or more unless specifically limited otherwise.
As shown in FIG. 1, a fraud-related APP detection system, used to detect whether an APP running on a smart device is fraud-related, includes an anti-fraud monitoring module. As shown in FIG. 3, the anti-fraud monitoring module includes a characteristic data information monitoring module, a screen information monitoring module and a result output module.
The characteristic data information monitoring module finds first-level suspected fraud-related APPs according to the AndroidManifest information, the application name and/or the signature certificate, then compares the first-level suspected fraud-related APPs against the whitelist of genuine APP signature certificates to determine second-level suspected fraud-related APPs. First-level suspected fraud-related APPs can be found by keyword screening.
The screen information monitoring module captures the screen of a second-level suspected fraud-related APP to obtain an interface image of the running APP, performs image recognition on the interface image, extracts text information, and analyzes the text information to obtain a fraud possibility value for the APP.
The result output module outputs a list of APPs with a high possibility of being fraud-related.
Once image recognition has produced the text information of the APP's running interface, various methods, such as neural network algorithms and other artificial intelligence algorithms, can be used to judge whether the text involves inducing users to take out loans. The result of the algorithm is expressed as a possibility, for example 0 to 100%, and APPs whose possibility is high, for example above 80%, are output in a list for manual judgment.
Running an APP and obtaining its running interface takes a long time and a large amount of computation, so not all APPs can be tested in a short time. First-level suspected fraud-related APPs are therefore found by keywords first, and the APPs of legitimate financial institutions are then eliminated using the whitelist of genuine APP signature certificates. This greatly reduces the number of APPs that need image recognition processing and greatly improves working efficiency.
The anti-fraud monitoring module can be a software module embedded in the smart device, or an APP installed on the smart device afterwards. It has relatively high system permissions and can capture information about other APPs, and the running interfaces of other software, while those APPs are running.
The AndroidManifest information and the application name of the sample to be detected can be obtained with the aapt tool, and the signature certificate of the sample to be detected can be obtained with the keytool tool. The AndroidManifest.xml information is obtained from the APK to be checked with the command "aapt dump xmltree xxx.apk AndroidManifest.xml".
The system obtains the application name information (application-label) from the APK to be checked with an "aapt dump badging" command run against the APK.
The system obtains the signature certificate information from the APK to be checked with the command "keytool -printcert -jarfile d:\18i6ic.apk". The signature certificate information includes the owner, the validity start time, the validity end time, the serial number and the like.
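The extraction step above can be sketched as a pair of text parsers applied to the output of the two tools. This is a minimal sketch, not the patent's implementation: the function names and the sample strings are hypothetical, and the sample output only imitates the general shape of `aapt dump badging` and `keytool -printcert -jarfile` output (an `application-label:'...'` line, and `Owner:`, `Serial number:` and `Valid from: ... until: ...` lines).

```python
import re

# Hypothetical sample output, shaped like what the two tools print.
SAMPLE_BADGING = """package: name='com.example.quickloan' versionCode='1' versionName='1.0'
application-label:'Quick Loan'
application-label-zh-CN:'Quick Loan CN'
"""

SAMPLE_CERT = """Owner: CN=Example, OU=Dev, O=Example Ltd, C=CN
Issuer: CN=Example, OU=Dev, O=Example Ltd, C=CN
Serial number: 4f2a9c1b
Valid from: Mon Jan 03 10:00:00 CST 2022 until: Fri Dec 30 10:00:00 CST 2046
"""


def parse_app_label(badging_text):
    """Return the default application label, or None if it is absent."""
    match = re.search(r"^application-label:'(.*)'\s*$", badging_text, re.MULTILINE)
    return match.group(1) if match else None


def parse_certificate(printcert_text):
    """Extract owner, serial number and validity period from certificate text."""
    info = {}
    owner = re.search(r"^Owner:\s*(.+)$", printcert_text, re.MULTILINE)
    if owner:
        info["owner"] = owner.group(1).strip()
    serial = re.search(r"^Serial number:\s*([0-9a-fA-F]+)\s*$", printcert_text, re.MULTILINE)
    if serial:
        info["serial_number"] = serial.group(1).lower()
    validity = re.search(r"^Valid from:\s*(.+?)\s+until:\s*(.+)$", printcert_text, re.MULTILINE)
    if validity:
        info["valid_from"] = validity.group(1).strip()
        info["valid_until"] = validity.group(2).strip()
    return info


label = parse_app_label(SAMPLE_BADGING)
cert = parse_certificate(SAMPLE_CERT)
```

In a real pipeline the two strings would come from running the tools on each APK; keeping the parsing separate from the tool invocation makes the rules easy to test on saved output.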
Based on the AndroidManifest matching rule feature library, the application name matching rule feature library and the counterfeited-enterprise genuine APP digital certificate feature library, the AndroidManifest information and application name information of the sample to be detected are compared to screen first-level fraud-related APP samples.
Security experts compile the genuine APP certificate information of commonly counterfeited enterprises and enter it into the "counterfeited-enterprise genuine APP digital certificate feature library" to form the whitelist samples.
The system compares the AndroidManifest information, application name information and signature certificate information of the sample to be detected against the AndroidManifest matching rule feature library, the application name matching rule feature library and the counterfeited-enterprise genuine APP digital certificate feature library, and screens suspected fraud-related APP samples. The AndroidManifest information is matched by keywords. The application name is first stripped of punctuation marks and special characters (current fraud APPs often mix punctuation or special characters into the name, such as "京,东,金,条", that is, JD Jintiao with commas inserted) and then matched by regular expression. The signature certificate is matched by serial number. If the sample to be detected hits both the AndroidManifest matching rule and the application name matching rule, and its signature certificate does not exist in the counterfeited-enterprise genuine APP digital certificate feature library, it is determined to be a second-level suspected fraud-related APP.
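The two-level screening described above can be sketched as follows. This is a minimal illustration under stated assumptions: the keyword list, the name rules and the whitelist serial numbers are hypothetical placeholders for the three feature libraries, and stripping every non-word character is only one possible way to filter punctuation and special characters.

```python
import re

# Hypothetical stand-ins for the three feature libraries.
MANIFEST_KEYWORDS = ["com.example.rapidframework"]
NAME_RULES = [re.compile("京东金条"), re.compile("quickloan", re.IGNORECASE)]
WHITELIST_SERIALS = {"4f2a9c1b"}


def normalize_name(name):
    """Drop punctuation, whitespace and other non-word characters from a name."""
    return re.sub(r"[\W_]+", "", name)


def is_first_level(manifest_text, app_name):
    """First level: the manifest keyword rule and the name rule must both hit."""
    manifest_hit = any(keyword in manifest_text for keyword in MANIFEST_KEYWORDS)
    cleaned = normalize_name(app_name)
    name_hit = any(rule.search(cleaned) for rule in NAME_RULES)
    return manifest_hit and name_hit


def is_second_level(manifest_text, app_name, cert_serial):
    """Second level: a first-level hit whose certificate is not whitelisted."""
    if not is_first_level(manifest_text, app_name):
        return False
    return cert_serial.lower() not in WHITELIST_SERIALS


manifest = "E: application ... com.example.rapidframework.MainActivity"
suspect = is_second_level(manifest, "京.东.金.条", "deadbeef")
genuine = is_second_level(manifest, "京东金条", "4F2A9C1B")
```

Here `suspect` is true because both rules hit and the serial number is unknown, while `genuine` is false because the certificate serial number is on the whitelist.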
As shown in FIG. 2, the system further includes an APP automated testing framework, in which the APPs run; the anti-fraud monitoring module tests more than 2 APPs according to the input test list. With the APP automated testing framework, many APPs can be started and run automatically and tested in batches, which suits a dedicated fraud detection tool. Various automated testing frameworks can be used; any test software capable of automatically driving APPs to run is suitable.
First-level suspected fraud-related APPs are preliminarily screened by the AndroidManifest information and/or the application names. They are then compared and filtered against the whitelist of genuine APP signature certificates, and legitimate software is removed to obtain second-level suspected fraud-related APPs. This greatly reduces the workload of obtaining interface images of running APPs and speeds up detection. Several text information analysis algorithms can be chosen, including TF-IDF, WORD2VEC and/or BERT.
As shown in FIG. 4, the screen information monitoring module includes a screen capture module and an image recognition analysis module. The screen capture module records or captures the interface of the running APP, and the image recognition analysis module performs image recognition on the obtained APP interface image. The screen capture module outputs prompt information, which can be a pop-up window or a fixed or floating control button, so that the user can manually capture the screen.
If a screenshot needs to be taken on a user's smart device such as a mobile phone, prompt information can be given to inform the user that the screen is currently being captured, or a window can pop up so that the user takes the screenshot manually. The anti-fraud monitoring module can be installed directly on a user's mobile phone or smart device with the user's permission or by the user's manual operation.
As shown in FIG. 5, the image recognition analysis module includes a text information extraction module, a word segmentation module, a fraud-related webpage TF-IDF feature dictionary module, a TF-IDF vector calculation module and a classification machine learning module. The text information extraction module processes the image recognition output to obtain text information. The word segmentation module processes the text information to obtain phrases. The TF-IDF vector calculation module computes the TF-IDF vector of the phrases according to the fraud-related webpage TF-IDF feature dictionary. The classification machine learning module processes the phrase TF-IDF vector to obtain a fraud possibility value for the APP.
As shown in FIG. 5, the fraud-related webpage TF-IDF feature dictionary module updates the TF-IDF feature dictionary through the network server.
Because the TF-IDF feature dictionary is updated through the network server, the latest feature dictionary is always available and the anti-fraud monitoring module can track the latest key vocabulary in real time.
As shown in FIG. 6, the characteristic data information monitoring module includes a to-be-detected sample information extraction module and a whitelist genuine APP signature certificate feature comparison module.
As shown in FIG. 6, the whitelist genuine APP signature certificate feature comparison module updates the whitelist digital certificate features through the network server.
Because the whitelist genuine APP signature certificate features are updated through the network server, the APPs of legitimate financial institutions can be excluded.
As shown in FIG. 7, a fraud-related APP detection method, used to detect whether an APP running on a smart device is fraud-related, includes:
Step 100: finding first-level suspected fraud-related APPs according to the AndroidManifest information, the application names and/or the signature certificates, and comparing the first-level suspected fraud-related APPs against a whitelist to determine second-level suspected fraud-related APPs;
Step 200: running a second-level suspected fraud-related APP, capturing the screen to obtain an interface image of the running APP, performing image recognition on the interface image, extracting text information, analyzing the text information to obtain a fraud possibility value for the APP, and outputting a list of APPs with a high possibility of being fraud-related.
As shown in FIG. 8, step 100 includes:
Step 110: acquiring the AndroidManifest information and/or the application name of the sample to be detected;
Step 120: acquiring the signature certificate of the sample to be detected, wherein the signature certificate information includes the owner, the validity start time, the validity end time and/or the serial number;
Step 130: determining first-level suspected fraud-related APPs based on the AndroidManifest matching rule feature library and the application name matching rule feature library;
Step 140: comparing and filtering against the whitelist of genuine APP signature certificates, eliminating whitelist samples, and determining second-level suspected fraud-related APPs.
As shown in fig. 9, the step 200 includes:
step 210: the method comprises the steps of capturing a screen of an operating APP to obtain an interface image of the operating APP;
step 220: carrying out image recognition on the interface image, and extracting text information;
step 230: and segmenting the text information to obtain a phrase, analyzing and calculating the phrase to obtain the probability high and low values of APP involvement in fraud, wherein the algorithm of the analysis and calculation comprises TF-IDF, WORD2VEC or/and BERT.
Step 230 includes:
step 231: performing TF-IDF vector calculation on the phrases according to the TF-IDF feature dictionary of fraud-related webpages to obtain a phrase TF-IDF vector;
step 232: processing the obtained phrase TF-IDF vector with a classification machine learning model to obtain the fraud likelihood value of the APP;
step 233: outputting a list of APPs with a high likelihood of being fraud-related.
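Steps 210 to 233 chain together as a simple pipeline, which can be sketched as below. The screenshot tool, OCR engine, word segmenter and trained model are not specified at this level of the description, so they are passed in as placeholder callables; the function name `screen_apps`, its parameters and the threshold are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Iterable, List, Sequence, Tuple


def screen_apps(
    apps: Iterable[str],
    capture: Callable[[str], Any],            # step 210: APP -> interface image
    recognize_text: Callable[[Any], str],     # step 220: image -> text
    segment: Callable[[str], Sequence[str]],  # step 230: text -> phrases
    score: Callable[[Sequence[str]], float],  # steps 231-232: phrases -> likelihood
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (APP, likelihood) pairs whose likelihood exceeds the threshold."""
    flagged = []
    for app in apps:
        image = capture(app)
        text = recognize_text(image)
        phrases = segment(text)
        likelihood = score(phrases)
        if likelihood > threshold:
            flagged.append((app, likelihood))
    return flagged  # step 233: list of APPs with a high fraud likelihood
```

Passing the stages in as arguments keeps the sketch independent of any particular OCR or model library, and lets each stage be replaced by a stub when testing.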
TF-IDF (Term Frequency-Inverse Document Frequency) is mainly used to estimate the importance of a word in a document.
Description of the symbols:
document set: D = {d_1, d_2, d_3, ..., d_n}
n_{w,d}: number of occurrences of word w in document d
{w_d}: set of all words in document d
n_w: number of documents containing word w
In step 231, the term frequency TF is calculated as
$$\mathrm{TF}(w, d) = \frac{n_{w,d}}{\sum_{v \in \{w_d\}} n_{v,d}}$$
The inverse document frequency IDF is calculated as
$$\mathrm{IDF}(w) = \log \frac{n}{n_w}$$
where n is the number of documents in D.
The TF-IDF is calculated as
$$\text{TF-IDF}(w, d) = \mathrm{TF}(w, d) \times \mathrm{IDF}(w)$$
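As a worked illustration of the TF-IDF calculation using the symbols defined above, the following sketch computes a vector for one document over a fixed vocabulary. It assumes the common unsmoothed form (TF as a word's count divided by the total word count of the document, IDF as the natural logarithm of n divided by n_w); the exact variant used in the patent figures is not recoverable from this text. The function names and the `vocabulary` parameter, which stands in for the feature dictionary, are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def tf(word: str, doc: Sequence[str]) -> float:
    """n_{w,d} divided by the total number of word occurrences in d."""
    if not doc:
        return 0.0
    return Counter(doc)[word] / len(doc)


def idf(word: str, docs: Sequence[Sequence[str]]) -> float:
    """log(n / n_w); returns 0.0 when no document contains the word."""
    n_w = sum(1 for d in docs if word in d)
    if n_w == 0:
        return 0.0
    return math.log(len(docs) / n_w)


def tf_idf_vector(
    doc: Sequence[str],
    docs: Sequence[Sequence[str]],
    vocabulary: Sequence[str],
) -> List[float]:
    """One TF-IDF weight per vocabulary entry, in vocabulary order."""
    return [tf(w, doc) * idf(w, docs) for w in vocabulary]
```

A word that appears in every document gets an IDF of zero under this form, so it contributes nothing to the vector regardless of how often it occurs.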
In step 232, a trained machine learning model for classifying fraud-related webpage text (using LinearSVC, a linear Support Vector Machine (SVM) classifier trained by supervised learning) takes the TF-IDF vector of the screenshot text as input and determines whether the sample to be detected is a fraud-related APP and, if so, its type.
With the TF-IDF vector as input, the classification machine learning model calculates the likelihood that the sample is fraud-related; samples whose likelihood exceeds a set value are output as a fraud-related APP list, and the final judgment is made manually.
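The scoring and thresholding described above can be sketched in plain Python as follows. The weights and bias stand in for a trained linear classifier and are hypothetical. A linear SVC produces a signed margin rather than a probability, so the logistic squashing used here to obtain a value between 0 and 1 is an assumption of this example, not something stated in the description.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fraud_likelihood(
    vector: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Linear decision score squashed to (0, 1) with a logistic function."""
    score = sum(w * x for w, x in zip(weights, vector)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def high_risk_list(
    app_vectors: Dict[str, Sequence[float]],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """APPs whose likelihood exceeds the threshold, highest first."""
    scored = [
        (app, fraud_likelihood(vec, weights, bias))
        for app, vec in app_vectors.items()
    ]
    flagged = [(app, p) for app, p in scored if p > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

The returned list is meant as input to the manual review step, so the threshold controls how many candidates a reviewer has to inspect.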
A readable storage medium stores a computer program which, when executed by a processor, performs the above method.
While the invention has been illustrated and described in terms of a preferred embodiment and several alternatives, the invention is not limited by the specific description in this specification. Other additional alternative or equivalent components may also be used in the practice of the present invention.

Claims (12)

1. A fraud-related APP detection system for detecting whether an application (APP) running on a smart device is fraud-related, characterized by comprising: an anti-fraud monitoring module, the anti-fraud monitoring module comprising: a characteristic data information monitoring module, a screen information monitoring module and a result output module;
the characteristic data information monitoring module identifies first-stage suspected fraud-related APPs according to AndroidManifest information and/or application names, and compares the first-stage suspected fraud-related APPs against the genuine APP signature certificates of the white list to determine second-stage suspected fraud-related APPs;
the screen information monitoring module captures the screen of a second-stage suspected fraud-related APP to obtain an interface image of the running APP, performs image recognition on the interface image, extracts text information, and analyzes the text information to obtain a fraud likelihood value for the APP;
the result output module outputs a list of APPs with a high likelihood of being fraud-related.
2. The fraud-related APP detection system of claim 1, further comprising an automated testing framework in which the APPs run, wherein the anti-fraud monitoring module tests more than 2 APPs according to an input test list; the first-stage suspected fraud-related APPs are identified by screening application names against set keywords; the first-stage suspected fraud-related APPs are compared against the genuine APP signature certificates of the white list to determine the second-stage suspected fraud-related APPs; and the text information analysis algorithm comprises TF-IDF, WORD2VEC and/or BERT.
3. The system according to claim 1, wherein the screen information monitoring module comprises a screen capture module and an image recognition and analysis module, the screen capture module records or captures an interface of the running APP, the image recognition and analysis module performs image recognition on the obtained APP interface image, and the screen capture module outputs prompt information to allow a user to manually operate screen capture.
4. The fraud-related APP detection system of claim 3, wherein the image recognition analysis module comprises a text information extraction module, a word segmentation module, a fraud-related webpage TF-IDF feature dictionary module, a TF-IDF vector calculation module and a classification machine learning module; the text information extraction module processes the image recognition output to obtain text information; the word segmentation module processes the text information to obtain phrases; the TF-IDF vector calculation module performs TF-IDF vector calculation on the phrases according to the TF-IDF feature dictionary of fraud-related webpages to obtain a phrase TF-IDF vector; and the classification machine learning module processes the obtained phrase TF-IDF vector to obtain the fraud likelihood value of the APP.
5. The fraud-related APP detection system of claim 4, wherein said fraud-related webpage TF-IDF feature dictionary module updates a TF-IDF feature dictionary through a network server.
6. The fraud-related APP detection system of claim 3, wherein the characteristic data information monitoring module comprises a to-be-detected sample information extraction module and a white-list genuine APP signature certificate feature comparison module.
7. The fraud-related APP detection system of claim 6, wherein the white-list genuine APP signature certificate feature comparison module updates the white-list genuine APP signature certificate features through a network server.
8. A fraud-related APP detection method for detecting whether an application (APP) running on a smart device is fraud-related, characterized by comprising the following steps:
step 100: identifying first-stage suspected fraud-related APPs according to AndroidManifest information and/or application names, and comparing the first-stage suspected fraud-related APPs against the genuine APP signature certificates of the white list to determine second-stage suspected fraud-related APPs;
step 200: running each second-stage suspected fraud-related APP and capturing its screen to obtain an interface image of the running APP, performing image recognition on the interface image, extracting text information, and analyzing the text information to obtain a fraud likelihood value for the APP.
9. The fraud-related APP detection method of claim 8, wherein said step 100 comprises:
step 110: acquiring the AndroidManifest information and/or the application name of a sample to be detected;
step 120: acquiring the signature certificate of the sample to be detected, wherein the signature certificate information comprises: owner, validity start time, validity end time, and/or serial number;
step 130: determining first-stage suspected fraud-related APPs based on the AndroidManifest matching rule feature library and the application name matching rule feature library;
step 140: comparing against the genuine APP signature certificates of the white list, eliminating white-listed samples, and determining the second-stage suspected fraud-related APPs.
10. The fraud-related APP detection method of claim 8, wherein said step 200 comprises:
step 210: capturing the screen of the running APP to obtain an interface image of the running APP;
step 220: performing image recognition on the interface image and extracting text information;
step 230: segmenting the text information into phrases, and analyzing the phrases to obtain a fraud likelihood value for the APP, wherein the analysis algorithm comprises TF-IDF, WORD2VEC and/or BERT.
11. The fraud-related APP detection method of claim 10, wherein said step 230 comprises:
step 231: performing TF-IDF vector calculation on the phrases according to the TF-IDF feature dictionary of fraud-related webpages to obtain a phrase TF-IDF vector;
step 232: processing the obtained phrase TF-IDF vector with a classification machine learning model to obtain the fraud likelihood value of the APP;
step 233: outputting a list of APPs with a high likelihood of being fraud-related.
12. A readable storage medium having stored thereon a computer program, characterized in that,
the program, when executed by a processor, implements a fraud-related APP detection method as recited in any one of claims 8 to 11.
CN202211692329.XA 2022-12-28 2022-12-28 Fraud-related APP detection system and method Active CN115688107B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211692329.XA CN115688107B (en) 2022-12-28 2022-12-28 Fraud-related APP detection system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211692329.XA CN115688107B (en) 2022-12-28 2022-12-28 Fraud-related APP detection system and method

Publications (2)

Publication Number Publication Date
CN115688107A true CN115688107A (en) 2023-02-03
CN115688107B CN115688107B (en) 2023-04-11

Family

ID=85055081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211692329.XA Active CN115688107B (en) 2022-12-28 2022-12-28 Fraud-related APP detection system and method

Country Status (1)

Country Link
CN (1) CN115688107B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115859292A (en) * 2023-02-20 2023-03-28 卓望数码技术(深圳)有限公司 Fraud-related APP detection system, judgment method and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6039768B1 (en) * 2015-08-12 2016-12-07 日本電信電話株式会社 ADJUSTMENT DEVICE, ADJUSTMENT METHOD, AND ADJUSTMENT PROGRAM
CN107169049A (en) * 2017-04-25 2017-09-15 腾讯科技(深圳)有限公司 The label information generation method and device of application
CN107871080A (en) * 2017-12-04 2018-04-03 杭州安恒信息技术有限公司 The hybrid Android malicious code detecting methods of big data and device
CN114492584A (en) * 2021-12-28 2022-05-13 南方科技大学 Automatic content grading method for android Chinese application market
CN114662033A (en) * 2022-04-06 2022-06-24 昆明信息港传媒有限责任公司 Multi-modal harmful link recognition based on text and image
CN115292674A (en) * 2022-08-08 2022-11-04 重庆邮电大学 Fraud application detection method and system based on user comment data

Also Published As

Publication number Publication date
CN115688107B (en) 2023-04-11

Similar Documents

Publication Publication Date Title
Pendlebury et al. {TESSERACT}: Eliminating experimental bias in malware classification across space and time
CN107203765B (en) Sensitive image detection method and device
CN113450147B (en) Product matching method, device, equipment and storage medium based on decision tree
CN112565250B (en) Website identification method, device, equipment and storage medium
CN109801151B (en) Financial falsification risk monitoring method, device, computer equipment and storage medium
CN113067820A (en) Method, device and equipment for early warning abnormal webpage and/or APP
CN115688107B (en) Fraud-related APP detection system and method
An et al. Benchmarking the robustness of image watermarks
CN111143858B (en) Data checking method and device
CN110955796A (en) Case characteristic information extraction method and device based on record information
CN113988226B (en) Data desensitization validity verification method and device, computer equipment and storage medium
CN116189215A (en) Automatic auditing method and device, electronic equipment and storage medium
CN113836297B (en) Training method and device for text emotion analysis model
CN112818150B (en) Picture content auditing method, device, equipment and medium
CN114510720A (en) Android malicious software classification method based on feature fusion and NLP technology
CN112163217B (en) Malware variant identification method, device, equipment and computer storage medium
CN112417007A (en) Data analysis method and device, electronic equipment and storage medium
CN115859292B (en) Fraud-related APP detection system, fraud-related APP judgment method and storage medium
CN113326536A (en) Method and device for judging compliance of application program
JP5500930B2 (en) Participation examination system, participation examination method, and program
CN110795705A (en) Track data processing method, device, equipment and storage medium
CN114039744B (en) Abnormal behavior prediction method and system based on user feature labels
Sharma Efficient log analysis using advanced detection and filtering techniques
CN115329330A (en) Method and system for identifying android escape software based on function call and condition characteristics
CN116704246A (en) Image identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant