CN114996708B - Method and device for studying and judging fraud-related mobile phone application, electronic equipment and storage medium - Google Patents

Publication number
CN114996708B
Authority
CN
China
Prior art keywords
application
mobile phone
fraud
actual
phone application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210942003.1A
Other languages
Chinese (zh)
Other versions
CN114996708A (en
Inventor
林美玉
常雯
周帅
王卉婷
王栩晨
王建宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Academy of Information and Communications Technology CAICT
Original Assignee
China Academy of Information and Communications Technology CAICT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Academy of Information and Communications Technology CAICT
Priority to CN202210942003.1A
Publication of CN114996708A
Application granted
Publication of CN114996708B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/12 Detection or prevention of fraud
    • H04W12/128 Anti-malware arrangements, e.g. protection against SMS fraud or mobile malware
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033 Test or assess software

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Telephone Function (AREA)

Abstract

The application provides a method and a device for studying and judging a fraud-related mobile phone application, an electronic device and a storage medium, and relates to the technical field of mobile phone applications. The method comprises the following steps: forming a static analysis portrait of the mobile phone application to be studied and judged based on its application basic information; performing simulation operation processing on the mobile phone application to be studied and judged, recording the operation screenshot corresponding to each operation, and determining, according to the operation order of each application function, the operation execution sequence and the operation sequence traffic corresponding to each application function; forming a target application portrait based on the static analysis portrait, each operation screenshot, each operation sequence traffic and each operation execution sequence; and matching the target application portrait with each application study model in the application study model library and, if the matching is successful, determining that the mobile phone application to be studied and judged is a fraud-related mobile phone application. The scheme provided by the embodiments of the application can improve the efficiency of studying and judging fraud-related mobile phone applications.

Description

Method and device for studying and judging fraud-related mobile phone application, electronic equipment and storage medium
Technical Field
The application relates to the technical field of mobile phone application, in particular to a method and a device for researching and judging a fraud-related mobile phone application, an electronic device and a storage medium.
Background
With the rapid development of the mobile internet, fraud carried out through internet platforms is becoming more convenient and more diverse, and new fraud techniques emerge endlessly. An important step in the fraud process is to induce the victim to click a download link or scan a two-dimensional code to download a fake fraud APP, and then to carry out operations such as communication, information theft and fraudulent transfers through the corresponding functions of the fake fraud APP, so as to commit various kinds of fraud such as loan fraud and double-duty fraud, which seriously harms people's property safety. A fake fraud APP differs from a legitimate APP in many ways. For example, the website associated with a decompiled fake fraud APP cannot be found in the domain name information record management system; because the main source code and architecture of fake fraud APPs are relatively fixed, such an APP has a single function and a crude interface; and a fake fraud APP cannot make online payments.
In the prior art, a fraud APP identification method is provided, which includes the steps of obtaining an installation program file of an APP, performing feature extraction on the installation program file, performing multi-mode fusion on features of the installation program file, and identifying the APP based on the fused features and a fraud-related feature library.
The above prior art has the following disadvantages:
Feature extraction is performed only on the installation program file and the mobile phone application is never run, so only basic information can be obtained through analysis. The functions and operation flow of the mobile phone application cannot be fully understood, and a dedicated analysis capability for fraud-related mobile phone applications is difficult to form.
Disclosure of Invention
The embodiment of the application provides a method and a device for studying and judging a fraud-related mobile application, an electronic device and a storage medium, which are used for solving the technical problems that the functions and the operation flow of the fraud-related mobile application cannot be fully known and the special analysis capability for the fraud-related mobile application is difficult to achieve, and improving the studying and judging efficiency of the fraud-related mobile application.
In a first aspect, an embodiment of the present application provides a method for researching and judging a fraud-related mobile application, including:
forming a static analysis portrait of the mobile phone application to be researched and judged based on the application basic information of the mobile phone application to be researched and judged;
performing simulation operation processing on the mobile phone application to be judged, wherein the simulation operation processing is a processing process of traversing each application function of the mobile phone application to be judged, recording an operation screenshot corresponding to each operation, and respectively determining an operation execution sequence and an operation sequence flow corresponding to each application function according to the operation sequence of each application function;
forming a target application representation based on the static analysis representation, each operation screenshot, each operation sequence flow and each operation execution sequence;
and matching the target application portrait with each application study model in the application study model library, and if the matching is successful, determining that the mobile phone application to be studied and judged is a fraud-related mobile phone application.
In one embodiment, forming a target application representation based on a static analysis representation, per-operation screenshots, per-operation sequence flows, and per-operation execution sequences, comprises:
identifying application page characters and information interaction characters in each operation screenshot, and performing sensitive word analysis on the application page characters and the information interaction characters to obtain a fraud-related sensitive word analysis result; performing semantic analysis on the information interactive characters to obtain a fraud-related interactive analysis result;
removing advertisement traffic and promotion traffic in each operation sequence traffic to obtain actual operation traffic corresponding to each operation sequence traffic, and respectively determining an actual domain name and an actual URL corresponding to each operation execution sequence based on each actual operation traffic;
acquiring website record information corresponding to the mobile phone application to be researched and judged according to the actual domain name, determining an actual IP address according to the actual domain name, and determining an actual server attribution according to the actual IP address;
and composing the static analysis portrait, the fraud sensitive word analysis result, the fraud interaction analysis result, the actual domain name, the actual URL, the actual IP address, the actual server attribution and the operation execution sequence into a target application portrait.
In one embodiment, semantic analysis is performed on the information interaction text, and the semantic analysis comprises the following steps:
preprocessing the information interactive characters to obtain effective interactive contents; the preprocessing comprises the steps of carrying out word segmentation processing, symbol removing processing, nonsense word removing processing and word frequency matrix construction processing on the information interactive characters;
and performing semantic analysis on the effective interactive content through a content analysis method.
In one embodiment, forming a static analysis representation of the mobile phone application to be evaluated based on application basic information of the mobile phone application to be evaluated comprises:
decompressing the mobile phone application to be judged to obtain a resource file, and performing decompiling on a dex file in the resource file to obtain a target decompiled file;
determining application basic information based on the resource file and the target decompiled file, wherein the application basic information comprises a resource file directory, resource file names, the permissions the application intends to acquire, and character strings;
determining a file structure of the resource file based on the resource file directory and the resource file names;
determining a permission risk score according to the permissions the application intends to acquire;
extracting a tracing domain name and a tracing IP address based on the character string, and determining a tracing verification result based on the tracing domain name and the tracing IP address;
carrying out binary similarity analysis on the target decompiled file through a binary analysis tool to obtain a binary analysis result;
and forming a static analysis portrait based on the application basic information, the file structure, the permission risk score, the source-tracing verification result and the binary analysis result.
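Two of the static-analysis steps above, the permission risk score and the extraction of traceable domain names and IP addresses from character strings, can be sketched as follows. This is only an illustration under stated assumptions: the permission weights, the regular expressions and the function names `permission_risk_score` and `extract_trace_targets` are hypothetical and do not come from the patent, which does not say how the score is computed.

```python
import re

# Hypothetical weights: the source text does not specify how permissions are scored.
PERMISSION_WEIGHTS = {
    "android.permission.READ_SMS": 3,
    "android.permission.READ_CONTACTS": 3,
    "android.permission.RECORD_AUDIO": 2,
    "android.permission.CAMERA": 1,
}

# Simple illustrative patterns; a real system would need stricter validation.
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)


def permission_risk_score(permissions):
    """Sum the weights of the requested permissions; unknown ones count as 0."""
    return sum(PERMISSION_WEIGHTS.get(p, 0) for p in permissions)


def extract_trace_targets(strings):
    """Return (domains, ips) found in the strings, de-duplicated and sorted."""
    domains, ips = set(), set()
    for text in strings:
        for ip in IP_RE.findall(text):
            # Keep only dotted quads whose parts are valid octets.
            if all(0 <= int(part) <= 255 for part in ip.split(".")):
                ips.add(ip)
        for domain in DOMAIN_RE.findall(text):
            domains.add(domain.lower())
    return sorted(domains), sorted(ips)
```

The domain pattern is deliberately loose and will also match dotted identifiers such as package names, so the candidates it returns would still need the source-tracing verification step described above.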
In one embodiment, determining application basic information based on the resource file and the target decompiled file comprises:
determining a resource file directory and a resource file name based on the resource file;
reading a manifest.
And obtaining the character string in the target decompilation file.
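As a rough illustration of determining the resource file directory and resource file names, the sketch below treats the installation package as a ZIP archive (Android installation packages use the ZIP container format) and groups file names by directory. The function names and the demo file names are hypothetical and chosen only for illustration.

```python
import io
import zipfile
from collections import defaultdict


def list_resource_files(package_bytes):
    """Map each directory in the archive to the sorted file names it contains."""
    tree = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            directory, _, filename = name.rpartition("/")
            tree[directory or "."].append(filename)
    return {d: sorted(files) for d, files in sorted(tree.items())}


def build_demo_archive():
    """Build a small in-memory archive with made-up entries for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("readme.txt", b"")
        archive.writestr("assets/config.json", b"{}")
        archive.writestr("res/layout/main.xml", b"")
        archive.writestr("res/layout/about.xml", b"")
    return buffer.getvalue()
```

Calling `list_resource_files(build_demo_archive())` yields a directory-to-file-names mapping from which a file structure can be derived.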
In one embodiment, decompiling a dex file in a resource file includes:
the dex file is reverse compiled to a Smali or Java pseudo code file, and includes a class.
In one embodiment, after determining that the mobile phone application to be evaluated is a fraud-related mobile phone application, the method further includes:
determining a fraud type of a fraud-related mobile application;
generating alarm information based on the fraud-related mobile application and the fraud type of the fraud-related mobile application, and transmitting the alarm information into a server of a fraud handling department.
In a second aspect, an embodiment of the present application provides a fraud-related mobile application research and judgment apparatus, including:
the static analysis module is used for forming a static analysis portrait of the mobile phone application to be researched and judged based on the application basic information of the mobile phone application to be researched and judged;
the dynamic simulation module is used for performing simulation operation processing on the mobile phone application to be judged, the simulation operation processing is a processing process of traversing each application function of the mobile phone application to be judged, an operation screenshot corresponding to each operation is recorded, and an operation execution sequence and an operation sequence flow corresponding to each application function are respectively determined according to the operation sequence of each application function;
an application representation generation module for forming a target application representation based on the static analysis representation, each operation screenshot, each operation sequence flow and each operation execution sequence;
and the studying and judging module is used for matching the target application portrait with each application study model in the application study model library and, if the matching is successful, determining that the mobile phone application to be studied and judged is a fraud-related mobile phone application.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory storing a computer program, where the processor implements the steps of the method for studying and judging a fraud-related mobile application according to the first aspect when executing the program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, which includes a computer program, and when the computer program is executed by a processor, the steps of the method for studying and judging a fraud-related mobile application according to the first aspect are implemented.
According to the method, the device, the electronic equipment and the storage medium for studying and judging a fraud-related mobile phone application provided by the embodiments of the application, a static analysis portrait of the mobile phone application to be studied and judged is formed based on its application basic information; simulation operation processing is performed on the mobile phone application to be studied and judged, the operation screenshot corresponding to each operation is recorded, and the operation execution sequence and operation sequence traffic corresponding to each application function are determined according to the operation order of each application function; and a target application portrait is formed based on the static analysis portrait, each operation screenshot, each operation sequence traffic and each operation execution sequence. In this way the mobile phone application is analyzed statically while its functions and dynamic operation flow are also fully understood, which forms a dedicated analysis capability for fraud-related mobile phone applications. The target application portrait is then matched with each application study model in the application study model library and, if the matching is successful, the mobile phone application to be studied and judged is determined to be a fraud-related mobile phone application. This realizes active study and judgment of mobile phone applications, improves the efficiency of studying and judging fraud-related mobile phone applications, and protects the information security and safety of mobile phone application users.
Drawings
In order to illustrate the technical solutions in the present application or the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flow chart illustrating a method for evaluating a fraud-related application according to an embodiment of the present application;
FIG. 2 is a second flowchart illustrating a method for evaluating a fraud-related application according to an embodiment of the present application;
FIG. 3 is a third flowchart illustrating a method for evaluating a fraud-related application according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a fraud-related application judging device according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
Fig. 1 is a schematic flow chart illustrating a method for judging a fraud-related application according to an embodiment of the present application. Referring to fig. 1, an embodiment of the present application provides a method for researching and judging a fraud-related mobile application, which includes:
step 101, forming a static analysis portrait of the mobile phone application to be evaluated based on the application basic information of the mobile phone application to be evaluated.
Static analysis refers to a technique for analyzing program behavior without executing the program. Specifically, it is a code analysis technique that scans program code by means of lexical analysis, syntactic analysis, control flow analysis, data flow analysis and the like, to verify, without running the code, whether the code meets indexes such as normalization, security, reliability and maintainability. In the embodiment of the application, static analysis mainly analyzes the application basic information of the mobile phone application to be studied and judged to gain a preliminary understanding of its outline and functions, thereby forming the corresponding static analysis portrait. The static analysis portrait can be understood as a set of labels reflecting the outline and functions of the mobile phone application to be studied and judged, and this label set is used to profile the application.
Step 102, performing simulation operation processing on the mobile phone application to be studied and judged, recording the operation screenshot corresponding to each operation, and determining, according to the operation order of each application function, the operation execution sequence and the operation sequence traffic corresponding to each application function.
In the embodiment of the application, the simulation operation processing is a process of traversing each application function of the mobile phone application to be studied and judged, that is, each application function is operated once in simulation. It can be understood that a user of the application necessarily performs a series of basic operations, such as clicking, dragging, pressing the return key and entering text, and complex operations, such as logging in, logging out, binding an account and chatting by means of a chat robot. These operations traverse every functional part of the application in depth and trigger more of its code logic and hidden functions. Therefore, when each operation is performed, the dynamic screenshot corresponding to that operation, namely the operation screenshot, needs to be recorded so as to capture how the interface changes after the operation. In addition, an operation execution sequence needs to be formed for each application function; the operation execution sequence consists of the operations that complete the application function, arranged in execution order. The operation traffic consumed in completing each operation execution sequence, that is, the traffic needed to implement each application function, also needs to be recorded.
It can also be understood that, in the mobile phone application to be studied and judged, the operation execution sequences of the fraud-related personnel and of the victim user may differ. The operation execution sequences of the fraud-related personnel need to be recorded with particular emphasis, so as to simulate how the fraud-related personnel operate each application function of the application.
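One possible way to hold the records produced by the simulation operation processing is sketched below: each application function gets a trace that stores its operations in execution order, the screenshot taken after each operation, and the traffic consumed. The class names, field names and screenshot naming scheme are assumptions made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    action: str      # e.g. "click", "drag", "back", "input_text"
    target: str      # hypothetical identifier of the UI element operated on
    screenshot: str  # file name of the screenshot recorded after the operation


@dataclass
class FunctionTrace:
    function_name: str
    operations: list = field(default_factory=list)
    traffic_bytes: int = 0  # traffic consumed by this function's operations

    def record(self, action, target, traffic_bytes=0):
        """Append one operation and return the screenshot name assigned to it."""
        index = len(self.operations)
        shot = f"{self.function_name}_{index:03d}.png"
        self.operations.append(Operation(action, target, shot))
        self.traffic_bytes += traffic_bytes
        return shot

    def execution_sequence(self):
        """Return the operations in execution order as (action, target) pairs."""
        return [(op.action, op.target) for op in self.operations]
```

Keeping one trace per application function makes it straightforward to pair every operation execution sequence with its screenshots and its traffic total.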
Step 103, forming a target application portrait based on the static analysis portrait, each operation screenshot, each operation sequence traffic and each operation execution sequence.
After the static analysis and the simulation operation processing, the static analysis portrait of the mobile phone application to be studied and judged and the detailed information generated during dynamic operation are available, the detailed information being each operation screenshot, each operation sequence traffic and each operation execution sequence. The detailed information is then processed in depth: the pictures, interactive texts, traffic features and the like that it contains are analyzed, the relevant key information is extracted, and this key information together with the static analysis portrait forms the target application portrait corresponding to the mobile phone application to be studied and judged.
Step 104, matching the target application portrait with each application study model in the application study model library.
In the embodiment of the application, the application study model library is a model library containing a large number of fraud-related mobile phone application portraits, formed by analyzing the functions and user behaviors of a large number of fraud-related mobile phone applications. These portraits are classified into a plurality of application study models according to the different characteristics of the fraud techniques, including but not limited to investment and financing models, lottery betting models, pornography models, loan models, swipes models and other models. The target application portrait and the fraud-related mobile phone application portraits in each application study model share a uniform format, so the target application portrait is input into each application study model and its similarity to each fraud-related mobile phone application portrait is compared. If the similarity within the current application study model exceeds a preset threshold, the target application portrait is determined to match the current application study model, and the mobile phone application to be studied and judged is determined to be a fraud-related mobile phone application.
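The threshold-based matching can be sketched as follows. The patent does not say which similarity measure is used, so this example assumes that portraits are label sets and substitutes Jaccard similarity; the function names, the label values and the default threshold are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two label collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_portrait(target_labels, model_library, threshold=0.5):
    """Return (model_name, similarity) for the best match above threshold, else None.

    model_library maps a model name (e.g. a fraud type) to a list of
    known fraud-related application portraits, each a set of labels.
    """
    best = None
    for model_name, portraits in model_library.items():
        for portrait in portraits:
            score = jaccard(target_labels, portrait)
            # "Exceeds a preset threshold" is read here as strictly greater.
            if score > threshold and (best is None or score > best[1]):
                best = (model_name, score)
    return best
```

Returning the name of the matched model is convenient because the fraud type of the application can then be read directly from the model that matched.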
The fraud type of the fraud-related mobile phone application can further be determined from the fraud type of the current application study model. Alarm information is then generated based on the fraud-related mobile phone application and its fraud type and is transmitted to a server of a fraud handling department, thereby reporting the fraud-related mobile phone application to a supervision department or a security company and providing the supervision department with clues and evidence. The supervision department is expected to block the download link and server IP (Internet Protocol) address of the fraud-related mobile phone application, so that new users cannot download it and users who have already installed it cannot access its links; at the same time, users who have installed the fraud-related mobile phone application are dissuaded from using it.
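A minimal sketch of generating the alarm information is given below. The patent does not define the alarm format or the transport to the server, so the JSON layout, the field names and the function name `build_alarm` are all assumptions; sending the payload is left out.

```python
import json
from datetime import datetime, timezone


def build_alarm(app_name, package_hash, fraud_type, evidence, now=None):
    """Serialize hypothetical alarm information as a JSON string."""
    now = now or datetime.now(timezone.utc)
    alarm = {
        "app_name": app_name,
        "package_hash": package_hash,  # hypothetical identifier of the package
        "fraud_type": fraud_type,      # taken from the matched study model
        "evidence": evidence,          # e.g. actual domain names, URLs, IPs
        "generated_at": now.isoformat(),
    }
    return json.dumps(alarm, ensure_ascii=False, sort_keys=True)
```

The `now` parameter exists only so that the timestamp can be fixed when the function is tested.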
The following advantageous effects can be seen from the above examples:
the method comprises the steps of forming a static analysis image of the mobile phone application to be researched and judged based on application basic information of the mobile phone application to be researched and judged, performing simulation operation processing on the mobile phone application to be researched and judged, recording operation screenshots corresponding to each operation respectively, determining an operation execution sequence and operation sequence flow corresponding to each application function respectively according to the operation sequence of each application function, and forming a target application image based on the static analysis image, each operation screenshot, each operation sequence flow and each operation execution sequence.
For ease of understanding, the following provides an embodiment of a method for judging a fraud-related application, in which a static analysis and a simulation operation are combined to generate a target application image for judgment, so as to improve the efficiency and accuracy of the judgment.
Fig. 2 is a second flowchart illustrating a method for evaluating a fraud-related application according to an embodiment of the present application. Referring to fig. 2, an embodiment of the present application provides a method for researching and judging a fraud-related mobile application, which includes:
step 201, identifying application page characters and information interaction characters in each operation screenshot, performing sensitive word analysis on the application page characters and the information interaction characters, and performing semantic analysis on the information interaction characters.
After the simulation operation processing, the detailed information generated during dynamic operation is available, and each operation screenshot in that information records how the interface changed. Each application page necessarily contains explanatory text used for publicity and display, namely the application page characters. The screenshots also contain the text of the information exchanged, through the chat robot, with the operators behind the fraud-related mobile phone application, namely the information interaction characters, which may be text in application announcements, text in pop-up windows, text in chat records, and the like. In the embodiment of the application, the chat robot may be obtained by pre-training or by other methods, which is not uniquely limited here; it is mainly used to obtain, through interaction, information such as the domain name, IP server, external links, two-dimensional codes and transfer account numbers of the fraud-related personnel.
In the embodiment of the application, the method for identifying the application page text and the information interaction text in each operation screenshot may adopt OCR, that is, optical character recognition, or other text recognition methods, and an appropriate recognition method needs to be selected according to an actual application situation, which is not limited uniquely here.
Further, sensitive word analysis is performed on the application page characters and the information interaction characters, that is, it is determined whether fraud-related sensitive words exist in them, to obtain the fraud-related sensitive word analysis result. The result may take the form of a score that is directly proportional to the number of sensitive words found, or the sensitive words themselves may be extracted. The form of the fraud-related sensitive word analysis result needs to be determined according to the actual application, and is not uniquely limited here.
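The scoring form of the sensitive word analysis can be illustrated with a short sketch in which the score grows in direct proportion to the number of sensitive-word occurrences and the matched words are returned as well. The word list, the weight and the function name are hypothetical; the text is assumed to have been recognized from the screenshots already.

```python
# Hypothetical sensitive-word list; the patent does not enumerate one.
SENSITIVE_WORDS = (
    "guaranteed return",
    "unfreeze",
    "margin deposit",
    "verification fee",
)


def sensitive_word_analysis(texts, words=SENSITIVE_WORDS, weight=1.0):
    """Count sensitive-word occurrences and score them proportionally."""
    hits = {}
    for text in texts:
        lowered = text.lower()
        for word in words:
            count = lowered.count(word)
            if count:
                hits[word] = hits.get(word, 0) + count
    return {"score": weight * sum(hits.values()), "hits": hits}
```

Substring counting is the simplest possible matcher; segmented text or a multi-pattern matcher could be swapped in without changing the shape of the result.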
Semantic analysis is performed on the information interaction characters to obtain the fraud-related interaction analysis result. Specifically, the information interaction characters are first preprocessed to obtain effective interactive content. The preprocessing comprises word segmentation, symbol removal, nonsense-word removal and word frequency matrix construction: word segmentation splits a complete sentence into words; symbol removal eliminates symbols such as punctuation marks and mathematical symbols; and nonsense-word removal eliminates words that convey no actual information, such as prepositions, interjections and auxiliary words. Semantic analysis is then performed on the effective interactive content using a dictionary-based content analysis method to obtain the fraud-related interaction analysis result.
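The four preprocessing steps can be sketched as below. The sketch assumes English-like text that can be tokenized with a regular expression; Chinese chat text would need a dedicated word segmenter instead. The stop-word list and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical list of words that carry no actual information.
STOP_WORDS = {"the", "a", "an", "to", "of", "in", "oh", "please"}


def preprocess(message):
    """Segment into lowercase tokens, dropping symbols and stop words."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())  # symbols are discarded
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequency_matrix(messages):
    """Return (vocabulary, matrix) with one row of term counts per message."""
    docs = [Counter(preprocess(m)) for m in messages]
    vocab = sorted(set().union(*docs)) if docs else []
    matrix = [[doc[term] for term in vocab] for doc in docs]
    return vocab, matrix
```

The resulting matrix is the kind of input on which a dictionary-based content analysis could then count category terms.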
Therefore, the fraud-related key information can be obtained based on the fraud-related sensitive word analysis result and the fraud-related interaction analysis result.
Step 202, removing advertisement traffic and promotion traffic in each operation sequence traffic, and determining an actual domain name and an actual URL corresponding to each operation execution sequence.
It can be understood that clicking a link may trigger many redirect chains, which jump automatically and implement functions such as pop-up advertisements or pop-up promotional information. The actual operation traffic corresponding to each operation sequence traffic can therefore be obtained only after these redirect chains, that is, the advertisement traffic and the promotion traffic in each operation sequence traffic, are removed. The actual domain name and the actual URL corresponding to each operation execution sequence are then determined based on each actual operation traffic.
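A minimal sketch of this filtering step is given below. It assumes that advertisement and promotion traffic can be recognized by host name against a blocklist and that the last remaining request in a sequence is the actual landing URL; both assumptions, the blocklist `AD_HOST_SUFFIXES` and the function names are hypothetical and are not specified by the application.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist of advertisement and promotion hosts.
AD_HOST_SUFFIXES = ("ads.example.net", "promo.example.org")


def is_ad_host(host: str) -> bool:
    """True if the host equals a blocklisted suffix or is a subdomain of one."""
    return any(host == s or host.endswith("." + s) for s in AD_HOST_SUFFIXES)


def actual_endpoint(request_urls: list[str]):
    """Drop ad/promotion requests, then report the remaining traffic, domain and URL."""
    actual = [u for u in request_urls if not is_ad_host(urlsplit(u).hostname or "")]
    if not actual:
        return [], None, None
    final_url = actual[-1]  # assumption: the last non-ad request is the real target
    return actual, urlsplit(final_url).hostname, final_url


sequence = [
    "http://cdn.ads.example.net/pop?id=1",
    "https://promo.example.org/banner",
    "https://pay.example.com/login?step=1",
]
traffic, domain, url = actual_endpoint(sequence)
print(domain, url)
```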
In the embodiment of the present application, statistical analysis may also be performed on the actual operation traffic corresponding to each operation sequence traffic to obtain multi-dimensional traffic features. Specifically, mitmproxy may be used to capture packets; on the one hand, data for each statistical dimension is extracted, and on the other hand, traffic fingerprints are compared against a blacklist library. The traffic features include: source IP, source port, destination IP, destination port, protocol, number of packets sent, number of packets received, number of bytes sent, number of bytes received, DNS, TTL, HTTP headers, whether TLS is present, the TLS handshake stage, the TLS encryption algorithm, the certificate, a byte statistics table of 256 statistical dimensions, and the like. There are nearly 400 statistical dimensions in total, and the multi-dimensional traffic features are mainly used for machine learning.
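To make a few of these statistical dimensions concrete, the sketch below computes packet counts, byte counts and a 256-bin byte statistics table from already captured packets. The `Packet` record and its field names are assumptions made for this sketch; packet capture itself (for example with mitmproxy) is outside its scope, and only a small subset of the roughly 400 dimensions is shown.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Simplified captured packet; the field names are assumptions for this sketch."""
    outbound: bool
    payload: bytes


def traffic_features(packets: list[Packet]) -> list[int]:
    """Return [sent count, received count, sent bytes, received bytes] + 256 byte counts."""
    sent = [p for p in packets if p.outbound]
    received = [p for p in packets if not p.outbound]
    histogram = [0] * 256
    for p in packets:
        for byte in p.payload:  # iterating over bytes yields integers 0..255
            histogram[byte] += 1
    return [
        len(sent),
        len(received),
        sum(len(p.payload) for p in sent),
        sum(len(p.payload) for p in received),
    ] + histogram


packets = [Packet(True, b"\x00\x01\x01"), Packet(False, b"\xff")]
features = traffic_features(packets)
print(features[:4], len(features))
```

A fixed-length numeric vector of this kind is the usual input shape for a machine learning model, which is the stated purpose of the multi-dimensional traffic features.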
Step 203, acquiring website record information corresponding to the mobile phone application to be researched and judged according to the actual domain name, determining an actual IP address according to the actual domain name, and determining an actual server attribution according to the actual IP address.
The website record information includes, but is not limited to, the recorded home page address, website record number, record name, recording entity, backup domain names of the website, registered enterprise business information, record review time, and the like, and serves as one of the index data for subsequent study and judgment. The actual server attribution is used to determine, for example, whether the server is located overseas.
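The chain from actual domain name to actual IP address to actual server attribution can be sketched as below. The attribution table `ATTRIBUTION_TABLE` and its labels are hypothetical (documentation address ranges are used), and the resolver is injected so that the example runs without network access; in practice a function such as `socket.gethostbyname` could be passed in.

```python
import ipaddress

# Hypothetical attribution table mapping address ranges to a location label.
ATTRIBUTION_TABLE = {
    "203.0.113.0/24": "overseas",
    "198.51.100.0/24": "domestic",
}


def server_attribution(ip: str, table=ATTRIBUTION_TABLE) -> str:
    """Look up which configured range, if any, contains the address."""
    address = ipaddress.ip_address(ip)
    for cidr, label in table.items():
        if address in ipaddress.ip_network(cidr):
            return label
    return "unknown"


def resolve_and_attribute(domain: str, resolver):
    """Resolve the actual domain name, then determine the server attribution."""
    ip = resolver(domain)
    return ip, server_attribution(ip)


def stub_resolver(domain: str) -> str:
    # Stand-in for a real DNS lookup, so the sketch needs no network.
    return {"pay.example.com": "203.0.113.7"}[domain]


ip, attribution = resolve_and_attribute("pay.example.com", stub_resolver)
print(ip, attribution)
```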
Step 204, compose the static analysis portrait, the fraud-related sensitive word analysis result, the fraud-related interaction analysis result, the actual domain name, the actual URL, the actual IP address, the actual server attribution and the operation execution sequence into a target application portrait.
In the embodiment of the present application, the target application portrait comprises at least the static analysis portrait, the fraud-related sensitive word analysis result, the fraud-related interaction analysis result, the actual domain name, the actual URL, the actual IP address, the actual server attribution and the operation execution sequence.
The following advantageous effects can be seen from the above examples:
application page text and information interaction text are identified in each operation screenshot, sensitive word analysis is performed on both, and semantic analysis is performed on the information interaction text; advertisement traffic and promotion traffic are removed from each operation sequence traffic, and the actual domain name and the actual URL corresponding to each operation execution sequence are determined; website record information corresponding to the mobile phone application to be studied and judged is acquired according to the actual domain name, the actual IP address is determined according to the actual domain name, and the actual server attribution is determined according to the actual IP address; and the static analysis portrait, the fraud-related sensitive word analysis result, the fraud-related interaction analysis result, the actual domain name, the actual URL, the actual IP address, the actual server attribution and the operation execution sequences are composed into the target application portrait. The target application portrait is thus generated automatically, which reduces the manpower required of analysts, lowers the analysis difficulty, and improves the efficiency of studying and judging fraud-related mobile phone applications.
For ease of understanding, an embodiment of the method for studying and judging a mobile phone application is described below. In practical application, the mobile phone application to be studied and judged is first decompressed, the resource file obtained by decompression is decompiled, and application basic information is formed, so that a static analysis portrait is formed according to the application basic information.
Fig. 3 is a third flowchart illustrating a method for studying and judging a fraud-related mobile phone application according to an embodiment of the present application. Referring to Fig. 3, an embodiment of the present application provides a method for studying and judging a fraud-related mobile phone application, which includes:
step 301, decompressing the mobile phone application to be researched and judged to obtain a resource file, and performing decompiling on a dex file in the resource file to obtain a target decompiled file.
Specifically, the dex file is decompiled into a Smali or Java pseudo-code file, the dex file includes a class. It is understood that the Java part is a dex file, which is decompiled into a smali file or a Java file, while the C/C++ part is a .so file, which is decompiled into an assembly language file; C-language pseudo-code can be generated using IDA.
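The decompression part of step 301 can be sketched with the standard library, since an Android installation package is a ZIP archive. The sketch below only lists the archive members and separates dex files from native .so libraries; the decompilation itself is left to external tools and is not shown. The function name `extract_resource_listing` and the in-memory sample archive are assumptions made for illustration.

```python
import io
import zipfile


def extract_resource_listing(apk_bytes: bytes) -> dict:
    """List archive members and pick out the dex files and native libraries."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        names = apk.namelist()
    return {
        "all": names,
        "dex": [n for n in names if n.endswith(".dex")],
        "native": [n for n in names if n.endswith(".so")],
    }


# Build a tiny in-memory archive so the sketch runs without a real package.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("AndroidManifest.xml", b"")
    archive.writestr("classes.dex", b"dex\n035\x00")
    archive.writestr("lib/arm64-v8a/libnative.so", b"\x7fELF")

listing = extract_resource_listing(buffer.getvalue())
print(listing["dex"], listing["native"])
```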
Step 302, determining application basic information based on the resource file and the target decompiled file.
The application basic information comprises a resource file directory, a resource file name, an application quasi-acquisition authority and a character string, and can also comprise information such as a hash value, an application name, an installation package name, version information, a file size, a certificate and an application icon.
The resource file directory and the resource file names are determined based on the resource files. It can be understood that the resource files are all the files obtained by decompressing the software package of the mobile phone application to be studied and judged, so the file directory and the file names of the resource files can be determined directly. In addition, reading the manifest. In addition, in the embodiment of the present application, the character string refers to a code character string in the target decompiled file, so the character string can be obtained from the target decompiled file. The application basic information is thereby determined.
Step 303, determining the file structure of the resource file based on the resource file directory and the resource file name.
The file composition architecture of the resource file, that is, the file structure, can be determined based on the resource file directory and the resource file name. The file structure is determined because most fraud-related mobile phone applications are generated in batches using relatively simple and mature technical means such as H5 technology, so their file structures are basically the same; that is, their resource file directories and resource file names are very similar. Similarly, the code structure of most fraud-related mobile phone applications is also basically the same, so analyzing the code structure can also assist in determining whether the mobile phone application to be studied and judged is a fraud-related mobile phone application; the code structure is the file structure of the target decompiled file.
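One simple way to quantify how similar two file structures are is the Jaccard similarity of their path sets, sketched below. The application does not specify a similarity measure; Jaccard similarity and the sample paths are assumptions chosen for illustration.

```python
def structure_similarity(paths_a, paths_b) -> float:
    """Jaccard similarity of two sets of resource file paths (1.0 means identical)."""
    a, b = set(paths_a), set(paths_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


known_fraud_app = [
    "assets/www/index.html",
    "assets/www/js/app.js",
    "res/layout/main.xml",
]
candidate_app = [
    "assets/www/index.html",
    "assets/www/js/app.js",
    "res/layout/splash.xml",
]
similarity = structure_similarity(known_fraud_app, candidate_app)
print(similarity)
```

A high similarity to the structure of a known batch-generated application would then be one signal among several, not a verdict on its own.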
Step 304, determining the authority risk score according to the application planned acquisition authority.
In the embodiment of the present application, the permissions the application intends to acquire may include hardware access permissions and system access permissions. The hardware access permissions are specifically access permissions for hardware such as the microphone, camera, WiFi module, GPS module, NFC module and Bluetooth module; the system access permissions are specifically access permissions for system locations such as the address book, messages, photo album and file system. From the permissions the application intends to acquire, it can be determined whether sensitive permissions such as hardware access permissions and system access permissions are requested, and whether unnecessary permissions are requested; for example, it is unreasonable for calendar software to request access to the address book and the camera. Scoring can therefore be performed according to the permissions requested by the mobile phone application to be studied and judged, for example by first analyzing each single permission, then analyzing combinations of multiple permissions, and finally performing an overall evaluation. The authority risk can then be defined as low risk, medium risk or high risk according to the scoring result. It should be understood that the above manner of evaluating and analyzing the permissions and of defining the authority risk is merely exemplary; in practical applications, an appropriate manner is selected according to the actual application scenario, which is not limited here.
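The three-stage evaluation described above (single permissions, then combinations, then an overall level) can be sketched as follows. The weights, the risky combination, the thresholds and the notion of an expected permission set per application category are all hypothetical values chosen for illustration.

```python
# Hypothetical per-permission weights and risky combinations.
PERMISSION_WEIGHTS = {
    "READ_CONTACTS": 3,
    "CAMERA": 2,
    "RECORD_AUDIO": 3,
    "READ_SMS": 4,
    "ACCESS_FINE_LOCATION": 2,
}
RISKY_COMBINATIONS = {frozenset({"READ_SMS", "READ_CONTACTS"}): 5}


def permission_risk(requested, expected=frozenset()):
    """Score unnecessary single permissions, add combination bonuses, then grade."""
    requested = set(requested)
    unnecessary = requested - set(expected)
    single = sum(PERMISSION_WEIGHTS.get(p, 0) for p in unnecessary)
    combo = sum(
        bonus for combination, bonus in RISKY_COMBINATIONS.items()
        if combination <= requested
    )
    score = single + combo
    level = "high" if score >= 10 else "medium" if score >= 5 else "low"
    return score, level


# A calendar application asking for the address book and the camera.
calendar_risk = permission_risk(
    {"READ_CONTACTS", "CAMERA", "READ_CALENDAR"}, expected={"READ_CALENDAR"}
)
sms_risk = permission_risk({"READ_SMS", "READ_CONTACTS"})
print(calendar_risk, sms_risk)
```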
Step 305, extracting a tracing domain name and a tracing IP address based on the character string, and determining a tracing verification result based on the tracing domain name and the tracing IP address.
The tracing domain name and the tracing IP address extracted from the character string can be regarded as the domain name and IP address fixed in the mobile phone application to be studied and judged. However, they may have expired or may have been forged by a fraudster, so they need to be verified against the record information of the Ministry of Industry and Information Technology to determine whether the current tracing domain name and tracing IP address are normal. The analysis and verification can be performed along dimensions such as record time, update time, validity period and legal representative, which are not uniquely limited here.
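The extraction half of step 305 can be sketched with regular expressions, as below. The patterns and the function name `extract_tracing_targets` are assumptions; the domain pattern is deliberately loose and would also match package names or file names, so a real pipeline would need additional filtering. Verification against record information is not shown.

```python
import ipaddress
import re

# Loose illustrative patterns; not taken from the application.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN_PATTERN = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def extract_tracing_targets(strings):
    """Collect candidate domain names and valid IPv4 addresses from code strings."""
    domains, ips = set(), set()
    for s in strings:
        for candidate in IP_PATTERN.findall(s):
            try:
                ips.add(str(ipaddress.ip_address(candidate)))
            except ValueError:
                pass  # looks like an address but is out of range, e.g. 999.1.1.1
        for candidate in DOMAIN_PATTERN.findall(s):
            domains.add(candidate.lower())
    return sorted(domains), sorted(ips)


code_strings = [
    "https://api.Example.com/v1/login",
    "connect to 203.0.113.9:8080",
    "version 999.1.1.1 build",
]
tracing_domains, tracing_ips = extract_tracing_targets(code_strings)
print(tracing_domains, tracing_ips)
```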
Step 306, carrying out binary similarity analysis on the target decompiled file through a binary analysis tool to obtain a binary analysis result.
In the embodiment of the present application, radare2 may be used as the binary analysis tool to perform the binary similarity analysis; the resulting binary analysis result can reflect information such as whether malicious code exists and which software development kit is used.
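To illustrate the general idea of binary similarity, the sketch below compares two byte sequences by the Jaccard similarity of their byte n-grams. This is a deliberately simple stand-in and is not the algorithm used by radare2 or by the application; the function names and the n-gram length are assumptions.

```python
def ngram_set(data: bytes, n: int = 4) -> set:
    """All distinct byte substrings of length n."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def binary_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of the n-gram sets of two binaries."""
    grams_a, grams_b = ngram_set(a, n), ngram_set(b, n)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


identical = binary_similarity(b"abcdef", b"abcdef")
partial = binary_similarity(b"abcdefgh", b"abcdefXY")
print(identical, round(partial, 3))
```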
Step 307, forming a static analysis portrait based on the application basic information, the file structure, the authority risk score, the tracing verification result and the binary analysis result.
In the embodiment of the present application, the static analysis portrait comprises at least the application basic information, the file structure, the authority risk score, the tracing verification result and the binary analysis result.
The following advantageous effects can be seen from the above examples:
the mobile phone application to be studied and judged is decompressed to obtain a resource file, and the dex file in the resource file is decompiled to obtain a target decompiled file; application basic information is determined based on the resource file and the target decompiled file; the file structure of the resource file is determined based on the resource file directory and the resource file name; the authority risk score is determined according to the permissions the application intends to acquire; a tracing domain name and a tracing IP address are extracted based on the character string, and a tracing verification result is determined based on them; binary similarity analysis is performed on the target decompiled file through a binary analysis tool to obtain a binary analysis result; and the static analysis portrait is formed based on the application basic information, the file structure, the authority risk score, the tracing verification result and the binary analysis result. The static analysis portrait reflects the general situation and functions of the mobile phone application to be studied and judged, and the application is profiled using a set of labels, so that its general situation can be fully understood, which helps improve both the efficiency and the accuracy of study and judgment.
The device for studying and judging a fraud-related mobile phone application provided in the embodiments of the present application is described below; the device described below and the method described above may be referred to in correspondence with each other.
Fig. 4 is a schematic structural diagram of a device for studying and judging a fraud-related mobile phone application according to an embodiment of the present application. Referring to Fig. 4, an embodiment of the present application provides a device for studying and judging a fraud-related mobile phone application, which includes:
the static analysis module 410 is used for forming a static analysis portrait of the mobile phone application to be researched based on the application basic information of the mobile phone application to be researched;
the dynamic simulation module 420 is configured to perform simulation operation processing on the mobile phone application to be evaluated, where the simulation operation processing is a processing procedure of traversing each application function of the mobile phone application to be evaluated, record an operation screenshot corresponding to each operation, and determine an operation execution sequence and an operation sequence flow corresponding to each application function according to an operation sequence of each application function;
an application representation generation module 430 for forming a target application representation based on the static analysis representation, each operation screenshot, each operation sequence flow, and each operation execution sequence;
the judging module 440 is configured to match the target application representation with each application judging model in the application judging model library, and if the matching is successful, determine that the mobile phone application to be judged is a fraud-related mobile phone application.
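The matching performed by the judging module can be sketched as follows, under the assumption that the target application portrait is a dictionary of labels and that each application judging model is a predicate over that dictionary. The application does not define the model format; the model names, portrait keys and thresholds below are hypothetical.

```python
from typing import Callable

Portrait = dict
Model = Callable[[Portrait], bool]


def high_risk_permissions_overseas_server(portrait: Portrait) -> bool:
    return (
        portrait.get("permission_level") == "high"
        and portrait.get("server_attribution") == "overseas"
    )


def many_sensitive_words(portrait: Portrait) -> bool:
    return portrait.get("sensitive_word_score", 0) >= 5


# Hypothetical application judging model library.
MODEL_LIBRARY: dict[str, Model] = {
    "high_risk_permissions_overseas_server": high_risk_permissions_overseas_server,
    "many_sensitive_words": many_sensitive_words,
}


def judge(portrait: Portrait, library=MODEL_LIBRARY) -> list[str]:
    """Return the names of all models the portrait matches; any match flags the app."""
    return [name for name, model in library.items() if model(portrait)]


suspect = {"permission_level": "high", "server_attribution": "overseas", "sensitive_word_score": 2}
benign = {"permission_level": "low", "server_attribution": "domestic", "sensitive_word_score": 0}
print(judge(suspect), judge(benign))
```

Returning the matched model names rather than a single boolean keeps the reason for the determination available for later steps.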
Fig. 5 illustrates the physical structure of an electronic device. As shown in Fig. 5, the electronic device may include a processor 510, a communication interface 520, a memory 530 and a communication bus 540, wherein the processor 510, the communication interface 520 and the memory 530 communicate with one another via the communication bus 540. The processor 510 may call the computer program in the memory 530 to execute the steps of the method for studying and judging a fraud-related mobile phone application, for example including:
forming a static analysis portrait of the mobile phone application to be researched and judged based on the application basic information of the mobile phone application to be researched and judged;
performing simulation operation processing on the mobile phone application to be judged, wherein the simulation operation processing is a processing process of traversing each application function of the mobile phone application to be judged, recording an operation screenshot corresponding to each operation, and determining an operation execution sequence and an operation sequence flow corresponding to each application function according to the operation sequence of each application function;
forming a target application representation based on the static analysis representation, each operation screenshot, each operation sequence flow and each operation execution sequence;
and matching the target application portrait with each application study model in the application study model library, and if the matching is successful, determining that the mobile phone application to be studied and judged is a fraud-related mobile phone application.
Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solutions of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, embodiments of the present application further provide a computer-readable storage medium comprising a computer program, where the computer program can be stored on a non-transitory computer-readable storage medium, and when the computer program is executed by a processor, the computer is capable of executing the steps of the method for studying and judging a fraud-related mobile phone application provided in the foregoing embodiments, for example including:
forming a static analysis portrait of the mobile phone application to be researched and judged based on the application basic information of the mobile phone application to be researched and judged;
performing simulation operation processing on the mobile phone application to be judged, wherein the simulation operation processing is a processing process of traversing each application function of the mobile phone application to be judged, recording an operation screenshot corresponding to each operation, and respectively determining an operation execution sequence and an operation sequence flow corresponding to each application function according to the operation sequence of each application function;
forming a target application representation based on the static analysis representation, each operation screenshot, each operation sequence flow and each operation execution sequence;
and matching the target application portrait with each application studying and judging model in the application studying and judging model library, and if the matching is successful, determining the mobile phone application to be studied and judged as a mobile phone application related to fraud.
On the other hand, embodiments of the present application further provide a processor-readable storage medium, where the processor-readable storage medium stores a computer program, where the computer program is configured to cause a processor to perform the steps of the method provided in each of the above embodiments, for example, including:
forming a static analysis portrait of the mobile phone application to be researched and judged based on the application basic information of the mobile phone application to be researched and judged;
performing simulation operation processing on the mobile phone application to be judged, wherein the simulation operation processing is a processing process of traversing each application function of the mobile phone application to be judged, recording an operation screenshot corresponding to each operation, and determining an operation execution sequence and an operation sequence flow corresponding to each application function according to the operation sequence of each application function;
forming a target application representation based on the static analysis representation, each operation screenshot, each operation sequence flow and each operation execution sequence;
and matching the target application portrait with each application studying and judging model in the application studying and judging model library, and if the matching is successful, determining the mobile phone application to be studied and judged as a mobile phone application related to fraud.
The processor-readable storage medium can be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor memory (e.g., ROM, EPROM, EEPROM, non-volatile memory (NAND flash), solid state disks (SSDs)), etc.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present application.

Claims (9)

1. A method for studying and judging a fraud-related mobile phone application is characterized by comprising the following steps:
forming a static analysis portrait of the mobile phone application to be researched and judged based on application basic information of the mobile phone application to be researched and judged;
performing simulation operation processing on the mobile phone application to be judged, wherein the simulation operation processing is a processing process of traversing each application function of the mobile phone application to be judged, recording an operation screenshot corresponding to each operation, and determining an operation execution sequence and an operation sequence flow corresponding to each application function according to an operation sequence of each application function;
forming a target application representation based on the static analysis representation, each operation screenshot, each operation sequence flow and each operation execution sequence;
matching the target application portrait with each application study model in an application study model library, and if the matching is successful, determining that the mobile phone application to be studied and judged is a fraud-related mobile phone application;
wherein said forming a target application representation based on said static analysis representation, each operation screenshot, each operation sequence flow, and each operation execution sequence comprises:
identifying application page characters and information interaction characters in each operation screenshot, and performing sensitive word analysis on the application page characters and the information interaction characters to obtain a fraud-related sensitive word analysis result; performing semantic analysis on the information interactive characters to obtain a fraud-related interactive analysis result;
removing advertisement traffic and promotion traffic in each operation sequence traffic to obtain actual operation traffic corresponding to each operation sequence traffic, and respectively determining an actual domain name and an actual URL corresponding to each operation execution sequence based on each actual operation traffic;
performing statistical analysis on actual operation flows corresponding to the flows of each operation sequence respectively to obtain multidimensional flow characteristics, wherein the multidimensional flow characteristics are used for machine learning;
acquiring website record information corresponding to the mobile phone application to be researched and judged according to the actual domain name, determining an actual IP address according to the actual domain name, and determining an actual server home location according to the actual IP address;
composing the static analysis portrait, the fraud-related sensitive word analysis result, the fraud-related interaction analysis result, the actual domain name, the actual URL, the actual IP address, the actual server home location, and the operation execution sequence into the target application portrait.
2. The method as recited in claim 1,
the semantic analysis of the information interaction words comprises the following steps:
preprocessing the information interactive characters to obtain effective interactive contents; the preprocessing comprises word segmentation processing, symbol removing processing, nonsense word removing processing and word frequency matrix construction processing on the information interactive words;
and performing semantic analysis on the effective interactive content through a content analysis method.
3. The method as recited in claim 1,
the method for forming the static analysis portrait of the mobile phone application to be researched and judged based on the application basic information of the mobile phone application to be researched and judged comprises the following steps:
decompressing the mobile phone application to be researched and judged to obtain a resource file, and performing decompiling on a dex file in the resource file to obtain a target decompiled file;
determining the application basic information based on the resource file and the target decompiled file, wherein the application basic information comprises a resource file directory, a resource file name, an application quasi-acquisition authority and a character string;
determining a file structure of the resource file based on the resource file directory and the resource file name;
determining an authority risk score according to the application quasi-acquisition authority;
extracting a source tracing domain name and a source tracing IP address based on the character string, and determining a source tracing verification result based on the source tracing domain name and the source tracing IP address;
performing binary similarity analysis on the target decompiled file through a binary analysis tool to obtain a binary analysis result;
and forming the static analysis portrait based on the application basic information, the file structure, the authority risk score, the traceability verification result and the binary analysis result.
4. The method as recited in claim 3,
the determining the application base information based on the resource file and the target decompiled file includes:
determining the resource file directory and the resource file name based on the resource file;
reading a manifest.xml file in the target decompilation file to determine the application to-be-acquired permission;
and acquiring the character string from the target decompiled file.
5. The method as recited in claim 3,
the decompiling the dex file in the resource file comprises:
decompiling the dex file into a Smali or Java pseudo-code file, wherein the dex file comprises a class.
6. The method as recited in claim 1,
after the determination that the mobile phone application to be judged is a fraud-related mobile phone application, the method further includes:
determining a fraud type of the fraud-related mobile application;
generating alarm information based on the fraud-related mobile application and the fraud type of the fraud-related mobile application, and transmitting the alarm information to a server of a fraud handling department.
7. A device for studying and judging a fraud-related mobile phone application, characterized by comprising:
the static analysis module is used for forming a static analysis portrait of the mobile phone application to be researched based on application basic information of the mobile phone application to be researched;
the dynamic simulation module is used for performing simulation operation processing on the mobile phone application to be evaluated, wherein the simulation operation processing is a processing process of traversing each application function of the mobile phone application to be evaluated, recording operation screenshots corresponding to each operation, and respectively determining an operation execution sequence and an operation sequence flow corresponding to each application function according to an operation sequence of each application function;
an application representation generation module for forming a target application representation based on the static analysis representation, each operation screenshot, each operation sequence flow, and each operation execution sequence;
the judging module is used for matching the target application portrait with each application judging model in an application judging model library, and if the matching is successful, the mobile phone application to be judged is determined to be a fraud-related mobile phone application;
wherein the application representation generation module is to:
identifying application page characters and information interaction characters in each operation screenshot, and performing sensitive word analysis on the application page characters and the information interaction characters to obtain a fraud-related sensitive word analysis result; performing semantic analysis on the information interactive characters to obtain a fraud-related interactive analysis result;
removing advertisement traffic and promotion traffic in each operation sequence traffic to obtain actual operation traffic corresponding to each operation sequence traffic, and respectively determining an actual domain name and an actual URL corresponding to each operation execution sequence based on each actual operation traffic;
performing statistical analysis on actual operation flows corresponding to the flows of each operation sequence respectively to obtain multi-dimensional flow characteristics, wherein the multi-dimensional flow characteristics are used for machine learning;
acquiring website record information corresponding to the mobile phone application to be researched and judged according to the actual domain name, determining an actual IP address according to the actual domain name, and determining an actual server attribution according to the actual IP address;
composing the static analysis sketch, the fraud-related sensitive word analysis result, the fraud-related interaction analysis result, the actual domain name, the actual URL, the actual IP address, the actual server home location, and the operation execution sequence into the target application sketch.
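The portrait-building and matching steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the ad-host blocklist, the sensitive-word list, the `ApplicationPortrait` fields, and the matching rule in `matches_model` are all hypothetical stand-ins, and the semantic analysis, traffic statistics, website record lookup, and IP resolution steps are omitted.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# Hypothetical blocklist of advertisement/promotion hosts (not from the patent).
AD_HOSTS = {"ads.example.com", "promo.example.net"}
# Hypothetical fraud-related sensitive words (not from the patent).
SENSITIVE_WORDS = {"guaranteed return", "transfer fee", "unfreeze account"}


@dataclass
class ApplicationPortrait:
    """Simplified target application portrait; field names are assumptions."""
    static_portrait: dict
    sensitive_hits: set = field(default_factory=set)
    actual_domains: set = field(default_factory=set)
    actual_urls: list = field(default_factory=list)


def strip_ad_traffic(urls):
    """Drop requests whose host is a known advertisement or promotion host."""
    return [u for u in urls if urlsplit(u).hostname not in AD_HOSTS]


def find_sensitive_words(texts):
    """Return the sensitive words occurring in the recognized screenshot text."""
    joined = " ".join(texts).lower()
    return {w for w in SENSITIVE_WORDS if w in joined}


def build_portrait(static_portrait, screenshot_texts, sequence_urls):
    """Combine static information, text hits, and actual traffic endpoints."""
    actual = strip_ad_traffic(sequence_urls)
    return ApplicationPortrait(
        static_portrait=static_portrait,
        sensitive_hits=find_sensitive_words(screenshot_texts),
        actual_domains={urlsplit(u).hostname for u in actual},
        actual_urls=actual,
    )


def matches_model(portrait, model):
    """Assumed rule: all model domains seen and enough sensitive-word hits."""
    return (model["domains"] <= portrait.actual_domains
            and len(portrait.sensitive_hits) >= model["min_sensitive_hits"])
```

In this sketch a "model" is just a dictionary with a set of required domains and a minimum number of sensitive-word hits; the claims leave the form of the models in the library open, so a real system could substitute any matching scheme.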
8. An electronic device comprising a processor and a memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method for studying and judging a fraud-related mobile phone application according to any one of claims 1 to 6.
9. A computer-readable storage medium comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for studying and judging a fraud-related mobile phone application according to any one of claims 1 to 6.
CN202210942003.1A 2022-08-08 2022-08-08 Method and device for studying and judging fraud-related mobile phone application, electronic equipment and storage medium Active CN114996708B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210942003.1A CN114996708B (en) 2022-08-08 2022-08-08 Method and device for studying and judging fraud-related mobile phone application, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210942003.1A CN114996708B (en) 2022-08-08 2022-08-08 Method and device for studying and judging fraud-related mobile phone application, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114996708A CN114996708A (en) 2022-09-02
CN114996708B CN114996708B (en) 2022-12-20

Family

ID=83022851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210942003.1A Active CN114996708B (en) 2022-08-08 2022-08-08 Method and device for studying and judging fraud-related mobile phone application, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114996708B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115859292B (en) * 2023-02-20 2023-05-09 卓望数码技术(深圳)有限公司 Fraud-related APP detection system, fraud-related APP judgment method and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182688A (en) * 2014-08-26 2014-12-03 北京软安科技有限公司 Android malicious code detection device and method based on dynamic activation and behavior monitoring
CN107180192B (en) * 2017-05-09 2020-05-29 北京理工大学 Android malicious application detection method and system based on multi-feature fusion
CN108595953B (en) * 2018-04-04 2020-05-19 东莞新辰智联科技有限公司 Method for carrying out risk assessment on mobile phone application
US20200175421A1 (en) * 2018-11-29 2020-06-04 Sap Se Machine learning methods for detection of fraud-related events
WO2021053647A1 (en) * 2019-09-21 2021-03-25 Cashshield Pte. Ltd. Detection of use of malicious tools on mobile devices
CN110795732A (en) * 2019-10-10 2020-02-14 南京航空航天大学 SVM-based dynamic and static combination detection method for malicious codes of Android mobile network terminal
CN110795734B (en) * 2019-10-12 2022-06-10 南京信息职业技术学院 Malicious mobile application detection method
CN114398673A (en) * 2021-12-31 2022-04-26 深圳市欢太科技有限公司 Application compliance detection method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN114996708A (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN104766014B (en) Method and system for detecting malicious website
JP6609047B2 (en) Method and device for application information risk management
US10721271B2 (en) System and method for detecting phishing web pages
CN107341399B (en) Method and device for evaluating security of code file
US20150244737A1 (en) Detecting malicious advertisements using source code analysis
JP5358549B2 (en) Protection target information masking apparatus, protection target information masking method, and protection target information masking program
US10440050B1 (en) Identifying sensitive data on computer networks
CN110046494B (en) Big data processing method and system based on terminal
CN109547426B (en) Service response method and server
CN110071924B (en) Big data analysis method and system based on terminal
CN112685771A (en) Log desensitization method, device, equipment and storage medium
CN107302586A (en) Webshell detection method and device, computer device, and readable storage medium
CN112182614A (en) Dynamic Web application protection system
CN106030527B (en) System and method for notifying a user of applications available for download
CN114996708B (en) Method and device for studying and judging fraud-related mobile phone application, electronic equipment and storage medium
CN104080058A (en) Information processing method and device
CN116340939A (en) Webshell detection method, device, equipment and storage medium
CN107018152A (en) Message block method, device and electronic equipment
CN111222181B (en) AI model supervision method, system, server and storage medium
CN112600864A (en) Verification code verification method, device, server and medium
US9584537B2 (en) System and method for detecting mobile cyber incident
CN117252429A (en) Risk user identification method and device, storage medium and electronic equipment
CN113971283A (en) Malicious application program detection method and device based on features
CN116932381A (en) Automatic evaluation method for security risk of applet and related equipment
CN111695113B (en) Terminal software installation compliance detection method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant