CN110213234B - Application program file developer identification method, device, equipment and storage medium - Google Patents

Publication number
CN110213234B
Authority
CN
China
Prior art keywords
application program
program file
information
developer
application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910365066.3A
Other languages
Chinese (zh)
Other versions
CN110213234A (en
Inventor
刘健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Shenzhen Tencent Computer Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tencent Computer Systems Co Ltd filed Critical Shenzhen Tencent Computer Systems Co Ltd
Priority to CN201910365066.3A priority Critical patent/CN110213234B/en
Publication of CN110213234A publication Critical patent/CN110213234A/en
Application granted granted Critical
Publication of CN110213234B publication Critical patent/CN110213234B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0884 Network architectures or network communication protocols for network security for authentication of entities by delegation of authentication, e.g. a proxy authenticates an entity to be authenticated on behalf of this entity vis-à-vis an authentication entity
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources

Abstract

The application discloses a method, an apparatus, a device, and a storage medium for identifying the developer of an application program file, and belongs to the field of internet technologies. The method comprises: acquiring an application program file to be identified; querying a first database for developer information matching the application program file, wherein the first database stores the correspondence between application program files and developer information; when no developer information matching the application program file is found, running the application program file; simulating a user operating the application program file in the running state, and acquiring network traffic information generated while the application program file runs; and acquiring developer information of the application program file based on the network traffic information. By analyzing the network traffic information, the application assists in identifying the developer of the application program file, and this identification approach is accurate.

Description

Application program file developer identification method, device, equipment and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a method, an apparatus, a device, and a storage medium for identifying a developer of an application file.
Background
With the popularization of intelligent mobile terminals and the rapid development of mobile network technologies, new application program files appear in application markets in an endless stream, which brings great convenience to people's daily work and life. However, this convenience comes with many potential safety hazards. For example, some lawbreakers damage the rights and interests of users by producing application program files that provide illegal services such as pornography and gambling, that raise funds illegally in the name of investment and financing, or that steal short message content. Because such malicious application program files pose great risks to the information security and property security of users, it is necessary to identify the developers of application program files in order to clean up the application market.
When identifying the developers of application program files, the related art mainly adopts the following two approaches. First, developer identification is performed by manual collection. Second, developer identification is assisted by certificates that are actively submitted. For example, some application markets require the submitter of an application to provide a license proving its identity when the application is submitted for listing.
The first approach is time-consuming and labor-intensive, and developer information may be missing or wrong because manual collection is incomplete or inaccurate. With the second approach, on the one hand, a malicious attacker can forge a certificate or steal the certificates of other developers; on the other hand, some certificates can be generated by developers themselves and are not authenticated by an authoritative digital certificate signing authority. This approach therefore cannot ensure the accuracy of the acquired developer information, and in some cases the developer information cannot be acquired from the certificate at all.
Disclosure of Invention
The embodiments of the present application provide a method, an apparatus, a device, and a storage medium for identifying the developer of an application program file, which solve problems in the related art such as missing, wrong, or inaccurate developer information. The technical solution is as follows:
In one aspect, a method for identifying a developer of an application program file is provided, the method including:
acquiring an application program file to be identified;
querying a first database for developer information matching the application program file, wherein the first database is used for storing the correspondence between application program files and developer information;
when no developer information matching the application program file is found, running the application program file;
simulating a user operating the application program file in the running state, and acquiring network traffic information generated while the application program file runs;
and acquiring developer information of the application program file based on the network traffic information.
In another aspect, an apparatus for identifying a developer of an application program file is provided, the apparatus including:
the first acquisition module is used for acquiring an application program file to be identified;
the query module is used for querying a first database for developer information matching the application program file, wherein the first database is used for storing the correspondence between application program files and developer information;
the first processing module is used for running the application program file when no developer information matching the application program file is found;
the first processing module is further used for simulating a user operating the application program file in the running state;
the second acquisition module is used for acquiring network traffic information generated while the application program file runs;
and the identification module is used for acquiring the developer information of the application program file based on the network traffic information.
In a possible implementation manner, the second acquisition module is further configured to acquire the network access requests generated by the application program file while it runs until the number of collected network access requests reaches a preset threshold, and to take the network access requests whose number reaches the preset threshold as the network traffic information.
In a possible implementation manner, the second acquisition module is further configured to acquire, while the application program file runs, the network access requests generated by the application program file within a preset time length;
and to take the network access requests generated within the preset time length as the network traffic information.
In a possible implementation manner, the identification module is further configured to acquire the domain name information of each network access request in the network traffic information; for each network access request, query a second database for website operator information matching the domain name information of the network access request, wherein the second database is used for storing the correspondence between domain names and website operators; count the number of occurrences of identical website operator information among the acquired website operator information; and take the website operator information whose number of occurrences satisfies a preset condition as the developer information of the application program file to be identified.
In a possible implementation manner, the identification module is further configured to use the website operator information ranked first by number of occurrences as the developer information of the application program file, or to use the website operator information ranked in the top N by number of occurrences as the developer information of the application program file, where N is a positive integer.
In one possible implementation, the apparatus further includes:
the storage module is used for establishing a corresponding relation between the application program file and the network access request, corresponding access time, corresponding domain name information and corresponding website operator information for each network access request; and storing the established corresponding relation to a third database.
In one possible implementation, the apparatus further includes:
and the second processing module is used for aggregating the application program files related to the developers and executing removal processing on the application program files related to the developers when the developers of the application program files are malicious developers.
In another aspect, an apparatus for identifying a developer of an application file is provided, where the apparatus includes a processor and a memory, and the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the method for identifying a developer of an application file.
In another aspect, a storage medium is provided, in which at least one instruction is stored, and the at least one instruction is loaded and executed by a processor to implement the above-mentioned developer identification method for an application file.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
After the application program file to be identified is obtained, the embodiment of the present application first queries a first database for developer information matching the application program file. When no matching developer information is found in the first database, the application program file is run and a user is simulated operating it in the running state; the network traffic information generated while the application program file runs is then obtained, and the developer information of the application program file is obtained based on that network traffic information. In this way, the embodiment of the present application avoids missing or wrong developer information caused by incomplete or inaccurate manual collection, as well as the situation in which the acquired developer information is inaccurate or cannot be acquired at all.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment related to a developer identification method for an application file according to an embodiment of the present application;
FIG. 2 is a flowchart of a developer identification method for application files according to an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating a developer identification process of an application file according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a display interface provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of another display interface provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of another display interface provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an apparatus for identifying a developer of an application file according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a developer identifying device for an application file according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the following detailed description of the embodiments of the present application will be made with reference to the accompanying drawings.
Before explaining the embodiments of the present application in detail, some terms related to the embodiments of the present application will be explained.
Application program file: in the embodiment of the present application, an application program file refers to a software installation package, which is a self-extracting collection of files that includes all files needed to install the software. When the software installation package is run, all files of the software can be released to a storage medium, and tasks such as modifying the registry, modifying system settings, and creating shortcuts are completed. Application program files are typically collected by an application market or application store and provided to users.
As an example, the application file may be an APK (Android Package) file, and a terminal installed with an Android operating system needs to download the APK file from an application market, an application store, or the like when installing application software.
Developer: in the embodiment of the present application, a developer refers to the developer of an application program file, and may also be referred to as the author, owner, or operator of the application program file, which is not specifically limited in the embodiment of the present application.
Application market: also called an application store or application mall, it is a download management platform for application resources that gathers a large number of application resources of various kinds for users to download.
The following describes an implementation environment related to a method for identifying a developer of an application file provided in an embodiment of the present application.
The developer identification method for application program files provided by the embodiment of the present application is applied to the developer identification system shown in fig. 1, which is also referred to as a developer identification device. Referring to fig. 1, the developer identification system includes: a simulator 101, a network proxy 102, an application file developer library 103, an ICP (Internet Content Provider) record library 104, a network access request record library 105, and the Internet 106.
The simulator 101, the network proxy 102, the application file developer library 103, the ICP record library 104, and the network access request record library 105 may be configured on the same device or on different devices; that is, the developer identification system may be composed of one or more devices, which is not specifically limited in this embodiment of the present application. For example, the ICP record library 104 may be provided by a remotely connected server.
In addition, the device is a computer device with computing capability, and the computer device may be a fixed computer device such as a personal computer and a server, or a mobile computer device such as a tablet computer and a smart phone, which is not limited in this embodiment of the present application.
In the embodiment of the present application, the simulator 101 is a virtual device that can run on a computer device, and the simulator 101 can preview, develop and test application files without using a physical device. Taking the android operating environment as an example, the simulator 101 is an android simulator, which can preview, develop and test an android application without using an android device.
The network proxy 102 is a transfer station of network information, and can receive a network access request from a client, and then obtain the network information from the internet 106 and return the network information to the client that initiated the network access request.
In the embodiment of the present application, as shown in fig. 1, the network proxy 102 receives network access requests from the simulator 101. Based on this feature of the network proxy 102, the network proxy 102 can monitor all network access requests generated while an application program file runs. As an example, the network proxy 102 may record the monitored network access requests and store them in the network access request record library 105. In other words, the network access request record library 105 is used to store records of the URLs (Uniform Resource Locators) that the application program file accesses while it runs.
In one possible implementation, the data record in the network access request record repository 105 may be as shown in table 1 below:
TABLE 1
[Table 1 is an image in the original publication and is not reproduced here.]
Table 1 above shows the correspondence between the name of the application program file, the access time, the access URL of the network access request, the domain name, and the website operator. It should be noted that Table 1 shows only one possible way of recording network access requests; the network access request record library 105 may record more or fewer items than the 5 shown in Table 1, which is not specifically limited in this embodiment of the present application.
The ICP record library 104 is a database for storing the ICP record information of websites. In one possible implementation, the ICP record information stored by the ICP record library 104 may be as shown in Table 2 below.
TABLE 2
Domain name/website | Name of host unit | Record number | Name of website
M.com | Shenzhen city M Limited | Yue B2-x-5 | M net
N.com | Zhejiang N Co Ltd | Zhe B2-1 | N net
S.com | Beijing S Ltd | Jing ICP pattern 1 | S net
…… | …… | …… | ……
It should be noted that Table 2 shows only one possible way of recording ICP record information; the ICP record library 104 may record more or fewer items than the 4 shown in Table 2, which is not specifically limited in this embodiment of the present application.
In the embodiment of the present application, the application file developer library 103 is a database for storing application developer information. As an example, taking an android operating environment as an example, the application file developer library 103 may be referred to as an APK developer library.
The developer information stored in the application file developer library 103 may be as shown in table 3 below.
TABLE 3
Application program file | Developer information
a | Company A
b | Company B
…… | ……
It should be noted that table 3 only shows a possible developer information recording manner, where the application file developer library 103 may further record more corresponding relationships than the corresponding relationships among the 2 entries shown in table 3, which is not specifically limited in this embodiment of the application.
When identifying the developer information of application program files, the related art relies mainly either on manual collection or on assisted identification using certificates that are actively submitted to the application market when an application is submitted. Manual collection may be incomplete or inaccurate, which can leave developer information missing or wrong for a large number of application program files in the application market. With certificate-assisted identification, some application markets require developers to prove their identity by providing supporting credentials, such as a company business license, when an application is submitted for listing. One problem is that an attacker can forge or steal a business license to submit an application; moreover, manual review of such credentials is time-consuming and labor-intensive and cannot guarantee complete accuracy. Another problem arises with the certificates themselves: in the Android operating environment, for example, an Android certificate can be generated by the developer and does not need to be authenticated by an authoritative digital certificate signing authority, so this approach cannot ensure the accuracy of the acquired developer information, and in some cases the developer cannot be identified from the submitted certificate at all.
In order to solve the above problems, the embodiment of the present application provides, on the basis of the system architecture shown in fig. 1, a method for automatically identifying developer information through network traffic analysis. The method can detect the developers of application program files fully automatically, thereby greatly reducing the proportion of application program files in an application market whose developer information is missing.
The following explains a developer identification method of an application file provided in an embodiment of the present application by way of a detailed embodiment. In addition, descriptions like first, second, third, fourth, and the like appearing in the following embodiments are only for distinguishing different objects, and do not constitute any other limitation.
Fig. 2 is a flowchart of a method for identifying a developer of an application file according to an embodiment of the present application. The method is executed by the developer identification device shown in fig. 1. Referring to fig. 2, the method flow provided by the embodiment of the present application includes:
201. Acquire the application program file to be identified.
As shown in fig. 3, the application files to be identified are also referred to herein as application files to be analyzed. In one possible implementation, the application file to be analyzed may be an application file to be subjected to developer information analysis in an application market or an application store or an application mall.
202. Developer information matching the application files to be identified is queried in a first database.
In the embodiment of the present application, the first database refers to the application file developer library shown in fig. 1, and is used for storing the correspondence between the application file and the developer information, which may be shown in the foregoing table 3.
Referring to fig. 3, for an application program file to be identified, a query is first performed in the application file developer library to determine whether corresponding developer information has already been recorded in the database. If the corresponding developer information is recorded in the database, the processing flow ends here; if no corresponding developer information is recorded in the database, the following step 203 is performed.
203. When no developer information matching the application program file to be identified is found, run the application program file to be identified.
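The lookup-then-fallback flow of step 202 can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary `developer_library`, the function `identify_developer`, and the stub `fake_analysis` are hypothetical names, and keying the library by a short file name is an assumption (the source does not say how an application program file is keyed).

```python
from typing import Callable, Optional

# Hypothetical stand-in for the application file developer library (Table 3).
developer_library = {"a": "Company A", "b": "Company B"}


def identify_developer(app_file: str,
                       analyze_traffic: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the recorded developer, or fall back to traffic analysis."""
    developer = developer_library.get(app_file)
    if developer is not None:
        return developer               # record found: the flow ends here
    return analyze_traffic(app_file)   # no record: continue with steps 203 to 205


# Stub for the traffic analysis path; it only records that it was called.
calls = []


def fake_analysis(app_file: str) -> Optional[str]:
    calls.append(app_file)
    return "Company C"
```

Calling `identify_developer("a", fake_analysis)` returns the recorded value without invoking the fallback, while an unrecorded file such as `"c"` is routed to the traffic analysis path.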
In the embodiment of the application, if the corresponding developer information is not recorded in the application file developer library, the application file to be identified is pushed to the specified directory of the simulator. Wherein the specified directory generally refers to an installation directory of the simulator.
The simulator can monitor whether a new application program file exists in the specified directory in real time; if a new application file exists in the specified directory, the simulator decompresses, installs and runs the application file, i.e., runs the application file to be identified.
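One simple way to detect newly pushed files is to poll the directory and remember which names have already been handled. The sketch below assumes polling and an `.apk` extension; the function name `find_new_packages` is hypothetical, and a temporary directory stands in for the simulator's specified directory. The source does not state how the simulator performs its monitoring.

```python
import os
import tempfile


def find_new_packages(directory: str, seen: set) -> list:
    """Return paths of package files in `directory` that were not seen before."""
    new_files = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".apk") and name not in seen:
            seen.add(name)
            new_files.append(os.path.join(directory, name))
    return new_files


# Demo: a temporary directory stands in for the simulator's specified directory.
with tempfile.TemporaryDirectory() as watch_dir:
    seen_names = set()
    first_scan = find_new_packages(watch_dir, seen_names)       # nothing pushed yet
    open(os.path.join(watch_dir, "sample.apk"), "wb").close()   # push an empty file
    second_scan = [os.path.basename(p)
                   for p in find_new_packages(watch_dir, seen_names)]
    third_scan = find_new_packages(watch_dir, seen_names)       # already handled
```

In a real deployment each newly found path would be handed to the install-and-run step; that part depends on the simulator and is left out here.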
204. Simulate a user operating the application program file to be identified in the running state, and acquire the network traffic information generated while the application program file to be identified runs.
In the embodiment of the present application, while the application program file to be identified is running, the simulator simulates a user operating the application program file. User operations may be simulated in various ways, which is not specifically limited in this embodiment of the present application. As an example, the simulator may randomly perform click operations, keyboard input operations, gesture sliding operations, and the like on the display screen it presents; in addition, taking the Android operating environment as an example, all Activities may be triggered one by one by analyzing the content of the Android manifest (AndroidManifest.xml).
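As an illustration of the manifest-based approach, the sketch below lists the Activities declared in a manifest so that they could then be launched one by one. It assumes the manifest has already been decoded to plain-text XML (inside an APK it is stored in a binary XML format); the package and Activity names are made up, and `list_activities` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace used by android:* attributes in an Android manifest.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical decoded manifest; package and Activity names are illustrative.
MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
  <application>
    <activity android:name=".MainActivity"/>
    <activity android:name="com.example.app.SettingsActivity"/>
  </application>
</manifest>
"""


def list_activities(manifest_xml: str) -> list:
    """Return the fully qualified names of all declared Activities."""
    root = ET.fromstring(manifest_xml.strip())
    package = root.get("package", "")
    names = []
    for activity in root.iter("activity"):
        name = activity.get(ANDROID_NS + "name", "")
        if name.startswith("."):
            name = package + name   # a leading dot means "relative to the package"
        names.append(name)
    return names


activities = list_activities(MANIFEST)
```

How each listed Activity is actually started inside the simulator is outside the scope of this sketch.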
In the embodiment of the present application, the network traffic information refers to the network access requests generated while an application program file runs.
When a user is simulated operating the application program file in the running state, the application program file may generate a plurality of network access requests. As a transfer station for network information, the network proxy can continuously monitor the network access requests generated by the application program file while it runs. When the network access requests collected by the network proxy satisfy a specific condition, the simulator stops simulating the user operating the application program file; otherwise, the simulator continues the simulation.
In one possible implementation, the specific condition covers two dimensions, namely a quantity dimension and a time dimension. That is, the network traffic information generated while the application program file to be identified runs can be acquired in the following two ways:
2041. Acquire the network access requests generated while the application program file to be identified runs until the number of collected network access requests reaches a preset threshold, and take the network access requests whose number reaches the preset threshold as the network traffic information.
In brief, for this step, when the number of collected URLs reaches M, the simulator stops simulating the user operating the application program file to be identified.
The preset threshold, that is, the value of M, may be 100, which is not specifically limited in this embodiment of the present application. As an example, when the number of URLs collected by the network proxy reaches 100, the simulator stops simulating the user operating the application program file to be identified, and the embodiment of the present application then analyzes the developer information of the application program file to be identified based on these 100 URLs.
2042. While the application program file to be identified runs, acquire the network access requests generated by the application program file within a preset time length, and take the network access requests generated within the preset time length as the network traffic information.
In brief, for this step, when the collection time exceeds the preset time length T, the simulator stops simulating the user operating the application program file to be identified.
The value of T may be 5 minutes, which is not specifically limited in this embodiment of the present application. As an example, when the network proxy has been collecting URLs for more than 5 minutes, the simulator stops simulating the user operating the application program file to be identified, and the embodiment of the present application then analyzes the developer information of the application program file to be identified based on the URLs collected within those 5 minutes.
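The two stop conditions can be sketched with a small collector object. The class name `RequestCollector` and its parameters are hypothetical, and checking both the count threshold M and the duration T in one object is an assumption made for compactness; the source presents them as two separate ways of collecting. The clock is injectable so that the behavior can be exercised without waiting.

```python
import time


class RequestCollector:
    """Collects URLs until a count threshold M is reached or a duration T is exceeded."""

    def __init__(self, max_requests=100, max_seconds=300.0, clock=time.monotonic):
        self.max_requests = max_requests   # M, e.g. 100 URLs
        self.max_seconds = max_seconds     # T, e.g. 5 minutes
        self._clock = clock
        self._start = clock()
        self.urls = []

    def done(self) -> bool:
        """True once either stop condition holds."""
        elapsed = self._clock() - self._start
        return len(self.urls) >= self.max_requests or elapsed > self.max_seconds

    def add(self, url: str) -> None:
        """Record a URL unless collection has already stopped."""
        if not self.done():
            self.urls.append(url)


# Demo with a fake clock and a small threshold.
fake_now = [0.0]
collector = RequestCollector(max_requests=3, max_seconds=300.0,
                             clock=lambda: fake_now[0])
for i in range(5):
    collector.add(f"http://www.A.com/page{i}.html")
```

In this demo the fourth and fifth URLs are ignored because the count threshold of 3 has already been reached; the simulator would stop generating user operations at the point where `done()` first returns true.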
In one possible implementation, while the application program file to be identified runs, the network proxy records the monitored network access requests and stores them in the network access request record library shown in fig. 1, which is also referred to herein as the third database. As an example, the network proxy may record network access requests in the manner shown in Table 1, which is not specifically limited in this embodiment of the present application. That is, the embodiment of the present application further includes the following step:
For each network access request collected by the network proxy, establish a correspondence between the application program file to be identified and the network access request, the corresponding access time, the corresponding domain name information, and the corresponding website operator information, and store the established correspondence in the third database, namely the network access request record library.
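A minimal sketch of such a record store is shown below, using an in-memory SQLite database as a stand-in for the third database. The table name `access_record`, the column names, and the sample timestamp are all hypothetical; the source only lists which items are associated with each request.

```python
import sqlite3

# In-memory stand-in for the third database (network access request record library).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE access_record (
           app_file    TEXT NOT NULL,  -- application program file
           access_time TEXT NOT NULL,  -- time the request was observed
           url         TEXT NOT NULL,  -- access URL of the request
           domain      TEXT NOT NULL,  -- domain name information
           operator    TEXT            -- website operator, if one was found
       )"""
)


def store_record(app_file, access_time, url, domain, operator):
    """Insert one correspondence row and commit it."""
    with conn:
        conn.execute(
            "INSERT INTO access_record VALUES (?, ?, ?, ?, ?)",
            (app_file, access_time, url, domain, operator),
        )


# Illustrative row; the timestamp is made up.
store_record("a", "2019-04-30T12:00:00",
             "http://www.A.com/news/index.html", "A.com", "Company A")
rows = conn.execute(
    "SELECT domain, operator FROM access_record WHERE app_file = ?", ("a",)
).fetchall()
```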
205. Acquire developer information of the application program file to be identified based on the network traffic information.
In the embodiment of the application, the method for acquiring the developer information of the application program file to be identified based on the network traffic information includes the following steps:
2051. Acquire the domain name information of each network access request in the network traffic information.
For this step, the embodiment of the present application may acquire the domain name information of all URLs collected by the network proxy one by one. As an example, for the URL http://news.abc.com/local/index.html, the domain name is abc.com.
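The domain extraction can be sketched with the standard library. The helper `registered_domain` is a hypothetical name, and keeping only the last two labels of the host name is a deliberate simplification: it reproduces the abc.com example but is wrong for multi-label suffixes such as .com.cn, for which a production system would typically consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def registered_domain(url: str) -> str:
    """Naively reduce a URL to the last two labels of its host name."""
    host = urlsplit(url).hostname or ""   # hostname is returned in lower case
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


domain = registered_domain("http://news.abc.com/local/index.html")
```

Note that `urlsplit(...).hostname` lower-cases the host, so a URL under `www.F.com` yields `f.com`; any lookup keyed by domain name should normalize case in the same way.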
2052. For each network access request, query the second database for website operator information that matches the domain name information of that network access request.
The second database stores the correspondence between domain names and website operators; that is, the second database is the ICP record library shown in Fig. 1. In other words, this step looks up the website operator information in the ICP record library.
2053. Count the number of occurrences of each distinct website operator among all the acquired website operator information.
The application program file to be identified may access multiple different domain names while running, and these domain names may correspond to different website operators. To determine the developer information of the application program file to be identified, the embodiment of the present application counts how many times each website operator appears.
2054. Take the website operator information whose number of occurrences satisfies a preset condition as the developer information of the application program file to be identified.
In the embodiments of the present application, the preset condition includes, but is not limited to, the following two:
First, following a majority-first rule, select the website operator information that occurs most often and take it as the developer information of the application program file to be identified. In other words, the website operator information ranked first by number of occurrences is used as the developer information of the application program file to be identified. Take Table 4 below as an example.
TABLE 4
| Access URL | Domain name information | Website operator information |
| --- | --- | --- |
| http://www.A.com/news/index.html | A.com | Company A |
| http://www.A.com/news/index2.html | A.com | Company A |
| https://www.F.com/page.html | F.com | Company F |
| https://www.A.com/index2.html | A.com | Company A |
In Table 4, three URLs belong to Company A and one URL belongs to Company F, so the developer of the application program file to be identified is determined to be Company A according to the majority-first rule.
In addition, after the developer information of the application program file to be identified is determined, the embodiment of the present application also stores it in the first database, i.e., the application program file developer library, forming an information entry as shown in Table 3 above.
Second, multiple pieces of developer information may be set at the same time; that is, multiple author tags are allowed for one application program file. As an example, the website operator information ranked in the top N by number of occurrences may be used as the developer information of the application program file to be identified, where N is a positive integer. Taking Table 4 above as an example, with N = 2 the developers of the application program file to be identified would be determined as Company A and Company F.
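Steps 2052 to 2054 can be sketched with a dictionary standing in for the ICP record library and a counter over the matched operators. This is an illustrative sketch only: the function name `identify_developers`, the dictionary-based lookup, and the handling of unmatched domains (they are skipped) are assumptions, not details given in the patent.

```python
from collections import Counter


def identify_developers(domains, icp_library, top_n=1):
    """Return the top_n website operators by number of occurrences.

    domains:     one domain name per collected network access request
    icp_library: mapping from domain name to website operator (a stand-in
                 for the second database)
    top_n:       1 gives the majority-first rule; N > 1 gives the top-N rule
    """
    operators = (icp_library.get(domain) for domain in domains)
    # Domains with no matching operator record are ignored in this sketch.
    counts = Counter(op for op in operators if op is not None)
    return [name for name, _count in counts.most_common(top_n)]
```

Applied to the four rows of Table 4, `top_n=1` yields Company A and `top_n=2` yields Company A followed by Company F. How ties are broken is not specified in the source; `Counter.most_common` keeps equally frequent operators in first-seen order.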
The method provided by the embodiment of the application has at least the following beneficial effects:
1. The embodiment of the present application identifies the developer of an application program file fully automatically, without manual intervention, which avoids the time and labor cost of manual collection.
2. The embodiment of the present application is simple to deploy and requires no additional physical equipment; for example, no additional Android devices need to be purchased. Developers of application program files can be identified in large batches on ordinary computer equipment, at low cost.
3. The embodiment of the present application identifies the developer of an application program file by analyzing network traffic information. This identification approach is accurate and can effectively reduce the rate of missing or erroneous developer information. That is, it avoids the situations in which incomplete or inaccurate manual collection leaves developer information missing or wrong, or in which the acquired developer information is inaccurate or cannot be acquired at all.
In another embodiment, the overall execution flow of the developer identification method for application program files provided in the embodiment of the present application is summarized below with reference to Fig. 3.
a. Start the network agent and start the simulator; the simulator is configured to use the network agent.
Configuring the network agent in the simulator typically means setting, for example, the port number of the network agent so that the simulator can communicate with it.
b. For the application program file to be identified, determine whether corresponding developer information is already stored in the application program file developer library; if not, execute step c below; if so, the processing flow ends.
c. Push the application program file to the specified directory of the simulator.
d. The simulator monitors whether a new application program file exists under the specified directory.
e. If a new application program file exists, decompress, install, and run it.
f. While the application program file is running, the simulator simulates a user operating it, and the network agent continuously monitors the network access requests it generates. The network agent also connects to the ICP record library to query website operator information, forms the corresponding relation, and stores it in the network access request record library.
g. When the network agent has collected a sufficient number of network access requests or has waited long enough, the simulator stops simulating user operation of the application program file.
h. Identify the developer of the application program file based on the network access requests collected by the network agent, and store the corresponding developer information in the application program file developer library.
i. Uninstall and delete the identified application program file, and continue with step d.
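The control flow of steps b to i can be outlined as a simple loop. The sketch below deliberately leaves the simulator and network agent abstract: `run_and_capture` and `identify` are hypothetical callables supplied by the caller, and the developer library is modeled as a plain dictionary, none of which is prescribed by the patent.

```python
def process_pending_files(app_files, developer_library, run_and_capture, identify):
    """Outline of steps b to i for a batch of application program files.

    app_files:         identifiers of the files to be identified
    developer_library: dict mapping file identifier to developer information
    run_and_capture:   callable that installs and runs a file in the simulator
                       and returns the collected URLs (steps c to g, plus the
                       uninstall in step i); hypothetical
    identify:          callable that maps collected URLs to developer
                       information (step h); hypothetical
    """
    for app_file in app_files:
        # Step b: skip files whose developer is already recorded.
        if app_file in developer_library:
            continue
        # Steps c to g: push, install, run, simulate the user, capture traffic.
        urls = run_and_capture(app_file)
        # Step h: identify the developer and store the result.
        developers = identify(urls)
        if developers:
            developer_library[app_file] = developers
    return developer_library
```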
In another embodiment, identifying the developer of an application program file enables the embodiment of the present application to be applied in at least the following two scenarios:
Scenario one: when the developer of an application program file is identified as a malicious developer, aggregate all application program files related to that developer and remove all of them.
In this scenario, after the developer of an application program file has been automatically determined based on network traffic information, if the developer is found to have engaged in illegal behavior, i.e., the developer is a malicious developer, the related application program files can be immediately taken off the shelf in the application market or application store to protect users.
Taking Figs. 4 and 5 as an example: referring to Fig. 4, assume that a network service belonging to a certain network technology limited company is suspected of illegal fundraising. The application program file shown in Fig. 5 can then be determined to be a harmful application program file by matching it to that company, and the harmful application program file can be taken off the shelf. That is, the developer identification method provided by the embodiment of the present application helps to discover harmful application program files in an application market, application store, or application mall in a timely manner.
Scenario two: after the developer of an application program file has been automatically determined based on network traffic information, if the developer is found to be a safe developer, i.e., the application program files that the developer submits to an application market, application store, or application mall are normal application program files, the different application program files of that developer can be aggregated through automatic developer identification. When a user accesses one of these application program files, the application market, application store, or application mall can recommend the developer's other application program files to the user. This improves the user experience, increases the reach of the developer's other application program files, and helps the application market, application store, or application mall attract more users and developers. As an example, referring to Fig. 6, assume that a certain application store holds 5 application program files from the same developer, referred to as application A to application E. When a user browses or accesses application A, applications B to E may also be recommended to the user, as shown in Fig. 6.
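Both scenarios reduce to grouping application program files by their identified developer. The sketch below assumes that the developer library maps each file identifier to a single developer name; the function names `files_to_remove` and `recommend_same_developer` are hypothetical and only illustrate the aggregation idea.

```python
def files_to_remove(malicious_developer, developer_library):
    """Scenario one: list every file attributed to a malicious developer."""
    return [app for app, dev in developer_library.items()
            if dev == malicious_developer]


def recommend_same_developer(app_file, developer_library):
    """Scenario two: list the other files from the same developer."""
    developer = developer_library.get(app_file)
    if developer is None:
        return []
    return [other for other, dev in developer_library.items()
            if dev == developer and other != app_file]
```

With applications A to E all attributed to one developer, calling `recommend_same_developer` for application A returns applications B to E, matching the example of Fig. 6.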
Fig. 7 is a schematic structural diagram of an apparatus for identifying a developer of an application file according to an embodiment of the present application. Referring to fig. 7, the apparatus includes:
a first obtaining module 701, configured to obtain an application program file to be identified;
a query module 702, configured to query, in a first database, developer information matching the application program file, where the first database is used to store the correspondence between application program files and developer information;
a first processing module 703, configured to run the application program file when no developer information matching the application program file is found;
the first processing module 703 being further configured to simulate a user operating the application program file in the running state;
a second obtaining module 704, configured to obtain network traffic information generated while the application program file is running;
an identifying module 705, configured to obtain developer information of the application program file based on the network traffic information.
After obtaining an application program file to be identified, the device provided by the embodiment of the present application first queries the first database for developer information matching that file. When no matching developer information is found in the first database, the device runs the application program file and simulates a user operating it in the running state, obtains the network traffic information generated while the file runs, and then derives the developer information of the file from that network traffic information. As described above, the embodiment of the present application identifies the developer of an application program file by analyzing network traffic information; this identification approach is accurate and can effectively reduce the rate of missing or erroneous developer information. That is, it avoids the situations in which incomplete or inaccurate manual collection leaves developer information missing or wrong, or in which the acquired developer information is inaccurate or cannot be acquired at all.
In a possible implementation manner, the second obtaining module 704 is further configured to obtain network access requests generated in the running process of the application file until the number of the collected network access requests reaches a preset threshold; and taking the network access requests with the number reaching the preset threshold value as the network flow information.
In a possible implementation manner, the second obtaining module 704 is further configured to obtain, while the application program file is running, the network access requests generated by the application program file within a preset time length; and to take the network access requests generated within the preset time length as the network traffic information.
In a possible implementation manner, the identifying module 705 is further configured to obtain domain name information of each network access request in the network traffic information; for each network access request, inquiring website operator information matched with the domain name information of the network access request in a second database, wherein the second database is used for storing the corresponding relation between the domain name and the website operator; counting the occurrence times of the same website operator information in the acquired website operator information; and taking the website operator information with the occurrence frequency meeting the preset condition as the developer information of the application program file to be identified.
In a possible implementation manner, the identifying module 705 is further configured to use the website operator information ranked first by number of occurrences as the developer information of the application program file; or to use the website operator information ranked in the top N by number of occurrences as the developer information of the application program file, where N is a positive integer.
In one possible implementation, the apparatus further includes:
the storage module is used for establishing a corresponding relation between the application program file and the network access request, the corresponding access time, the corresponding domain name information and the corresponding website operator information for each network access request; and storing the established corresponding relation to a third database.
In one possible implementation, the apparatus further includes:
and the second processing module is used for aggregating the application program files related to the developers and executing removal processing on the application program files related to the developers when the developers of the application program files are malicious developers.
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
It should be noted that the division into functional modules described above is only an example of how the device provided in the above embodiments identifies the developer of an application program file. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the device embodiments and the method embodiments for identifying the developer of an application program file belong to the same concept; the specific implementation process is described in detail in the method embodiments and is not repeated here.
Fig. 8 is a schematic structural diagram of a developer identification device for an application program file according to an embodiment of the present application. The device 800 may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPUs) 801 and one or more memories 802, where the memory 802 stores at least one instruction that is loaded and executed by the processor 801 to implement the developer identification method for application program files provided by the above method embodiments. Of course, the device may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may include other components for implementing device functions, which are not described here.
In an exemplary embodiment, there is also provided a computer-readable storage medium, such as a memory comprising instructions executable by a processor in a terminal to perform the developer identification method for application program files in the above embodiments. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting it; any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall fall within its protection scope.

Claims (9)

1. A method for identifying a developer of an application file, the method comprising:
acquiring an application program file to be identified;
inquiring developer information matched with the application program file in a first database, wherein the first database is used for storing the corresponding relation between the application program file and the developer information;
when developer information matched with the application program file is not inquired, the application program file is pushed to a specified directory of a simulator, the simulator is used for monitoring whether a new application program file exists in the specified directory, and the simulator is used for communicating with a network agent;
when the simulator determines that a new application program file exists in the specified directory, the application program file is operated;
simulating, through the simulator, a user operating the application program file in a running state;
acquiring a network access request generated in the running process of the application program file through the network agent, and taking the network access request meeting a preset condition as network flow information;
acquiring domain name information of each network access request in the network traffic information;
for each network access request, inquiring website operator information matched with the domain name information of the network access request in a second database, wherein the second database is used for storing the corresponding relation between the domain name and the website operator;
counting the occurrence times of the same website operator information in the acquired website operator information;
and taking the website operator information with the occurrence frequency meeting the preset condition as the developer information of the application program file to be identified.
2. The method according to claim 1, wherein the obtaining, by the network agent, the network access request generated during the running of the application file, and taking the network access request meeting a predetermined condition as the network traffic information comprises:
acquiring network access requests generated by the application program file in the running process until the number of the collected network access requests reaches a preset threshold value;
and taking the network access requests with the number reaching the preset threshold value as the network flow information.
3. The method according to claim 1, wherein the obtaining, by the network agent, the network access request generated during the running of the application file, and taking the network access request meeting a predetermined condition as the network traffic information comprises:
in the running process of the application program file, acquiring a network access request generated by the application program file within a preset time length;
and taking the network access request generated in the preset time length as the network flow information.
4. The method according to claim 1, wherein the step of using the website operator information whose occurrence number satisfies a preset condition as the developer information of the application file to be identified comprises:
the website operator information with the occurrence frequency ranked at the top is used as the developer information of the application program file; or,
and the website operator information with the occurrence times ranked in the top N is used as the developer information of the application program file, and the value of N is a positive integer.
5. The method of claim 1, further comprising:
for each network access request, establishing a corresponding relation between the application program file and the network access request, corresponding access time, corresponding domain name information and corresponding website operator information;
and storing the established corresponding relation to a third database.
6. The method of claim 1, further comprising:
when the developer of the application program files is a malicious developer, the application program files related to the developer are aggregated, and removal processing is executed on the application program files related to the developer.
7. An apparatus for identifying a developer of an application file, the apparatus comprising:
the first acquisition module is used for acquiring an application program file to be identified;
the query module is used for querying developer information matched with the application program file in a first database, and the first database is used for storing the corresponding relation between the application program file and the developer information;
the first processing module is used for pushing the application program file to a specified directory of a simulator when developer information matched with the application program file is not inquired, the simulator is used for monitoring whether a new application program file exists in the specified directory, and the simulator is used for communicating with a network agent; when determining that a new application program file exists in the specified directory through the simulator, operating the application program file;
the first processing module is further used for simulating the application program file in the running state operated by the user through the simulator;
the second acquisition module is used for acquiring the network access request generated by the application program file in the running process through the network agent and taking the network access request meeting the preset conditions as network flow information;
the identification module is used for acquiring domain name information of each network access request in the network flow information; for each network access request, inquiring website operator information matched with the domain name information of the network access request in a second database, wherein the second database is used for storing the corresponding relation between the domain name and the website operator; counting the occurrence times of the same website operator information in the acquired website operator information; and taking the website operator information with the occurrence frequency meeting the preset condition as the developer information of the application program file to be identified.
8. An application file developer identification apparatus, the apparatus comprising a processor and a memory, the memory having stored therein at least one instruction, the at least one instruction being loaded and executed by the processor to implement the application file developer identification method according to any one of claims 1 to 6.
9. A storage medium having stored therein at least one instruction, which is loaded and executed by a processor to implement the developer identification method of an application file according to any one of claims 1 to 6.
CN201910365066.3A 2019-04-30 2019-04-30 Application program file developer identification method, device, equipment and storage medium Active CN110213234B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910365066.3A CN110213234B (en) 2019-04-30 2019-04-30 Application program file developer identification method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110213234A CN110213234A (en) 2019-09-06
CN110213234B CN110213234B (en) 2022-06-28

Family

ID=67786740

Country Status (1)

Country Link
CN (1) CN110213234B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110990427A (en) * 2019-12-16 2020-04-10 北京智游网安科技有限公司 Statistical method, system and storage medium for application program affiliated area
CN111046062B (en) * 2019-12-16 2023-06-23 北京智游网安科技有限公司 Applet data acquisition method, intelligent terminal and storage medium
CN111125771B (en) * 2019-12-31 2023-01-17 联想(北京)有限公司 Method and device for protecting equipment privacy, electronic equipment and storage medium
CN111414304A (en) * 2020-03-18 2020-07-14 北京京安佳新技术有限公司 APP feature identification method and device
CN113031995B (en) * 2021-05-21 2021-09-14 北京神州泰岳智能数据技术有限公司 Rule updating method and device, storage medium and electronic equipment
CN113691492B (en) * 2021-06-11 2023-04-07 杭州安恒信息安全技术有限公司 Method, system, device and readable storage medium for determining illegal application program

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5297279A (en) * 1990-05-30 1994-03-22 Texas Instruments Incorporated System and method for database management supporting object-oriented programming
CN1842008A (en) * 2005-04-01 2006-10-04 国际商业机器公司 Method and system for providing customized content over a network
CN102799662A (en) * 2012-07-10 2012-11-28 北京奇虎科技有限公司 Method, device and system for recommending website
CN103037312A (en) * 2011-10-08 2013-04-10 阿里巴巴集团控股有限公司 Message push method and message push device
CN107222369A (en) * 2017-07-07 2017-09-29 北京小米移动软件有限公司 Recognition methods, device, switch and the storage medium of application program
CN107622200A (en) * 2016-07-14 2018-01-23 腾讯科技(深圳)有限公司 The safety detecting method and device of application program

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100293524A1 (en) * 2009-05-12 2010-11-18 International Business Machines, Corporation Development environment for managing database aware software projects
US9015654B2 (en) * 2012-08-13 2015-04-21 Bitbar Technologies Oy System for providing test environments for executing and analysing test routines
US9632906B2 (en) * 2013-03-15 2017-04-25 Ca, Inc. Automated software system validity testing
CN107045508B (en) * 2016-02-05 2020-03-03 腾讯科技(深圳)有限公司 Application program processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant