CN113343238A - Application program identification method, device, storage medium and terminal - Google Patents

Info

Publication number
CN113343238A
Authority
CN
China
Prior art keywords
application program
identified
target
characteristic information
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110716221.9A
Other languages
Chinese (zh)
Inventor
吴建文
梁海兴
帅朝春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110716221.9A
Publication of CN113343238A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Stored Programmes (AREA)

Abstract

The application discloses an application program identification method, an application program identification device, a storage medium and a terminal, and relates to the technical field of application program development. First, first characteristic information of an application program to be identified is extracted, and second characteristic information of a target application program corresponding to the application program to be identified is acquired; then, the similarity between each piece of first characteristic information and its corresponding second characteristic information is calculated; finally, if the number of target similarities among the similarities is greater than a preset number threshold, the application program to be identified and the target application program are determined to be the same application program, where a target similarity is a similarity greater than a preset similarity threshold. Because whether the two are the same application program is decided by calculating the characteristic similarity between the application program to be identified and its target application program, the calculation amount and the identification time for identifying the application program can be reduced, and the identification efficiency is improved.

Description

Application program identification method, device, storage medium and terminal
Technical Field
The present application relates to the field of application development technologies, and in particular, to an application identification method, an application identification device, a storage medium, and a terminal.
Background
With the development of science and technology, more and more terminals appear in people's lives, and the types and number of application programs keep growing, so the security of application programs has become one of the key research topics for those skilled in the art.
A large number of pirated application programs exist in the current application market. Such pirated application programs are generally produced by maliciously modifying a genuine application program, or embedding a virus in it, and then repackaging it. In the related art, however, identifying application programs generally suffers from high performance consumption and long identification time.
Disclosure of Invention
The application provides an application program identification method, an application program identification device, a storage medium and a terminal, and can solve the technical problems that a large amount of performance is consumed and identification time is long when an application program is identified in the related art.
In a first aspect, an embodiment of the present application provides an application program identification method, where the method includes:
extracting first characteristic information of an application program to be identified, and acquiring second characteristic information of a target application program corresponding to the application program to be identified; respectively calculating the similarity between each piece of first characteristic information and second characteristic information corresponding to each piece of first characteristic information; and if the number of the target similarities in the similarities is larger than a preset number threshold, determining that the application program to be identified and the target application program are the same application program, wherein the target similarity is larger than a preset similarity threshold.
Optionally, before extracting the first feature information of the application to be identified, the method further includes: obtaining at least one sample application program, and extracting sample identity identification and sample characteristic information of each sample application program; and establishing a mapping relation between the sample identity of each sample application program and the sample characteristic information, and storing each mapping relation to a database.
Optionally, the obtaining second feature information of the target application program corresponding to the application program to be identified includes: acquiring a first identity identifier of the application program to be identified, and searching the database for a first sample identity identifier that matches the first identity identifier; and taking the first sample application program corresponding to the first sample identity identifier as the target application program, and taking the first sample characteristic information corresponding to the first sample identity identifier as the second characteristic information of the target application program.
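The lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the identity identifier is a package name and that the database is an in-memory dictionary; `SAMPLE_DATABASE`, `find_reference_features`, and all sample values are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical database: sample identity (assumed to be a package name)
# mapped to the sample feature sets of the genuine application.
FeatureMap = Dict[str, Set[str]]
SAMPLE_DATABASE: Dict[str, FeatureMap] = {
    "com.example.chat": {
        "components": {"MainActivity", "SyncService"},
        "permissions": {"INTERNET", "CAMERA"},
    },
}


def find_reference_features(
    identity: str, database: Dict[str, FeatureMap]
) -> Optional[FeatureMap]:
    """Return the sample features whose identity matches, or None if absent."""
    return database.get(identity)


# The matching sample features serve as the second characteristic information.
second_features = find_reference_features("com.example.chat", SAMPLE_DATABASE)
```

When no sample identity matches, the sketch returns `None`; how such a case should be handled is not specified in the text.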
Optionally, the calculating the similarity between each piece of first feature information and the second feature information corresponding to each piece of first feature information respectively includes: respectively calculating the similarity between each piece of first characteristic information and second characteristic information corresponding to each piece of first characteristic information based on a similarity calculation formula;
the similarity calculation formula is as follows:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
wherein A denotes the second feature information, B denotes the first feature information, A ∩ B denotes the intersection of the first feature information and the second feature information, A ∪ B denotes their union, and J(A, B) denotes the similarity between the first feature information and the second feature information.
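Assuming the similarity calculation formula is the standard Jaccard coefficient (the size of the intersection divided by the size of the union of the two feature sets), which the J(A, B) notation suggests, a minimal sketch of the calculation could look like this. The function name and the sample permission sets are hypothetical.

```python
from typing import Set


def jaccard_similarity(first: Set[str], second: Set[str]) -> float:
    """Return |A ∩ B| / |A ∪ B| for two feature sets."""
    union = first | second
    if not union:
        # Assumption: two empty feature sets are treated as identical.
        return 1.0
    return len(first & second) / len(union)


# Hypothetical permission feature sets, for illustration only.
reference_permissions = {"INTERNET", "CAMERA", "READ_CONTACTS"}
candidate_permissions = {"INTERNET", "CAMERA", "SEND_SMS"}

# Two shared entries out of four distinct entries in total.
similarity = jaccard_similarity(candidate_permissions, reference_permissions)
```

Treating each kind of feature information as a set keeps the comparison independent of the order in which features are extracted.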
Optionally, if the number of the target similarities in the similarities is greater than a preset number threshold, determining that the application program to be identified and the target application program are the same application program includes: and if the number of the target similarities in the similarities is larger than a preset number threshold and the first signature information of the application program to be identified is the same as the second signature information of the target application program, determining that the application program to be identified and the target application program are the same.
Optionally, the method further comprises: if the number of the target similarities in the similarities is smaller than or equal to a preset number threshold, determining that the application program to be identified and the target application program are not the same application program; and stopping running the application program to be identified and sending out warning information.
Optionally, the first characteristic information includes, but is not limited to, system component characteristics, authority characteristics, and resource file characteristics.
The beneficial effects brought by the technical scheme provided by some embodiments of the application at least comprise:
the application provides an application program identification method, which comprises the steps of firstly extracting first characteristic information of an application program to be identified, and acquiring second characteristic information of a target application program corresponding to the application program to be identified; then respectively calculating the similarity between each piece of first characteristic information and second characteristic information corresponding to each piece of first characteristic information; and finally, if the number of the target similarities in the similarities is larger than a preset number threshold, determining that the application program to be identified and the target application program are the same application program, wherein the target similarity is larger than a preset similarity threshold. Because the characteristic similarity between the target application programs of the application programs to be identified is calculated to identify whether the application programs to be identified and the target application programs are the same application programs, the calculation amount and the identification time for identifying the application programs can be reduced, and the identification efficiency of the application programs is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is an exemplary system architecture diagram of an application identification method according to an embodiment of the present application;
Fig. 2 is a system interaction diagram of an application program identification method according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of an application program identification method according to an embodiment of the present application;
Fig. 4 is a flowchart illustrating an application program identification method according to another embodiment of the present application;
Fig. 5 is a schematic structural diagram of an application identification apparatus according to another embodiment of the present application;
Fig. 6 is a schematic structural diagram of an application identification apparatus according to another embodiment of the present application;
Fig. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
In order to make the features and advantages of the present application more obvious and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
Fig. 1 is an exemplary system architecture diagram of an application identification method according to an embodiment of the present application.
As shown in fig. 1, the system architecture may include a terminal 101, a network 102, and a server 103. The network 102 is the medium used to provide communication links between the terminal 101 and the server 103. The network 102 may include various types of wired or wireless communication links; for example, the wired communication link includes an optical fiber, a twisted pair, or a coaxial cable, and the wireless communication link includes a Bluetooth communication link, a Wireless-Fidelity (Wi-Fi) communication link, a microwave communication link, or the like.
The terminal 101 may interact with the server 103 through the network 102 to receive messages from the server 103 or to send messages to the server 103. The terminal 101 may be hardware or software. When the terminal 101 is hardware, it can be a variety of electronic devices including, but not limited to, smart watches, smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When the terminal 101 is software, it may be installed in the electronic devices listed above, and it may be implemented as multiple software or software modules (for example, for providing distributed services), or as a single software or software module, and is not limited in this respect.
The server 103 may be a business server providing various services. The server 103 may be hardware or software. When the server 103 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 103 is software, it may be implemented as a plurality of software or software modules (for example, to provide distributed services), or may be implemented as a single software or software module, and is not limited in particular herein.
It should be understood that the number of terminals, networks, and servers in fig. 1 is merely illustrative, and that any number of terminals, networks, and servers may be used, as desired for an implementation.
Referring to fig. 2, fig. 2 is a system interaction diagram of an application program identification method according to an embodiment of the present application. It can be understood that, in the embodiment of the present application, the execution subject may be a terminal, a processor in the terminal, or a service in the terminal that executes the application program identification method. For convenience of description, the system interaction process of the application program identification method is described below with reference to fig. 1 and fig. 2, taking the processor in the terminal as the execution subject.
S201, the user runs the application program to be identified.
Any application program in the terminal can be determined as the application program to be identified according to the requirement.
S202, the processor extracts first characteristic information of the application program to be identified and obtains second characteristic information of a target application program corresponding to the application program to be identified.
Optionally, before extracting the first feature information of the application to be identified, the method further includes: obtaining at least one sample application program, and extracting sample identity identification and sample characteristic information of each sample application program; and establishing a mapping relation between the sample identity of each sample application program and the sample characteristic information, and storing each mapping relation to a database.
Optionally, the obtaining second feature information of the target application program corresponding to the application program to be identified includes: acquiring a first identity identifier of the application program to be identified, and searching the database for a first sample identity identifier that matches the first identity identifier; and taking the first sample application program corresponding to the first sample identity identifier as the target application program, and taking the first sample characteristic information corresponding to the first sample identity identifier as the second characteristic information of the target application program.
S203, the processor calculates the similarity between each piece of first characteristic information and the corresponding second characteristic information of each piece of first characteristic information.
Optionally, the calculating a similarity between each piece of first feature information and second feature information corresponding to each piece of first feature information respectively includes: respectively calculating the similarity between each piece of first characteristic information and second characteristic information corresponding to each piece of first characteristic information based on a similarity calculation formula;
the similarity calculation formula is as follows:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
where A denotes the second feature information, B denotes the first feature information, A ∩ B denotes the intersection of the first feature information and the second feature information, A ∪ B denotes their union, and J(A, B) denotes the similarity between the first feature information and the second feature information.
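As a worked illustration with hypothetical feature sets, and assuming the formula is the Jaccard coefficient (intersection size over union size), take a reference permission set A with four entries and a candidate set B that shares three of them:

```latex
\begin{aligned}
A &= \{p_1, p_2, p_3, p_4\}, \qquad B = \{p_1, p_2, p_3, p_5\} \\
|A \cap B| &= 3, \qquad |A \cup B| = 5 \\
J(A, B) &= \frac{|A \cap B|}{|A \cup B|} = \frac{3}{5} = 0.6
\end{aligned}
```

With a hypothetical preset similarity threshold of 0.5, this similarity would count as a target similarity; with a threshold of 0.8 it would not.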
S204, if the number of target similarities among the similarities is greater than a preset number threshold, the processor determines that the application program to be identified and the target application program are the same application program, where a target similarity is a similarity greater than the preset similarity threshold.
Optionally, if the number of the target similarities in the similarities is greater than a preset number threshold, determining that the application program to be identified and the target application program are the same application program, including: and if the number of the target similarities in the similarities is larger than a preset number threshold and the first signature information of the application program to be identified is the same as the second signature information of the target application program, determining that the application program to be identified and the target application program are the same.
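The combined condition can be sketched as below. The text does not say what form the signature information takes; this sketch assumes it is a SHA-256 digest of the signing certificate, and the function names and certificate bytes are hypothetical.

```python
import hashlib
import hmac


def certificate_digest(certificate_bytes: bytes) -> str:
    """Hex SHA-256 digest of a signing certificate (assumed signature form)."""
    return hashlib.sha256(certificate_bytes).hexdigest()


def is_same_application(
    target_similarity_count: int,
    count_threshold: int,
    candidate_digest: str,
    reference_digest: str,
) -> bool:
    """Both conditions must hold: enough target similarities, same signature."""
    enough_matches = target_similarity_count > count_threshold
    same_signature = hmac.compare_digest(candidate_digest, reference_digest)
    return enough_matches and same_signature


# Hypothetical certificates, for illustration only.
genuine = certificate_digest(b"hypothetical genuine certificate")
repacked = certificate_digest(b"hypothetical attacker certificate")

verdict_genuine = is_same_application(3, 2, genuine, genuine)
verdict_repacked = is_same_application(3, 2, repacked, genuine)
```

In this sketch a repackaged application with near-identical features is still rejected when its signature differs from the reference.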
Optionally, if the number of the target similarities in the similarities is less than or equal to a preset number threshold, determining that the application program to be identified and the target application program are not the same application program; and stopping running the application program to be identified and sending out warning information.
Optionally, the first characteristic information includes, but is not limited to, system component characteristics, authority characteristics, resource file characteristics.
In the embodiment of the application, first characteristic information of an application program to be identified is extracted, and second characteristic information of a target application program corresponding to the application program to be identified is acquired; then, the similarity between each piece of first characteristic information and its corresponding second characteristic information is calculated; finally, if the number of target similarities among the similarities is greater than a preset number threshold, the application program to be identified and the target application program are determined to be the same application program, where a target similarity is a similarity greater than a preset similarity threshold. Because whether the two are the same application program is decided by calculating the characteristic similarity between the application program to be identified and its target application program, the calculation amount and the identification time for identifying the application program can be reduced, and the identification efficiency is improved.
Referring to fig. 3, fig. 3 is a flowchart illustrating an application program identification method according to an embodiment of the present disclosure.
As shown in fig. 3, the method includes:
S301, extracting first characteristic information of the application program to be identified, and acquiring second characteristic information of a target application program corresponding to the application program to be identified.
It can be understood that the application program identification method provided by the embodiment of the present application is applied to a terminal. The terminal may be provided with various types of application programs, for example, a chat application program, a video and audio application program, or a shooting application program, and a user may download application programs from an application market and install them in the terminal. However, a large number of pirated application programs exist in the current application market. Such pirated application programs are generally produced by maliciously modifying a genuine application program, or embedding a virus in it, and then repackaging it. After an application program is installed in the terminal, if it is not checked in time to determine whether it is genuine or pirated, a pirated application program is likely to run in the terminal, which in severe cases may cause a significant loss to the user.
In the embodiment of the application, on one hand, any application program that needs to be identified can be used as the application program to be identified; on the other hand, an application program that is specified or has specified characteristics can be used as the application program to be identified. For example, an application program that has just been installed and is running for the first time can be used as the application program to be identified, or the application programs in the terminal can be identified as application programs to be identified at preset time intervals.
After the application program to be identified is determined to be in the running state, its feature information can be extracted in time and used as the first feature information. The method for extracting the feature information is not limited. The feature information of the application program to be identified may be regarded as information representing a terminal operation attribute or an operation state, and it is static feature information. The feature information may include, but is not limited to, system component features, authority features, and resource file features; accordingly, the first feature information may also include, but is not limited to, system component features, authority features, and resource file features.
For example, in the Android system, the system component features may be: an Activity component feature, for representing functionality; a Service component feature, for running services in the background without providing an interface; a Broadcast Receiver component feature, for receiving broadcasts; and a Content Provider component feature, which supports storing and reading data across multiple applications and is equivalent to a database. It is understood that the feature information in the embodiment of the present application may also be any other feature information in the terminal, and the foregoing examples should not be taken as a limitation on the feature information.
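One way such static features might be collected is by reading the component and permission declarations from the application's manifest. The sketch below assumes the manifest has already been decoded into plain-text XML (manifests inside an installed package are stored in a binary form) and uses only the Python standard library; `extract_manifest_features` and the sample manifest are hypothetical and not taken from the patent.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set

# XML namespace used by android:* attributes in a manifest.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def extract_manifest_features(manifest_xml: str) -> Dict[str, Set[str]]:
    """Collect declared component names per type, plus requested permissions."""
    root = ET.fromstring(manifest_xml)
    features: Dict[str, Set[str]] = {tag: set() for tag in COMPONENT_TAGS}
    features["permission"] = set()
    for tag in COMPONENT_TAGS:
        for element in root.iter(tag):
            name = element.get(ANDROID_NS + "name")
            if name:
                features[tag].add(name)
    for element in root.iter("uses-permission"):
        name = element.get(ANDROID_NS + "name")
        if name:
            features["permission"].add(name)
    return features


# Hypothetical decoded manifest, for illustration only.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.chat">
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <activity android:name=".MainActivity"/>
    <service android:name=".SyncService"/>
    <receiver android:name=".BootReceiver"/>
  </application>
</manifest>"""

features = extract_manifest_features(SAMPLE_MANIFEST)
```

Each entry in the resulting dictionary is one piece of feature information that can be compared against the corresponding entry of the target application program.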
Further, after the first feature information of the application program to be identified is extracted, reference feature information may be acquired so that the first feature information can be compared against it. Therefore, in the embodiment of the present application, the genuine application program corresponding to the application program to be identified may be determined as the target application program, and the feature information of the target application program is acquired and used as the second feature information. The target application program may be the genuine application program developed by the manufacturer corresponding to the application program to be identified, so the second feature information of the target application program can serve as the reference feature information for the first feature information of the application program to be identified.
S302, respectively calculating the similarity between each piece of first characteristic information and the corresponding second characteristic information of each piece of first characteristic information.
It can be understood that, once the second feature information is used as the reference feature information, each piece of first feature information may be compared with its corresponding second feature information, and whether the application program to be identified and the target application program are the same application program may be determined according to the comparison result. The comparison method is not specifically limited. In one possible implementation, the similarity between each piece of first feature information and its corresponding second feature information is calculated separately. The similarity represents how similar a piece of first feature information is to its corresponding second feature information, and it may be a specific numerical value, which makes the comparison convenient.
S303, if the number of target similarities among the similarities is greater than a preset number threshold, determining that the application program to be identified and the target application program are the same application program, where a target similarity is a similarity greater than the preset similarity threshold.
After the similarity between each piece of first feature information and its corresponding second feature information is determined, each similarity may be compared with a preset similarity threshold, and every similarity greater than the preset similarity threshold is determined as a target similarity. The preset similarity threshold is the minimum similarity at which a piece of first feature information and its corresponding second feature information are judged to be the same feature information. Therefore, when one or more similarities are determined as target similarities, the first feature information and the second feature information corresponding to each target similarity may be regarded as the same feature information.
Because the application program to be identified may have a plurality of pieces of first feature information, a majority voting mechanism may be adopted to identify it accurately. Specifically, the number of target similarities among the similarities is counted and compared with a preset number threshold, where the preset number threshold is the minimum number used to determine that the application program to be identified and the target application program are the same application program.
If the number of target similarities among the similarities is greater than the preset number threshold, most of the feature information in the application program to be identified is consistent with the feature information in the target application program; the application program to be identified and the target application program are determined to be the same application program, and the application program to be identified can continue to run. If the number of target similarities among the similarities is less than or equal to the preset number threshold, most of the feature information in the application program to be identified is inconsistent with the feature information in the target application program; it is determined that the application program to be identified and the target application program are not the same application program, and the application program to be identified may be a pirated application program. Continuing to run it is likely to cause a loss to the user, so relevant measures may be taken to handle the application program to be identified.
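The two-threshold decision can be sketched end to end as follows. This is an illustrative sketch only: the similarity is assumed to be the Jaccard coefficient, and the threshold values, feature kinds, and the returned action labels are hypothetical.

```python
from typing import Dict, Set

SIMILARITY_THRESHOLD = 0.8  # hypothetical preset similarity threshold
COUNT_THRESHOLD = 1         # hypothetical preset number threshold


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Assumed similarity measure: intersection size over union size."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def identify(first: Dict[str, Set[str]], second: Dict[str, Set[str]]) -> str:
    """Return 'continue' if judged the same application, else 'stop_and_warn'."""
    similarities = [
        jaccard(first.get(kind, set()), reference)
        for kind, reference in second.items()
    ]
    # Count the similarities that exceed the similarity threshold.
    target_count = sum(1 for s in similarities if s > SIMILARITY_THRESHOLD)
    return "continue" if target_count > COUNT_THRESHOLD else "stop_and_warn"


# Hypothetical feature sets, for illustration only.
reference = {
    "components": {"MainActivity", "SyncService"},
    "permissions": {"INTERNET"},
    "resources": {"icon.png", "strings.xml"},
}
tampered = {
    "components": {"MainActivity", "SyncService", "HiddenService"},
    "permissions": {"INTERNET", "SEND_SMS"},
    "resources": {"icon.png", "strings.xml"},
}

genuine_action = identify(reference, reference)
tampered_action = identify(tampered, reference)
```

In the tampered case only the resource features exceed the similarity threshold, so the count of target similarities does not exceed the number threshold and the sketch reports that the application should be stopped with a warning.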
The similarity between each first characteristic information and the second characteristic information corresponding to each first characteristic information is calculated, so that the similar characteristic information between the application program to be identified and the target application program is determined, and finally whether the application program to be identified and the target application program are the same application program is determined according to the quantity of the similar characteristic information.
In the embodiment of the application, first characteristic information of an application program to be identified is extracted, and second characteristic information of a target application program corresponding to the application program to be identified is acquired; then, the similarity between each piece of first characteristic information and its corresponding second characteristic information is calculated; finally, if the number of target similarities among the similarities is greater than a preset number threshold, the application program to be identified and the target application program are determined to be the same application program, where a target similarity is a similarity greater than a preset similarity threshold. Because whether the two are the same application program is decided by calculating the characteristic similarity between the application program to be identified and its target application program, the calculation amount and the identification time for identifying the application program can be reduced, and the identification efficiency is improved.
Referring to fig. 4, fig. 4 is a flowchart illustrating an application program identification method according to another embodiment of the present application.
As shown in fig. 4, the method includes:
S401, obtaining at least one sample application program, and extracting the sample identity and sample characteristic information of each sample application program.
In the embodiment of the application, in order to accurately and quickly acquire the second feature information of the target application corresponding to the application to be identified, a database for storing the second feature information of the target application may be set in advance.
Specifically, at least one sample application program may first be obtained. The sample application programs may be chosen as needed and may be one or more genuine application programs in an application market. A sample identity and sample characteristic information of each sample application program are then extracted. The sample identity is an identifier that can represent the identity of the sample application program; for example, it may be the package name or version number of the sample application program. The sample characteristic information is information representing the attributes or operating state of the sample application program and, like the first characteristic information, may include, but is not limited to, system component characteristics, authority characteristics, and resource file characteristics.
S402, establishing a mapping relation between the sample identity of each sample application program and the sample characteristic information, and storing each mapping relation to a database.
Optionally, after the sample identity and the sample feature information of each sample application program are extracted, a mapping relationship may be established between the sample identity and the sample feature information of each sample application program, so that the sample identity and the sample feature information of each sample application program are in one-to-one correspondence, and each mapping relationship is stored in a database, so as to facilitate invocation of subsequent steps.
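Steps S401 and S402 can be sketched as follows. This is only an illustration under stated assumptions: the text says "a database" without naming one, so an in-memory SQLite database is used here for convenience, and the table name `sample_features`, the function `build_sample_database`, and the sample package name and features are all hypothetical.

```python
import json
import sqlite3


def build_sample_database(samples):
    """Store a one-to-one mapping from sample identity to sample features.

    `samples` is an iterable of (identity, features) pairs, where `features`
    maps a feature category name to a list of feature strings.
    """
    conn = sqlite3.connect(":memory:")  # in-memory database, illustration only
    conn.execute(
        "CREATE TABLE sample_features ("
        "identity TEXT PRIMARY KEY, features TEXT NOT NULL)"
    )
    for identity, features in samples:
        # Sort each feature list so the stored JSON is deterministic.
        payload = json.dumps({k: sorted(v) for k, v in features.items()})
        conn.execute(
            "INSERT INTO sample_features (identity, features) VALUES (?, ?)",
            (identity, payload),
        )
    conn.commit()
    return conn


# Hypothetical genuine sample application.
db = build_sample_database([
    ("com.example.app", {
        "components": ["MainActivity", "SyncService"],
        "permissions": ["INTERNET"],
        "resources": ["res/icon.png"],
    }),
])
row_count = db.execute("SELECT COUNT(*) FROM sample_features").fetchone()[0]
```

Using the identity as the primary key enforces the one-to-one correspondence between a sample identity and its characteristic information.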
S403, extracting first characteristic information of the application program to be identified, acquiring a first identity of the application program to be identified, and searching the database for a first sample identity that matches the first identity.
When it is detected that the application program to be identified is running, the first characteristic information of the application program to be identified can be extracted, and the identity of the application program to be identified is then obtained as the first identity. The database is searched for a sample identity that matches the first identity, and the matching sample identity serves as the first sample identity.
S404, taking the first sample application program corresponding to the first sample identity as the target application program, and taking the first sample characteristic information corresponding to the first sample identity as the second characteristic information of the target application program.
Optionally, after the first sample identity is determined, the first sample application program corresponding to the first sample identity may be used as the target application program. According to the mapping relationship stored in the database, the sample characteristic information corresponding to the first sample identity is looked up and used as the first sample characteristic information, which is finally used as the second characteristic information of the target application program.
In the above step, because the database stores the sample identity and the sample feature information of each sample application program, the second feature information of the target application program corresponding to the application program to be identified can be quickly obtained based on the database.
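The lookup in S403 and S404 can be sketched as a keyed search. This is a minimal sketch: a plain dictionary stands in for the database, the names `FeatureInfo` and `find_second_feature_info` are hypothetical, and returning `None` when nothing matches is an assumption, since the text does not say how that case is handled.

```python
from typing import Dict, Optional, Set

# Feature category name -> set of feature strings (hypothetical representation).
FeatureInfo = Dict[str, Set[str]]


def find_second_feature_info(
    first_identity: str, database: Dict[str, FeatureInfo]
) -> Optional[FeatureInfo]:
    """Look up the sample whose identity matches the first identity.

    The matching sample feature information serves as the second feature
    information of the target application. None is returned when no sample
    identity matches (an assumption of this sketch).
    """
    return database.get(first_identity)


# Hypothetical database content; the package name is invented.
sample_db: Dict[str, FeatureInfo] = {
    "com.example.app": {
        "permissions": {"INTERNET", "CAMERA"},
        "components": {"MainActivity"},
    }
}
second_info = find_second_feature_info("com.example.app", sample_db)
missing_info = find_second_feature_info("com.unknown.app", sample_db)
```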
S405, calculating, based on a similarity calculation formula, the similarity between each piece of first characteristic information and its corresponding second characteristic information.
In this embodiment of the application, the similarity between each piece of first feature information and its corresponding second feature information may be calculated based on a similarity calculation formula. Specifically, the similarity calculation formula may be derived from similarity formulas used in data mining.
The similarity calculation formula is as follows:
$$J(A, B) = \frac{|A \cap B|}{|A|}$$
where A denotes the second feature information, B denotes the first feature information, A ∩ B denotes the intersection of the first feature information and the second feature information, and J(A, B) denotes the similarity between the first feature information and the second feature information.
In the similarity calculation formula, the denominator is the second feature information, so obfuscation or interference features added to the application program to be identified do not affect the similarity calculation.
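The effect of using only the second feature information in the denominator can be illustrated with a short sketch. It assumes the formula has the form |A ∩ B| / |A|, which is consistent with the statement above that the denominator is the second feature information; the function names, the feature strings, and the decision to raise an error for an empty reference set are hypothetical. The standard Jaccard index is included only for comparison.

```python
from typing import Iterable


def similarity(second_features: Iterable[str], first_features: Iterable[str]) -> float:
    """Compute |A ∩ B| / |A|, with A the second and B the first feature information."""
    a, b = set(second_features), set(first_features)
    if not a:
        # Assumption of this sketch: an empty reference set is an error.
        raise ValueError("second feature information must not be empty")
    return len(a & b) / len(a)


def jaccard(second_features: Iterable[str], first_features: Iterable[str]) -> float:
    """Standard Jaccard index |A ∩ B| / |A ∪ B|, for comparison only.

    Assumes at least one of the two sets is non-empty.
    """
    a, b = set(second_features), set(first_features)
    return len(a & b) / len(a | b)


genuine = {"INTERNET", "CAMERA", "READ_CONTACTS", "VIBRATE"}
# The same feature set with two invented interference features added.
padded = genuine | {"JUNK_1", "JUNK_2"}

score = similarity(genuine, padded)        # 4 / 4
jaccard_score = jaccard(genuine, padded)   # 4 / 6
```

In this example the added features lower the Jaccard index but leave the one-sided measure unchanged. The measure is one-directional by construction: features present only in the application to be identified never lower the score.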
S406, if the number of target similarities among the similarities is greater than a preset number threshold and the first signature information of the application program to be identified is the same as the second signature information of the target application program, determining that the application program to be identified and the target application program are the same application program.
After it is determined that the number of target similarities among the similarities is greater than the preset number threshold, the signature information of the application program to be identified can be acquired as the first signature information. Because signature information carries information about the software developer, it can also be used to determine whether an application program is a pirated application program. The first signature information of the application program to be identified is then compared with the second signature information of the target application program, and if the two are the same, it is determined that the application program to be identified and the target application program are the same application program.
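The two-stage check in S406 can be sketched as follows. This is an illustration under stated assumptions: signature information is represented as a byte string (for example a certificate digest), the function name `verify_same_application` is hypothetical, and treating a signature mismatch as "not the same application" is an assumption, because the text only describes the case where the signatures match.

```python
import hashlib
import hmac


def verify_same_application(
    target_similarity_count: int,
    count_threshold: int,
    first_signature: bytes,
    second_signature: bytes,
) -> bool:
    """Apply the count check first, then compare signature information."""
    if target_similarity_count <= count_threshold:
        return False
    # compare_digest performs a constant-time comparison of the two values.
    return hmac.compare_digest(first_signature, second_signature)


# Hypothetical signature digests derived from invented certificate bytes.
genuine_sig = hashlib.sha256(b"hypothetical developer certificate").digest()
repack_sig = hashlib.sha256(b"hypothetical re-signing certificate").digest()

accepted = verify_same_application(3, 2, genuine_sig, genuine_sig)
rejected = verify_same_application(3, 2, repack_sig, genuine_sig)
```

Checking the count first means the signature comparison is only reached for applications whose features already resemble the target.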
S407, if the number of the target similarities in the similarities is smaller than or equal to a preset number threshold, determining that the application program to be identified and the target application program are not the same application program.
If the number of target similarities among the similarities is less than or equal to the preset number threshold, most of the feature information of the application program to be identified is inconsistent with the feature information of the target application program. It is therefore determined that the application program to be identified and the target application program are not the same application program and that the application program to be identified may be a pirated application program. Because continuing to run it is likely to cause a loss to the user, relevant measures may be taken to handle the application program to be identified.
S408, stopping the application program to be identified and issuing warning information.
Specifically, when the application program to be identified and the target application program are not the same application program, the application program to be identified may be stopped to avoid the loss that continuing to run it could cause to the user, and warning information is issued to inform the user that the application program to be identified is a pirated application program.
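The handling in S408 can be sketched as a small handler. This is only a sketch: the `RunningApp` class, the `handle_identification_result` function, and the `warn` callback are hypothetical stand-ins for whatever process control and notification mechanism the terminal provides, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunningApp:
    """Hypothetical handle for an application that is currently running."""
    identity: str
    running: bool = True

    def stop(self) -> None:
        self.running = False


def handle_identification_result(
    app: RunningApp, is_same: bool, warn: Callable[[str], None]
) -> None:
    """Stop the application and warn the user when it is not the target application."""
    if is_same:
        return  # same application: it may continue to run
    app.stop()
    warn(f"{app.identity} may be a pirated application and has been stopped.")


warnings_issued = []
suspect = RunningApp(identity="com.example.app")
handle_identification_result(suspect, is_same=False, warn=warnings_issued.append)
```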
In the embodiment of the application, first characteristic information of an application program to be identified is extracted, and second characteristic information of a target application program corresponding to the application program to be identified is obtained; the similarity between each piece of first characteristic information and its corresponding second characteristic information is then calculated; finally, if the number of target similarities among the similarities is greater than a preset number threshold, it is determined that the application program to be identified and the target application program are the same application program, where a target similarity is a similarity greater than a preset similarity threshold. Because the two application programs are compared through the similarity of their characteristic information, the calculation amount and the identification time required to identify the application program can be reduced, and the identification efficiency is improved.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an application program identification device according to another embodiment of the present application.
As shown in fig. 5, the application recognition apparatus 500 includes:
The feature obtaining module 510 is configured to extract first feature information of the application program to be identified, and obtain second feature information of a target application program corresponding to the application program to be identified. The first characteristic information includes, but is not limited to, system component characteristics, authority characteristics, and resource file characteristics.
The similarity calculating module 520 is configured to calculate similarities between each piece of first feature information and the second feature information corresponding to each piece of first feature information.
The comparing module 530 is configured to determine that the application program to be identified and the target application program are the same application program if the number of the target similarities in the similarities is greater than a preset number threshold, where the target similarities are greater than a preset similarity threshold.
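The extraction performed by the feature obtaining module can be sketched for one possible platform. This sketch assumes an Android-style package whose manifest has already been decoded to text XML (manifests inside real packages are stored in a binary form); the text itself does not name a platform. The function name `extract_feature_info`, the sample manifest, and the mapping of "system component", "authority", and "resource file" characteristics to manifest components, permissions, and file paths are all assumptions.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def extract_feature_info(manifest_xml: str, resource_paths):
    """Collect component, permission, and resource-file features as sets."""
    root = ET.fromstring(manifest_xml)
    permissions = {
        el.get(ANDROID_NS + "name") for el in root.iter("uses-permission")
    }
    components = {
        el.get(ANDROID_NS + "name")
        for tag in COMPONENT_TAGS
        for el in root.iter(tag)
    }
    # Drop entries for elements that lack an android:name attribute.
    permissions.discard(None)
    components.discard(None)
    return {
        "components": components,
        "permissions": permissions,
        "resources": set(resource_paths),
    }


# Invented manifest used purely for illustration.
MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <activity android:name=".MainActivity"/>
    <service android:name=".SyncService"/>
  </application>
</manifest>
""".strip()

feature_info = extract_feature_info(MANIFEST, ["res/icon.png"])
```

Representing each feature category as a set makes the later intersection-based similarity calculation straightforward.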
Referring to fig. 6, fig. 6 is a schematic structural diagram of an application identification device according to another embodiment of the present application.
As shown in fig. 6, the application recognition apparatus 600 includes:
The sample obtaining module 610 is configured to obtain at least one sample application program, and extract a sample identity and sample feature information of each sample application program.
The data storage module 620 is configured to establish a mapping relationship between the sample identity of each sample application program and the sample feature information, and store each mapping relationship in the database.
The first matching module 630 is configured to extract first feature information of the application program to be identified, obtain a first identity of the application program to be identified, and search the database for a first sample identity that matches the first identity.
The second matching module 640 is configured to use the first sample application program corresponding to the first sample identity as the target application program, and use the first sample characteristic information corresponding to the first sample identity as the second characteristic information of the target application program.
The calculating module 650 is configured to calculate similarity between each first feature information and the second feature information corresponding to each first feature information based on a similarity calculation formula.
The similarity calculation formula is as follows:
$$J(A, B) = \frac{|A \cap B|}{|A|}$$
where A denotes the second feature information, B denotes the first feature information, A ∩ B denotes the intersection of the first feature information and the second feature information, and J(A, B) denotes the similarity between the first feature information and the second feature information.
The first verification module 660 is configured to determine that the application to be identified and the target application are the same application if the number of the target similarities in the similarities is greater than the preset number threshold and the first signature information of the application to be identified is the same as the second signature information of the target application.
The second verification module 670 is configured to determine that the application to be identified and the target application are not the same application if the number of the target similarities in the similarities is smaller than or equal to the preset number threshold.
The reminding module 680 is configured to stop the application program to be identified and issue warning information.
In an embodiment of the present application, an application program identification device includes: a characteristic acquisition module, used for extracting first characteristic information of the application program to be identified and acquiring second characteristic information of a target application program corresponding to the application program to be identified; a similarity calculation module, used for calculating the similarity between each piece of first characteristic information and its corresponding second characteristic information; and a comparison module, used for determining that the application program to be identified and the target application program are the same application program if the number of target similarities among the similarities is greater than a preset number threshold, where a target similarity is a similarity greater than a preset similarity threshold. Because the two application programs are compared through the similarity of their characteristic information, the calculation amount and the identification time required to identify the application program can be reduced, and the identification efficiency is improved.
Embodiments of the present application also provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the method according to any of the above embodiments.
Further, please refer to fig. 7, where fig. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application. As shown in fig. 7, the terminal 700 may include: at least one central processor 701, at least one network interface 704, a user interface 703, a memory 705, at least one communication bus 702.
Wherein a communication bus 702 is used to enable connective communication between these components.
The user interface 703 may include a display screen (Display) and a camera (Camera); optionally, the user interface 703 may also include a standard wired interface and a standard wireless interface.
The network interface 704 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
The central processor 701 may include one or more processing cores. The central processor 701 connects the various parts of the terminal 700 using various interfaces and lines, and performs the functions of the terminal 700 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 705 and by calling data stored in the memory 705. Optionally, the central processor 701 may be implemented in at least one hardware form among Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The central processor 701 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU renders and draws the content to be displayed on the display screen; the modem handles wireless communication. It is understood that the modem may also not be integrated into the central processor 701 and may instead be implemented as a separate chip.
The memory 705 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 705 includes a non-transitory computer-readable medium. The memory 705 may be used to store instructions, programs, code sets, or instruction sets. The memory 705 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the method embodiments described above, and the like; the data storage area may store the data involved in the method embodiments described above. Optionally, the memory 705 may be at least one storage device located remotely from the central processor 701. As shown in fig. 7, the memory 705, as a kind of computer storage medium, may include an operating system, a network communication module, a user interface module, and an application program identification program.
In the terminal 700 shown in fig. 7, the user interface 703 is mainly used to provide an input interface for the user and to obtain data input by the user; the central processor 701 may be configured to call the application program identification program stored in the memory 705, and specifically perform the following operations:
extracting first characteristic information of an application program to be identified, and acquiring second characteristic information of a target application program corresponding to the application program to be identified;
respectively calculating the similarity between each piece of first characteristic information and second characteristic information corresponding to each piece of first characteristic information;
and if the number of the target similarities in the similarities is larger than a preset number threshold, determining that the application program to be identified and the target application program are the same application program, wherein the target similarity is larger than a preset similarity threshold.
Optionally, before extracting the first feature information of the application to be identified, the method further includes: obtaining at least one sample application program, and extracting sample identity identification and sample characteristic information of each sample application program; and establishing a mapping relation between the sample identity of each sample application program and the sample characteristic information, and storing each mapping relation to a database.
Optionally, the obtaining second feature information of the target application program corresponding to the application program to be identified includes: acquiring a first identity of the application program to be identified, and searching the database for a first sample identity that matches the first identity; and taking the first sample application program corresponding to the first sample identity as the target application program, and taking the first sample characteristic information corresponding to the first sample identity as the second characteristic information of the target application program.
Optionally, the calculating a similarity between each piece of first feature information and second feature information corresponding to each piece of first feature information respectively includes: respectively calculating the similarity between each piece of first characteristic information and second characteristic information corresponding to each piece of first characteristic information based on a similarity calculation formula;
the similarity calculation formula is as follows:
$$J(A, B) = \frac{|A \cap B|}{|A|}$$
where A denotes the second feature information, B denotes the first feature information, A ∩ B denotes the intersection of the first feature information and the second feature information, and J(A, B) denotes the similarity between the first feature information and the second feature information.
Optionally, the determining that the application program to be identified and the target application program are the same application program if the number of target similarities among the similarities is greater than a preset number threshold includes: if the number of target similarities among the similarities is greater than the preset number threshold and the first signature information of the application program to be identified is the same as the second signature information of the target application program, determining that the application program to be identified and the target application program are the same application program.
Optionally, the method further comprises: if the number of the target similarities in the similarities is smaller than or equal to a preset number threshold, determining that the application program to be identified and the target application program are not the same application program; and stopping running the application program to be identified and sending out warning information.
Optionally, the first characteristic information includes, but is not limited to, system component characteristics, authority characteristics, resource file characteristics.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of modules is merely a division of logical functions, and an actual implementation may have another division, for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
Modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
It should be noted that, for simplicity of description, the above method embodiments are described as a series of combined acts, but those skilled in the art should understand that the present application is not limited by the described order of acts, because some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules involved are not necessarily required by the present application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The above describes the application program identification method, device, storage medium and terminal provided by the present application. For those skilled in the art, the specific implementation and the scope of application may vary according to the ideas of the embodiments of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. An application identification method, the method comprising:
extracting first characteristic information of an application program to be identified, and acquiring second characteristic information of a target application program corresponding to the application program to be identified;
respectively calculating the similarity between each piece of first characteristic information and second characteristic information corresponding to each piece of first characteristic information;
and if the number of the target similarities in the similarities is larger than a preset number threshold, determining that the application program to be identified and the target application program are the same application program, wherein the target similarity is larger than a preset similarity threshold.
2. The method according to claim 1, wherein before extracting the first feature information of the application to be identified, the method further comprises:
obtaining at least one sample application program, and extracting sample identity identification and sample characteristic information of each sample application program;
and establishing a mapping relation between the sample identity of each sample application program and the sample characteristic information, and storing each mapping relation to a database.
3. The method according to claim 2, wherein the obtaining second feature information of the target application corresponding to the application to be identified includes:
acquiring a first identity of the application program to be identified, and searching the database for a first sample identity that matches the first identity;
and taking the first sample application program corresponding to the first sample identity as a target application program, and taking the first sample characteristic information corresponding to the first sample identity as second characteristic information of the target application program.
4. The method according to claim 1, wherein the calculating the similarity between each piece of first feature information and the second feature information corresponding to each piece of first feature information respectively comprises:
respectively calculating the similarity between each piece of first characteristic information and second characteristic information corresponding to each piece of first characteristic information based on a similarity calculation formula;
the similarity calculation formula is as follows:
$$J(A, B) = \frac{|A \cap B|}{|A|}$$
wherein A denotes the second feature information, B denotes the first feature information, A ∩ B denotes the intersection of the first feature information and the second feature information, and J(A, B) denotes the similarity between the first feature information and the second feature information.
5. The method according to any one of claims 1 to 4, wherein the determining that the application program to be identified and the target application program are the same application program if the number of target similarities in the similarities is greater than a preset number threshold includes:
and if the number of the target similarities in the similarities is larger than a preset number threshold and the first signature information of the application program to be identified is the same as the second signature information of the target application program, determining that the application program to be identified and the target application program are the same application program.
6. The method according to any one of claims 1 to 4, further comprising:
if the number of the target similarities in the similarities is smaller than or equal to a preset number threshold, determining that the application program to be identified and the target application program are not the same application program;
and stopping running the application program to be identified and sending out warning information.
7. The method according to any one of claims 1 to 4, wherein the first characteristic information includes, but is not limited to, system component characteristics, rights characteristics, resource file characteristics.
8. An application recognition apparatus, the apparatus comprising:
the characteristic acquisition module is used for extracting first characteristic information of an application program to be identified and acquiring second characteristic information of a target application program corresponding to the application program to be identified;
the similarity calculation module is used for calculating the similarity between each piece of first characteristic information and the second characteristic information corresponding to each piece of first characteristic information;
and the comparison module is used for determining that the application program to be identified and the target application program are the same application program if the number of the target similarity in the similarities is larger than a preset number threshold, wherein the target similarity is larger than a preset similarity threshold.
9. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the method according to any of claims 1 to 7.
10. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor when executing the program implementing the steps of the method according to any of claims 1 to 7.
CN202110716221.9A 2021-06-25 2021-06-25 Application program identification method, device, storage medium and terminal Pending CN113343238A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110716221.9A CN113343238A (en) 2021-06-25 2021-06-25 Application program identification method, device, storage medium and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110716221.9A CN113343238A (en) 2021-06-25 2021-06-25 Application program identification method, device, storage medium and terminal

Publications (1)

Publication Number Publication Date
CN113343238A (en) 2021-09-03

Family

ID=77478997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110716221.9A Pending CN113343238A (en) 2021-06-25 2021-06-25 Application program identification method, device, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN113343238A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416212A (en) * 2018-03-01 2018-08-17 腾讯科技(深圳)有限公司 Method for identifying application program and device
CN109446753A (en) * 2018-09-10 2019-03-08 平安科技(深圳)有限公司 Detect method, apparatus, computer equipment and the storage medium of pirate application program
CN111460449A (en) * 2020-03-10 2020-07-28 北京邮电大学 Application program identification method, system, storage medium and electronic device

Similar Documents

Publication Publication Date Title
CN108804299B (en) Application program exception handling method and device
CN107391359B (en) Service testing method and device
CN108427731B (en) Page code processing method and device, terminal equipment and medium
CN108614970B (en) Virus program detection method, model training method, device and equipment
CN105357204B (en) Method and device for generating terminal identification information
CN109491733B (en) Interface display method based on visualization and related equipment
US10909232B2 (en) Terminal verification method, terminal device, and computer readable storage medium
CN112153582B (en) Verification code short message display method and device
CN111339531A (en) Malicious code detection method and device, storage medium and electronic equipment
CN110599581B (en) Image model data processing method and device and electronic equipment
CN112000884A (en) User content recommendation method and device, server and storage medium
CN111597553A (en) Process processing method, device, equipment and storage medium in virus searching and killing
US10915666B2 (en) Terminal verification method, terminal device, and computer readable storage medium
CN113918949A (en) Recognition method of fraud APP based on multi-mode fusion
CN108062401B (en) Application recommendation method and device and storage medium
CN111913743B (en) Data processing method and device
CN110503504B (en) Information identification method, device and equipment of network product
CN111722994A (en) Task request response method and device
CN113343238A (en) Application program identification method, device, storage medium and terminal
CN106648671B (en) Application upgrading method and terminal
CN109005469A (en) A kind of conversion method of message format, device, storage medium and android terminal
CN110262856B (en) Application program data acquisition method, device, terminal and storage medium
CN114240663A (en) Data reconciliation method, device, terminal and storage medium
CN114398994A (en) Method, device, equipment and medium for detecting business abnormity based on image identification
CN113760315A (en) Method and device for testing system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination