CN109726554B

CN109726554B - Malicious program detection method and device

Info

Publication number: CN109726554B
Application number: CN201711037144.4A
Authority: CN
Inventors: 高坤; 邰靖宇; 刘宇豪; 潘宣辰; 马志远
Original assignee: Wuhan Antiy Information Technology Co ltd
Current assignee: Wuhan Antiy Information Technology Co ltd
Priority date: 2017-10-30
Filing date: 2017-10-30
Publication date: 2021-05-18
Anticipated expiration: 2037-10-30
Also published as: CN109726554A

Abstract

Embodiments of the invention provide a method, a device and related applications for detecting a malicious program. The method comprises: acquiring at least one character string from a preset position of a program to be detected; performing randomness calculation on the character string according to a predefined rule to generate a random value of the program to be detected; and, when the random value of the program to be detected is greater than a preset threshold, judging that the program to be detected is a malicious program. The technical scheme of the invention can accurately and effectively identify malicious code that uses random character strings to counter security software, solves the problem of extreme feature library expansion caused by the reliance of traditional detection on a feature library, is efficient, does not depend on specific features, and can effectively detect automatically generated malicious programs.

Description

Malicious program detection method and device

Technical Field

The invention belongs to the field of program detection, and particularly relates to a method and a device for detecting a malicious program.

Background

With the rapid development of the mobile internet in recent years, platform security problems have grown by the day. The Android platform is the most prominent case: beneath the appearance of a prosperous ecosystem lies a black-market industry chain driven by huge profits. As the Android ecosystem grows, the related black-market industry chain becomes more rampant, and the number of viruses on the Android platform grows almost exponentially.

Traditional malicious program detection relies mainly on a feature library. The feature library is composed of feature codes of malicious program samples collected by vendors, where a feature code can be understood as a characteristic found in malicious programs that distinguishes them from normal software. During detection, the engine reads a file and matches it against all the feature codes in the feature library; if the file's program code hits a feature code, the program can be judged to be malicious.

For example, the patent of Beijing Qihoo Technology Co., Ltd., "A virus APK identification method and device" (application number: 201210076889.2, publication number: 102663286B), uses opcodes and class and function names as features. To evade detection, attackers apply obfuscation, which turns the corresponding features into random character strings. A large number of random character strings thus replace the original function names and cause the feature library to expand further; the larger the feature library, the lower the matching efficiency, and the traditional feature library ultimately fails.

Disclosure of Invention

In view of the above problems, the present invention is proposed to provide a malicious program detection method and apparatus that overcome the above problems.

In a first aspect, an embodiment of the present invention provides a method for detecting a malicious program, including:

acquiring at least one character string of a preset position of a program to be detected;

performing randomness calculation on the character string according to a predefined rule to generate a random value of the program to be detected;

and when the random value of the program to be detected is greater than the preset threshold value, judging that the program to be detected is a malicious program.

Further, the method for presetting the threshold value comprises the following steps:

predefining two character string sets, namely a non-random character string set and a malicious random character string set, and performing randomness calculation on all character strings in the two sets according to the predefined rule, wherein the minimum random value in the non-random character string set is a first random value and the maximum random value in the malicious random character string set is a second random value; when the first random value is less than the second random value, the threshold is the second random value or the average of the first random value and the second random value.

Further, the method for presetting the threshold value further comprises the following steps:

predefining a normal random character string set, and performing randomness calculation on all character strings in that set according to the predefined rule, wherein the maximum random value is a third random value, and when the third random value is larger than the first random value and smaller than the second random value, the threshold is the third random value.

Further, acquiring at least one character string from the preset position of the program to be detected comprises: extracting a character string from at least one of the package name, the signature, the program name, the version number, the file name and the file content of the program to be detected.

Further, the method for performing randomness calculation on the character string according to the predefined rule includes: an N-Gram algorithm and an information entropy algorithm.

Further, acquiring at least one character string from the preset position of the program to be detected comprises: using the line feed character as a delimiter, so that each line of characters is one character string.

Further, the characters are at least one of English, numbers, symbols or a mixture thereof.

Further, when the program to be detected is judged to be a malicious program, adding at least one acquired character string at the preset position of the program to be detected into the malicious random character string set.

In a second aspect, an embodiment of the present invention provides an apparatus for detecting a malicious program, including:

the acquisition module is used for acquiring at least one character string at a preset position of the program to be detected;

the random value calculation module is used for carrying out randomness calculation on the character string according to a predefined rule to generate a random value of the program to be detected;

and the comparison and judgment module is used for judging the program to be detected as a malicious program when the random value of the program to be detected is greater than a preset threshold value.

In a third aspect, an embodiment of the present invention provides a device for detecting a malicious program, including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

In a fourth aspect, embodiments of the present invention provide a non-transitory computer-readable storage medium, where instructions of the storage medium, when executed by a processor of a mobile terminal, enable the mobile terminal to perform a method for detecting a malicious program as described above.

The technical scheme provided by the embodiment of the invention has the beneficial effects that at least:

The embodiments of the invention provide a method and a device for detecting a malicious program, which acquire at least one character string from a preset position of a program to be detected, perform randomness calculation on the character string according to a predefined rule to generate a random value of the program to be detected, and, when the random value of the program to be detected is greater than a preset threshold, judge that the program to be detected is a malicious program. The technical scheme of the invention can accurately and effectively identify malicious code that uses random character strings to counter security software, solves the problem of extreme feature library expansion caused by the reliance of traditional detection on a feature library, is efficient, does not depend on specific features, and can effectively detect automatically generated malicious programs.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:

FIG. 1 is a flowchart of a method for detecting a malicious program according to an embodiment of the present invention;

FIG. 2A is a flow chart of threshold generation provided by an embodiment of the present invention;

FIG. 2B is a flow chart of another threshold generation provided by an embodiment of the present invention;

FIG. 3 is a diagram illustrating random values of character string sets according to an embodiment of the present invention;

FIG. 4 is a flowchart illustrating a process of calculating randomness of a character string according to an embodiment of the present invention;

FIG. 5 is a block diagram of a malicious program detection apparatus according to an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.

According to practical experience, the signature, package name and similar attributes of a normal application need to remain stable so that the application can be updated and iterated normally. Malicious code, by contrast, tends to use random character strings to prevent security vendors from detecting it by using the signature, package name and the like as features. In addition, the signature and package name of a normal application often carry personal information about the author or company, which conforms to the statistical regularities of text.

According to the statistical regularities of text (letter frequency analysis), in any written language different letters or combinations of letters appear with different frequencies, and any piece of text written in that language has approximately the same characteristic letter distribution. For example, in English the letter E appears frequently while X appears rarely. Similarly, the combinations ST, NG, TH and QU occur very frequently, while NZ and QJ occur very rarely.

Therefore, a non-random character string should satisfy this regularity, whereas a random character string generally does not. Accordingly, the random value of a non-random character string should not be high, while the random value of a random character string will be high.

Based on the above theory, in the present application a malicious character string is always a random character string. A non-malicious (normal) character string may be random or non-random, and if it is random, its randomness is not higher than that of a malicious character string.

A program for detecting randomness distinguishes only random character strings from non-random character strings. In describing the technical scheme of the invention, random character strings comprise normal random character strings (such as Eye two a went bar) and malicious random character strings (such as gffvdugghjguysertftyfy), and a non-random character string can be understood as a character string with normal semantics (such as I went to opacity).

The embodiment of the invention provides a method for detecting a malicious program, which comprises the following steps of S101 to S103:

S101, acquiring at least one character string from a preset position of a program to be detected.

To counter security vendors, malicious programs or applications usually evade detection by means such as packing and obfuscation, so that detection approaches which use fixed character strings such as class names, package names, method names and string constants as feature codes lose their former generality. At the same time, to conceal typical traits of the malicious code author, such as the signature and package name, malicious files such as Android Package (APK) files increasingly use random character strings, for example random package names and signatures, to avoid detection by security software, which leaves security vendors extremely passive in the face of a rapidly expanding feature library. Conventional developers, by contrast, keep their products stable and consistent at the level of signature, package name, code and so on. Therefore, character strings are preferably extracted from positions such as the package name, signature, program name, version number, file name, or file content of the program to be detected. To improve detection accuracy, character strings may be taken from as many of these positions as possible.

For example, a decompressed APK (which may be regarded as a zip-format file) contains:

1. a dex file, in dex format;

2. an arsc file, in arsc format;

3. an xml file, in xml format;

4. files in other formats.

The content of these files can be acquired, as can information about the APK package such as the version number, name, package name, signature and icon. The source code of the program or application may also be obtained through decompilation or other means, and character strings may be obtained from the source code. The embodiments of the present disclosure do not limit the manner in which the character string is obtained.
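As a minimal sketch of the file-name part of this step: because an APK can be read as a zip archive, Python's standard `zipfile` module can list its entries. The helper names `entry_names` and `build_demo_archive`, and the entry names in the demo archive, are illustrative assumptions rather than part of the described method. Reading the package name, signature or version number would need an APK-aware parser, which is not shown.

```python
import io
import zipfile
from typing import List


def entry_names(apk_bytes: bytes) -> List[str]:
    """Return the file names stored in a zip-format archive such as an APK."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as archive:
        return archive.namelist()


def build_demo_archive() -> bytes:
    """Build a tiny in-memory zip so the sketch runs without a real APK."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("classes.dex", b"")
        archive.writestr("assets/qzxkvjw.dat", b"")
    return buffer.getvalue()


names = entry_names(build_demo_archive())
```

Each returned name is one candidate character string for the randomness calculation.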

In this embodiment, the characters may be English letters, numbers, symbols, or a mixture thereof, for example:

"abcdefghijklmnopqrsttuvwxyz 012345679 ═ @ -" or "acegikmoqssuwy", or a character string ordered from small to large by ASCII code, and the like, which is not limited in the embodiments of the present disclosure.

The line feed character is used as a delimiter, and each line of characters is one character string. For example, the signature of an APP can be regarded as comprising a single character string:

CN=JieLv,OU=HangzhouFeiniu Science&Technology Co.Ltd.,O=HangzhouFeiniu Science&TechnologyCo.Ltd.,L=HangZhou,ST=ZheJiang,C=86

An article that is N lines long is regarded as N character strings.
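The line-based rule can be sketched as follows. The function name `split_into_strings` is hypothetical, and dropping empty lines is an assumption the text does not address. Note that `str.splitlines` also splits on a few other line boundary characters besides the line feed.

```python
from typing import List


def split_into_strings(text: str) -> List[str]:
    """Treat each non-empty line of the input as one character string."""
    return [line for line in text.splitlines() if line]


signature = "CN=JieLv,L=HangZhou,ST=ZheJiang,C=86"   # shortened single-line signature
article = "first line\nsecond line\nthird line"

signature_strings = split_into_strings(signature)   # one string
article_strings = split_into_strings(article)       # three strings
```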

S102, performing randomness calculation on the character string according to a predefined rule to generate a random value of the program to be detected.

The method for performing randomness calculation on the character string includes an N-Gram algorithm, an information entropy algorithm, and the like, which is not limited in this embodiment.
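The text names an information entropy algorithm without giving a formula. One plausible reading, shown here purely as an assumption, is the Shannon entropy of the string's character distribution; the function name `shannon_entropy` is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


uniform = shannon_entropy("abcd")    # four equally likely characters
constant = shannon_entropy("aaaa")   # a single repeated character
```

This measure ignores character order and depends on string length, so it captures a different notion of randomness from the N-Gram frequency approach.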

S103, when the random value of the program to be detected is larger than a preset threshold value, judging that the program to be detected is a malicious program.
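The decision in S103 is a strict comparison. The sketch below also shows one way to combine several per-string values into a single value for the program; taking the maximum is an assumption, since the text does not say how multiple strings are aggregated. Both function names are hypothetical.

```python
def program_random_value(string_values):
    """Combine per-string random values; using the maximum is an assumption."""
    return max(string_values)


def is_malicious(random_value, threshold):
    """S103: flag the program only when the value strictly exceeds the threshold."""
    return random_value > threshold


verdict = is_malicious(program_random_value([1.2, 4.8, 2.0]), threshold=4.0)
```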

There are various methods for presetting the threshold, for example:

Method one:

S1031, predefining two character string sets from a plurality of character strings with known attributes, namely a non-random character string set and a malicious random character string set.

S1032, performing randomness calculation on all the character strings in the two character string sets according to the same rule as in S102, wherein the minimum random value in the non-random character string set is a first random value and the maximum random value in the malicious random character string set is a second random value. The threshold may be the second random value, which gives high accuracy, or the average of the first random value and the second random value, which is efficient and covers the random values of most character strings. The average may be calculated by a simple arithmetic mean, a weighted arithmetic mean, a moving average, an exponentially smoothed average, or the like, which is not limited in the embodiments of the present disclosure.

With reference to FIG. 3, it can be understood that if the first random value is greater than the second random value, the non-random character string set and the malicious random character string set used as the training set are not representative and cannot be used.
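Method one can be sketched as below. The function name `threshold_method_one`, the `use_average` flag and the toy random values are assumptions for illustration; the simple arithmetic mean is used as one of the permitted averages, and any scoring rule could be passed as `score`.

```python
def threshold_method_one(non_random, malicious_random, score, use_average=False):
    """Derive the threshold from two labelled string sets (S1031 and S1032)."""
    first = min(score(s) for s in non_random)          # first random value
    second = max(score(s) for s in malicious_random)   # second random value
    if not first < second:
        # Assumption: equality is rejected along with first > second.
        raise ValueError("training sets are not representative")
    return (first + second) / 2 if use_average else second


# Hypothetical precomputed random values standing in for a real scoring rule.
toy_scores = {"hello": 1.0, "world": 2.0, "xqzjv": 5.0, "kkqwz": 7.0}
normal, malicious = ["hello", "world"], ["xqzjv", "kkqwz"]

strict = threshold_method_one(normal, malicious, toy_scores.get)
averaged = threshold_method_one(normal, malicious, toy_scores.get, use_average=True)
```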

Method two:

S1031', predefining three character string sets from a plurality of character strings with known attributes, namely a non-random character string set, a malicious random character string set and a normal random character string set.

S1032', performing randomness calculation on all the character strings in the three character string sets according to the predefined rule, wherein the minimum random value in the non-random character string set is a first random value, the maximum random value in the malicious random character string set is a second random value, the maximum random value in the normal random character string set is a third random value, and the third random value is the threshold.

Of course, to improve detection accuracy, the above method is preferably combined with other detection means.

Referring to FIG. 3, it can be understood that this method can be used only when the third random value is greater than the first random value and less than the second random value; otherwise, the training set is not representative and is not suitable for use.
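Method two, including the validity condition on the three values, might look like the following. The function name `threshold_method_two` and the toy values are hypothetical.

```python
def threshold_method_two(non_random, malicious_random, normal_random, score):
    """Use the largest value of the normal random set as the threshold."""
    first = min(score(s) for s in non_random)          # first random value
    second = max(score(s) for s in malicious_random)   # second random value
    third = max(score(s) for s in normal_random)       # third random value
    if not first < third < second:
        raise ValueError("training sets are not representative")
    return third


# Hypothetical precomputed random values standing in for a real scoring rule.
toy_scores = {"hello": 1.0, "world": 2.0, "a1b2": 3.5, "xqzjv": 5.0, "kkqwz": 7.0}
threshold = threshold_method_two(
    ["hello", "world"], ["xqzjv", "kkqwz"], ["a1b2"], toy_scores.get
)
```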

The established character string sets serve well as learning data and are processed efficiently. The technical scheme of the invention does not use a feature library: by comparing the random value of the program to be detected with the preset threshold, it effectively identifies malicious code that uses random character strings to counter security software, and it solves the problem of extreme feature library expansion caused by traditional detection. It is efficient, does not depend on specific features, and can effectively detect automatically generated malicious programs. Depending on the detection result, the user can be prompted to scan, remove or uninstall the application, or the application can be isolated directly.

The technical scheme of the invention can detect the software installed on a computer and also can detect the application installed on a mobile phone, and the programs mentioned in the embodiment of the disclosure include but are not limited to the software and the application installed on various terminals.

In one embodiment, the randomness calculation on the character strings in step S101 of FIG. 1 and in FIGS. 2A and 2B is performed by using an N-Gram algorithm and, as shown in FIG. 4, includes the following steps:

S201, segmenting the character string to obtain all N-character segments corresponding to the character string, where N is a positive integer;

S202, matching all N-character segments of the character string against a preset feature array to look up their occurrence frequencies, thereby obtaining a frequency array corresponding to the character string, wherein the frequency array contains the frequency corresponding to each N-character segment of the character string;

S203, calculating the average value of the frequency array, and computing the exponent of the constant e from the average value to obtain the random value corresponding to the character string.

The feature array preset in step S202 may be generated as follows:

performing pattern matching calculation on a preset ordered character string to generate the feature array, wherein the feature array contains the occurrence frequency of each N-character segment in the ordered character string.
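Steps S201 to S203 can be sketched as follows, reading S203 as solving e^x = average for x, which is an interpretation of the translated wording. The function names, the tiny reference text and the `floor` count used for segments missing from the feature array are all assumptions. As computed here, x rises with the average frequency; how the value is oriented so that a higher random value means a more random string is not spelled out in the text, so the sketch returns x unchanged.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """S201: return every overlapping n-character segment of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_feature_array(reference_text, n=2):
    """Count how often each n-character segment occurs in the reference text."""
    return Counter(ngrams(reference_text, n))


def random_value(text, feature_array, n=2, floor=1):
    """S202 and S203: look up frequencies, average them, solve e**x = average."""
    segments = ngrams(text, n)
    if not segments:
        raise ValueError("text is shorter than n characters")
    # Assumption: segments missing from the feature array get the count `floor`.
    frequencies = [feature_array.get(segment, floor) for segment in segments]
    average = sum(frequencies) / len(frequencies)
    return math.log(average)  # the x that satisfies e**x == average


features = build_feature_array("the cat sat on the mat")
seen = random_value("the", features)     # both segments occur in the reference
unseen = random_value("qzx", features)   # neither segment occurs in the reference
```

A production feature array would be built from a large body of normal text rather than a single short sentence.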

When the N-Gram algorithm is adopted, 2-Gram calculation may be used, for example, to obtain the occurrence frequency of each pair of adjacent characters, and these frequencies are collected to generate the feature array. 3-Gram may also be used for pattern matching. In theory, provided the character string is long enough, a larger N is better because more information is taken into account. In practice, however, a large N easily produces sparse data, the law of large numbers no longer holds, and the calculated probabilities are distorted. A large N also makes the parameter space too large, giving rise to the curse of dimensionality, so that the method cannot be used in practice. Assuming a vocabulary (the set of distinct symbols) of size 100,000, the number of parameters of the N-Gram model is 100,000^N, and with so many parameters the memory required for calculation is insufficient. In a specific implementation, 2-Gram is sufficient to solve the problem, 3-Gram is generally not used, and N greater than or equal to 4 is rarer still. The value of N is not limited in the embodiments of the present disclosure.
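To make the growth concrete, the parameter count can be read as the number of distinct symbols raised to the power N. The short sketch below evaluates it for the figure of 100,000 quoted above; the function name `ngram_parameter_count` is hypothetical.

```python
def ngram_parameter_count(symbol_count: int, n: int) -> int:
    """Number of possible n-grams over a set of symbol_count distinct symbols."""
    return symbol_count ** n


two_gram = ngram_parameter_count(100_000, 2)     # 10**10 possible pairs
three_gram = ngram_parameter_count(100_000, 3)   # 10**15 possible triples
```

Each step up in N multiplies the table size by the number of symbols, which is why small values of N are preferred.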

For example, taking the characters contained in "abcdefghijklmnopqrstuvwxyz 012345679 ═ @ -" as the statistical reference and taking 2-Gram as an example, the number of occurrences of each pair of adjacent characters (this is the meaning of the 2 in "2-Gram segmentation") is counted over a large number of articles with normal semantics, for example aa, ab, ac, …, ba, bb, bc, …, and so on. Referring to Table 1 below, the value at row a, column a is 31, meaning that "aa" occurs 31 times in the statistics; the value at row b, column c is 168, meaning that "bc" occurs 168 times; and so on. The set of frequencies is then recorded to generate the feature array. This essentially gives the probability with which two adjacent characters should appear under normal semantics. The calculation using 3-Gram segmentation is similar to that of 2-Gram. The segmentation and frequency results are shown in Table 1.

Table 1

	a	b	c	……	@	-	(space)
a	31	7910	16166	……	10	336	26888
b	5708	429	168	……	10	55	642
c	17916	10	3090	……	10	40	3023
……	……	……	……	……	……	……	……
@	10	10	10	……	10	10	10
-	1058	468	605	……	10	6049	58
(space)	119739	45051	41880	……	10	55	34700

As another example, in the sentence "this is a dog", "th" appears 1 time, "hi" appears 1 time, "is" appears 2 times, "s_" appears 2 times, "_i" appears 1 time, "_a" appears 1 time, "a_" appears 1 time, "_d" appears 1 time, "do" appears 1 time, and "og" appears 1 time. Here the underscore "_" represents a space, which is also treated as a character when counting frequencies. The frequency of each of these two-character segments is looked up in Table 1, the average of the values obtained is calculated, and the exponent of the constant e is then computed from that average to obtain the random value corresponding to the character string "this is a dog". See Formula One:

e^x = N    (Formula One)

where the constant e is approximately 2.71828, N is the average value, and x, the exponent, is the random value.

In this embodiment, the exponent of the constant e is computed from the frequency average so that small changes in frequency yield a clearly observable output. Other approaches may also be adopted; for example, the frequency average may be used directly as the random value, in which case the resulting values are relatively large, but the idea is the same as in this scheme. The embodiments of the present disclosure do not limit this.
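The 2-Gram segmentation of "this is a dog" described above can be reproduced with a few lines of Python. Writing spaces as underscores follows the notation of the example; the variable names are illustrative.

```python
from collections import Counter

sentence = "this is a dog"
# Every pair of adjacent characters, with spaces shown as underscores.
pairs = [sentence[i:i + 2].replace(" ", "_") for i in range(len(sentence) - 1)]
counts = Counter(pairs)
```

The 13-character sentence yields 12 overlapping pairs and 10 distinct ones, with "is" and "s_" each occurring twice.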

In one embodiment, when the program to be detected is judged to be a malicious program, at least one acquired character string of the preset position of the program to be detected is added into a malicious random character string set. Therefore, the malicious sample library can be expanded to improve the accuracy of subsequent malicious program detection.
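This feedback step can be sketched as below. The function name `record_detection`, the sample package name and the numeric values are hypothetical; recomputing the threshold from the enlarged set would be a separate step and is not shown.

```python
def record_detection(extracted_strings, random_value, threshold, malicious_set):
    """Judge one program and, if malicious, add its strings to the malicious set."""
    flagged = random_value > threshold
    if flagged:
        malicious_set.update(extracted_strings)
    return flagged


malicious_set = {"gffvdugghjguysertftyfy"}
flagged = record_detection(["com.qzxkvjw.pwmzt"], 6.1, 4.0, malicious_set)
```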

Based on the same inventive concept, an embodiment of the present invention further provides a device for detecting a malicious program. Because the principle by which the device solves the problem is similar to that of the method for detecting a malicious program in the foregoing embodiment, the implementation of the device may refer to the implementation of the method, and repeated details are omitted.

The following is a device for detecting a malicious program according to an embodiment of the present invention, which can be used to execute the embodiment of the method for detecting a malicious program.

Referring to fig. 5, the apparatus includes:

an obtaining module 41, configured to obtain at least one character string of a preset position of a program to be detected;

a random value calculation module 42, configured to perform randomness calculation on the character string according to a predefined rule, so as to generate a random value of the program to be detected;

and the comparison and judgment module 43 is configured to judge that the program to be detected is a malicious program when the random value of the program to be detected is greater than the preset threshold.

In one embodiment, as shown in FIG. 4, the apparatus further includes a threshold calculation module 44, configured to predefine two character string sets, namely a non-random character string set and a malicious random character string set, and to perform randomness calculation on all character strings in the two sets according to the predefined rule, where the minimum random value in the non-random character string set is a first random value and the maximum random value in the malicious random character string set is a second random value. The threshold may be set to the second random value or to the average of the first random value and the second random value.

In one embodiment, referring to FIG. 4, the threshold calculation module 44 is further configured to predefine a normal random character string set and to perform randomness calculation on all character strings in that set according to the predefined rule, where the maximum random value is a third random value and the threshold is the third random value.

In one embodiment, the acquisition module 41 acquires at least one character string from the preset position of the program to be detected by extracting a character string from at least one of the package name, the signature, the program name, the version number, the file name and the file content of the program to be detected. The characters are at least one of English letters, numbers and symbols. The line feed character is used as a delimiter, and each line of characters is one character string.

In one embodiment, the random value calculation module 42 and/or the threshold calculation module 44 performs randomness calculation on the character string according to an N-Gram algorithm, an information entropy algorithm, or the like.

In one embodiment, the random value calculation module 42 or the threshold calculation module 44 is configured to perform randomness calculation on the character string by: segmenting the character string to obtain all N-character segments corresponding to the character string, where N is a positive integer; matching all N-character segments of the character string against a preset feature array to look up their occurrence frequencies, thereby obtaining a frequency array corresponding to the character string, wherein the frequency array contains the frequency corresponding to each N-character segment of the character string; and calculating the average value of the frequency array and computing the exponent of the constant e from the average value to obtain the random value corresponding to the character string.

The preset feature array is generated in the following way:

performing pattern matching calculation on a preset ordered character string to generate the feature array, wherein the feature array contains the occurrence frequency of each N-character segment in the ordered character string.

In one embodiment, when the comparison and judgment module 43 judges that the program to be detected is a malicious program, at least one character string of the preset position of the program to be detected acquired by the acquisition module 41 is added to the malicious random character string set of the threshold calculation module 44.
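The division into modules 41 to 44 can be pictured as a small composition of callables. This is only a structural sketch: the `Detector` class, its field names and the toy stand-ins (plain text as the "program", longest line length as the score) are assumptions chosen so the example runs, not the scoring rule of the embodiments.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Detector:
    acquire: Callable[[str], List[str]]    # acquisition module 41
    score: Callable[[List[str]], float]    # random value calculation module 42
    threshold: float                       # supplied by threshold calculation module 44

    def judge(self, program: str) -> bool:
        """Comparison and judgment module 43."""
        return self.score(self.acquire(program)) > self.threshold


# Toy stand-ins: the "program" is plain text and the score is its longest line length.
detector = Detector(
    acquire=lambda program: program.splitlines(),
    score=lambda strings: float(max(len(s) for s in strings)),
    threshold=10.0,
)

short_result = detector.judge("short\nname")
long_result = detector.judge("gffvdugghjguysertftyfy")
```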

According to a third aspect of the embodiments of the present disclosure, an embodiment of the present disclosure provides a device for detecting a malicious program, including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method for detecting a malicious program, comprising:

acquiring at least one character string at a preset position of a program to be detected;

performing randomness calculation on the character string according to a predefined rule to generate a random value of the program to be detected, wherein the randomness calculation performed on the character string according to the predefined rule uses an N-Gram algorithm, and the N-Gram algorithm comprises the following steps:

segmenting the character string to obtain all N-character segments of the character string, where N is a positive integer;

matching all N-character segments of the character string against a preset feature array to look up their occurrence frequencies, thereby obtaining a frequency array corresponding to the character string, wherein the frequency array comprises the frequency corresponding to each N-character segment of the character string;

calculating the average value of the frequency array, and raising the constant e to the power of that average value to obtain a random value corresponding to the character string;

when the random value of the program to be detected is larger than a preset threshold value, judging the program to be detected as a malicious program;

wherein the preset threshold value is determined as follows: two character string sets are predefined, namely a non-random character string set and a malicious random character string set, and the randomness calculation is performed on all character strings in both sets according to the predefined rule, wherein the minimum random value in the non-random character string set is a first random value and the maximum random value in the malicious random character string set is a second random value; when the first random value is less than the second random value, the threshold value is either the second random value or the average of the first random value and the second random value; determining the preset threshold value further comprises: predefining a normal random character string set and performing the randomness calculation on all character strings in that set according to the predefined rule, wherein the maximum random value is a third random value, and when the third random value is greater than the first random value and less than the second random value, the threshold value is the third random value;

and when the program to be detected is judged to be a malicious program, adding the at least one character string acquired at the preset position of the program to be detected to the malicious random character string set.
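The N-Gram steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `ngrams`, `random_value`, and `feature_table` are hypothetical, the preset feature array is assumed to be a dictionary mapping each N-character segment to a frequency value, and segments absent from that dictionary are assumed to contribute a default of 0.0. How the frequency values are scaled is not specified in the claim, so the sketch simply uses whatever values the table holds.

```python
import math


def ngrams(s, n):
    """Return all contiguous n-character segments of s."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def random_value(s, feature_table, n=2, default=0.0):
    """Compute e raised to the mean frequency of the n-grams of s.

    feature_table is an assumed stand-in for the preset feature array:
    a dict mapping n-gram -> frequency value. Unknown n-grams use
    `default`, which is an assumption of this sketch.
    """
    segments = ngrams(s, n)
    if not segments:
        raise ValueError("string is shorter than n")
    # Frequency array: one looked-up frequency per n-gram.
    freqs = [feature_table.get(seg, default) for seg in segments]
    mean = sum(freqs) / len(freqs)
    # The random value is e raised to the mean of the frequency array.
    return math.exp(mean)
```

A program would then be judged malicious when the value returned by `random_value` for one of its extracted strings exceeds the preset threshold.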

2. The method of claim 1, wherein acquiring the at least one character string at the preset position of the program to be detected comprises: extracting a character string from at least one of the package name, the signature, the program name, the version number, the file name, and the file content of the program to be detected.

3. The method of claim 1, wherein acquiring the at least one character string at the preset position of the program to be detected comprises: using the line feed character as a delimiter, such that each line of characters is one character string.
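The line-based extraction of claim 3 might look like the following Python sketch. The function name `extract_strings` is hypothetical, and skipping blank lines is an assumption of this sketch rather than something the claim states. Note that `str.splitlines` also splits on other line boundaries such as `\r\n`, which is slightly broader than a bare line feed.

```python
def extract_strings(text):
    """Split text into one candidate string per line.

    Blank lines are dropped (an assumption of this sketch).
    """
    return [line for line in text.splitlines() if line.strip()]
```

Each returned string could then be scored independently by the randomness calculation.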

4. The method of claim 1, wherein the characters are English letters, numerals, symbols, or a mixture thereof.

5. An apparatus for detecting a malicious program, comprising:

an acquisition module configured to acquire at least one character string at a preset position of a program to be detected;

a random value calculation module configured to perform randomness calculation on the character string according to a predefined rule to generate a random value of the program to be detected, wherein the randomness calculation performed by the random value calculation module and by the threshold calculation module according to the predefined rule uses an N-Gram algorithm; wherein the N-Gram algorithm comprises the following steps:

a comparison and judgment module configured to judge the program to be detected as a malicious program when the random value of the program to be detected is greater than a preset threshold value;

the apparatus further comprises a threshold calculation module configured to predefine two character string sets, namely a non-random character string set and a malicious random character string set, and to perform the randomness calculation on all character strings in both sets according to the predefined rule, wherein the minimum random value in the non-random character string set is a first random value and the maximum random value in the malicious random character string set is a second random value; when the first random value is less than the second random value, the threshold value is either the second random value or the average of the first random value and the second random value; the threshold calculation module is further configured to predefine a normal random character string set and to perform the randomness calculation on all character strings in that set according to the predefined rule, wherein the maximum random value is a third random value, and the threshold value is the third random value;

and wherein, when the comparison and judgment module judges that the program to be detected is a malicious program, the at least one character string at the preset position of the program to be detected, acquired by the acquisition module, is added to the malicious random character string set of the threshold calculation module.
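The threshold selection described for the threshold calculation module can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function `choose_threshold` and its parameters are hypothetical, it takes already-computed random values rather than strings, and it applies the third random value only when that value lies strictly between the first and second random values, which is the condition recited in claim 1. The claims do not say what happens when the first random value is not less than the second, so the sketch raises an error in that case.

```python
def choose_threshold(non_random, malicious_random, normal_random=None,
                     use_average=False):
    """Pick a threshold from random values of the predefined string sets.

    non_random:       random values of the non-random string set
    malicious_random: random values of the malicious random string set
    normal_random:    optional random values of the normal random string set
    use_average:      choose the average of the first and second random
                      values instead of the second random value
    """
    first = min(non_random)         # first random value
    second = max(malicious_random)  # second random value
    if not first < second:
        # Case not covered by the claims; rejecting it is an assumption.
        raise ValueError("first random value must be less than the second")
    if normal_random:
        third = max(normal_random)  # third random value
        if first < third < second:
            return third
    return (first + second) / 2 if use_average else second
```

When a program is judged malicious, its extracted strings would be added to the malicious random set, so the random values passed as `malicious_random` (and therefore the threshold) can change as detection proceeds.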

6. The apparatus of claim 5, wherein the acquisition module acquires the at least one character string at the preset position of the program to be detected by: extracting a character string from at least one of the package name, the signature, the program name, the version number, the file name, and the file content of the program to be detected.

7. The apparatus of claim 5, wherein the acquisition module acquires the at least one character string at the preset position of the program to be detected by: using the line feed character as a delimiter, such that each line of characters is one character string.

8. The apparatus of claim 5, wherein the characters are at least one of English letters, numerals, and symbols.

9. A non-transitory computer-readable storage medium, wherein instructions stored in the storage medium, when executed by a processor of a mobile terminal, enable the mobile terminal to perform the method for detecting a malicious program according to claim 1.