CN116346509B - Hard-coded credential detection method, system, device and readable storage medium - Google Patents

Hard-coded credential detection method, system, device and readable storage medium

Info

Publication number
CN116346509B
Authority
CN
China
Prior art keywords
character string
string data
string set
calculation result
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310636381.1A
Other languages
Chinese (zh)
Other versions
CN116346509A (en
Inventor
付杰
申晏键
靳岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Biling Technology Co ltd
Beijing Biling Technology Co ltd
Original Assignee
Shanghai Biling Technology Co ltd
Beijing Biling Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Biling Technology Co ltd, Beijing Biling Technology Co ltd
Priority to CN202310636381.1A
Publication of CN116346509A
Application granted
Publication of CN116346509B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/06: Network architectures or network communication protocols for network security for supporting key management in a packet data network
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40: Network security protocols
    • H04L2209/00: Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication (H04L9/00)
    • H04L2209/26: Testing cryptographic entity, e.g. testing integrity of encryption key or encryption algorithm

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Storage Device Security (AREA)

Abstract

The application relates to the technical field of hard-coded password detection and provides a hard-coded credential detection method, system, device and readable storage medium. The method comprises: obtaining a target file to be detected; determining a first character string set according to the target file to be detected; performing coarse screening processing on the first character string set to obtain a second character string set, wherein the second character string set is the character string set after sensitive first character string data have been screened out; calculating the Shannon entropy of each second character string data included in the second character string set to obtain a first calculation result; screening the second character string set according to the first calculation result to obtain a third character string set; and detecting each third character string data in the third character string set and judging whether the third character string data is a hard-coded credential, to obtain a first detection result.

Description

Hard-coded credential detection method, system, device and readable storage medium
Technical Field
The application relates to the technical field of hard-coded password detection, and in particular to a hard-coded credential detection method, system, device and readable storage medium.
Background
Internet developer services are trending toward forms such as cloud architectures, SaaS platforms and microservices, and the number and diversity of digital authentication credentials such as passwords are increasing rapidly. The rapid spread of more and more hard-coded secrets and credential information stored on public code hosting websites poses a great threat to the security of the software supply chain. Using automated methods to detect and protect keys in repositories is a security problem that urgently needs to be solved at a time when the hard-coding of credentials is a serious problem. A hard-coded credential detection method is therefore needed to detect hard-coded credentials accurately.
Disclosure of Invention
The present application aims to provide a hard-coded credential detection method, system, device and readable storage medium to address the above-mentioned problems.
In order to achieve the above object, the embodiment of the present application provides the following technical solutions:
In a first aspect, an embodiment of the present application provides a hard-coded credential detection method, where the method includes:
acquiring a target file to be detected;
determining a first character string set according to the target file to be detected, wherein the first character string set comprises at least two first character string data;
performing coarse screening processing on the first character string set to obtain a second character string set, wherein the second character string set is the character string set after sensitive first character string data have been screened out;
calculating the Shannon entropy of each second character string data included in the second character string set to obtain a first calculation result;
screening the second character string set according to the first calculation result to obtain a third character string set;
and detecting each third character string data in the third character string set, and judging whether the third character string data is a hard-coded credential, to obtain a first detection result.
In a second aspect, embodiments of the present application provide a hard-coded credential detection system, the system comprising:
the acquisition module is used for acquiring the target file to be detected;
the first processing module is used for determining a first character string set according to the target file to be detected, wherein the first character string set comprises at least two first character string data;
the second processing module is used for carrying out coarse screening processing on the first character string set to obtain a second character string set, wherein the second character string set is a character string set after sensitive first character string data are screened out;
the third processing module is used for calculating the Shannon entropy of each second character string data included in the second character string set to obtain a first calculation result;
the fourth processing module is used for screening the second character string set according to the first calculation result to obtain a third character string set;
and the detection module is used for detecting each third character string data in the third character string set and judging whether the third character string data is a hard-coded credential, to obtain a first detection result.
In a third aspect, embodiments of the present application provide a hard-coded credential detection device comprising a memory and a processor. The memory is used for storing a computer program; the processor is configured to implement the steps of the hard-coded credential detection method described above when executing the computer program.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the hard-coded credential detection method described above.
The beneficial effects of the application are as follows:
According to the method, the first character string set is determined from the target file to be detected; sensitive data in the first character string set is preliminarily screened out to obtain the second character string set; the Shannon entropy of each second character string data in the second character string set is calculated, and the second character string data which are obviously not hard-coded credentials are screened out to obtain the third character string set; and the character string data in the third character string set are then detected. In this way the detection precision for hard-coded credentials can be effectively improved.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a hard-coded credential detection method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of a hard-coded credential detection system according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of a hard-coded credential detection device according to an embodiment of the present application.
Reference numerals in the drawings: 901. acquisition module; 902. first processing module; 903. second processing module; 904. third processing module; 905. fourth processing module; 906. detection module; 9021. first acquisition unit; 9022. generating unit; 9023. first processing unit; 9024. second processing unit; 9025. third processing unit; 9051. second acquisition unit; 9052. fourth processing unit; 9061. fifth processing unit; 9062. sixth processing unit; 90621. seventh processing unit; 90622. eighth processing unit; 90623. ninth processing unit; 90624. tenth processing unit; 906241. eleventh processing unit; 906242. twelfth processing unit; 906243. thirteenth processing unit; 906244. fourteenth processing unit; 800. hard-coded credential detection device; 801. processor; 802. memory; 803. multimedia component; 804. I/O interface; 805. communication component.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Example 1:
This embodiment provides a hard-coded credential detection method. It can be understood that a scenario may be set for this embodiment, for example one in which an SD client program is packaged as a PC executable file and run on Linux, Windows and macOS systems on mainstream x86-64 machines to detect the leakage of credential information (hard-coded credentials).
Referring to fig. 1, the method includes a step S1, a step S2, a step S3, a step S4, a step S5, and a step S6, wherein the method specifically includes:
Step S1, acquiring a target file to be detected;
it can be understood that the target file to be detected is the file that is to be checked for leaked hard-coded credentials.
Step S2, determining a first character string set according to the target file to be detected, wherein the first character string set comprises at least two first character string data;
it can be understood that the step S2 further includes a step S21, a step S22, a step S23, a step S24, and a step S25, where specifically:
Step S21, acquiring preset key fields;
Step S22, generating key fields of the target file to be detected by using an Aho-Corasick (AC) automaton;
it will be appreciated that a keyword list of the target file may be generated using the AC automaton, the keyword list including the key fields of the target file.
Step S23, judging whether the preset key field matches the key field of the target file to be detected generated by the AC automaton, and obtaining first substring data;
it can be understood that whether the preset key field matches a key field in the keyword list is judged, and the matched character string data is used as the first sub-character string data.
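As a non-limiting illustration, the keyword matching of steps S21 to S23 can be sketched with a small Aho-Corasick automaton in Python. The function names `build_automaton` and `find_keywords` and the sample keywords are hypothetical and are not taken from the application; the sketch only shows how one pass over the text can report every preset key field that occurs in it.

```python
from collections import deque


def build_automaton(keywords):
    """Build a minimal Aho-Corasick automaton (trie plus failure links)."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # failure link of each state
    out = [set()]  # keywords that end at each state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)
    # Breadth-first pass: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def find_keywords(text, automaton):
    """Return (start_index, keyword) for every keyword occurrence in text."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


# Hypothetical preset key fields, used only for illustration.
automaton = build_automaton(["password", "token", "api_key"])
hits = find_keywords('db_password = "x"; api_key="y"', automaton)
```

The matched positions could then be used to extract the adjacent values as first sub-character string data; how the values are extracted is not specified here.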
Step S24, matching key fields of the target file by using a fuzzy key matching rule to obtain second sub-string data;
it can be understood that the key fields of the target file are matched by using the generics rule, and the successfully matched fields are then cleaned to obtain the second substring data. The cleaning process comprises: 1. removing whitespace characters such as line feeds and form feeds; 2. removing single or double quotation marks that appear as a pair at the beginning and end of the character string data. It should be noted that fuzzy matching with a generics rule is a technical scheme well known to those skilled in the art. Fuzzy matching the key fields with the generics rule effectively avoids the prior-art problem in which matching a field with a regular expression fails because the character string contains special characters.
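The two cleaning operations described above can be sketched as follows. This is a minimal illustration: the function name `clean_match` is hypothetical, and the exact set of whitespace characters removed is an assumption, since the text only names line feeds and form feeds as examples.

```python
import re


def clean_match(value: str) -> str:
    """Clean a fuzzily matched field value (hypothetical helper)."""
    # 1. Remove line feeds, carriage returns, form feeds, vertical tabs and
    #    tabs anywhere in the value, then trim surrounding spaces.
    #    Assumption: ordinary spaces inside the value are kept.
    value = re.sub(r"[\n\r\f\v\t]", "", value).strip()
    # 2. Strip quotation marks only when the same mark appears at both ends.
    while len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1]
    return value
```

For instance, `clean_match('"abc123"\n')` yields `abc123`, while a value with mismatched quotes at its two ends is left unchanged.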
Step S25, determining the first string set based on the first sub-string data and the second sub-string data.
It can be understood that establishing the first character string set from both the first sub-character string data and the second sub-character string data effectively ensures that the key fields in the target file to be detected are matched, which improves the matching precision and therefore the detection precision for hard-coded credentials.
Step S3, performing coarse screening processing on the first character string set to obtain a second character string set, wherein the second character string set is the character string set after sensitive first character string data are screened out;
it will be appreciated that this step makes an initial distinction between different kinds of hard-coded data: a developer may hard-code a mailbox (email) address, QQ number, mobile phone number, identity card information, bank card information, API key, token, etc. in the code. Because personal sensitive data only needs to be matched directly and requires no more complex subsequent judgment, in this step personal sensitive data such as mailbox addresses, QQ numbers, mobile phone numbers, identity card information and bank card information are preliminarily screened out by using sensitive-information rules for classifying personal privacy.
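A minimal sketch of such a coarse screening step is given below. The rule names and regular expressions are deliberately simplified assumptions for illustration (the application does not disclose its sensitive-information rules), and `coarse_screen` is a hypothetical function name.

```python
import re

# Simplified, hypothetical rules; real rules would be stricter and broader.
SENSITIVE_RULES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "cn_mobile": re.compile(r"1[3-9]\d{9}"),
    "cn_id_card": re.compile(r"\d{17}[\dXx]"),
}


def coarse_screen(strings):
    """Split strings into directly matched personal data and the remainder."""
    sensitive, remaining = [], []
    for s in strings:
        label = next(
            (name for name, rule in SENSITIVE_RULES.items() if rule.fullmatch(s)),
            None,
        )
        if label is not None:
            sensitive.append((label, s))  # reported directly, no further checks
        else:
            remaining.append(s)           # becomes part of the second string set
    return sensitive, remaining
```

Only the `remaining` list would go on to the entropy calculation; the directly matched items need no further judgment.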
Step S4, calculating the Shannon entropy of each second character string data included in the second character string set to obtain a first calculation result;
it will be appreciated that in the prior art, whether hard-coded data is a hard-coded credential is generally determined by calculating the Shannon entropy of the character string. However, because the Shannon entropy of a character string is influenced far more by its length than by its cryptographic characteristics, detection errors easily occur for credential data that is short but highly complex. Therefore, in this step the Shannon entropy of each second character string data is calculated in order to screen out character strings that are obviously not credential data, and not to decide whether a character string is credential data.
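For reference, a per-character Shannon entropy can be computed as sketched below. The application does not state which logarithm base or symbol alphabet it uses; base 2 over the characters of the string is assumed here, and the function name `shannon_entropy` is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of s in bits per character (base 2 is an assumption)."""
    if not s:
        return 0.0
    n = len(s)
    # H = -sum(p * log2(p)) over the relative frequency p of each character.
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())
```

Under this definition the entropy of a string of length n can never exceed log2(n), so a string of 8 distinct characters scores exactly 3.0 however random it is. This is one way to see why short credentials are easily misjudged by entropy alone.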
Step S5, screening the second character string set according to the first calculation result to obtain a third character string set;
it may be understood that step S5 further includes step S51 and step S52, where specific steps are:
step S51, acquiring a preset first threshold value;
it can be understood that the preset first threshold is 3.3; setting a relatively low first threshold effectively screens out the character strings which are obviously not credential data, which improves the detection speed for hard-coded credential data.
Step S52, judging the magnitude relation between the Shannon entropy of each second character string data included in the second character string set and the preset first threshold, wherein if the Shannon entropy of a second character string data is smaller than the preset first threshold, that second character string data is filtered out of the second character string set; and if the Shannon entropy of a second character string data is larger than the preset first threshold, that second character string data is retained in the second character string set, thereby obtaining the third character string set.
It can be understood that the third character string set no longer includes the character strings which are obviously not credential data; subsequent detection is still needed to determine whether the third character string data included in the third character string set is credential data, and filtering out the character strings which are obviously not credential data effectively improves the efficiency of the subsequent detection.
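Steps S51 and S52 amount to a simple threshold filter, sketched below under stated assumptions: entropy is computed per character in base 2, and a string whose entropy equals the threshold is dropped (the text only specifies the "smaller" and "larger" cases). The names `entropy_bits` and `filter_by_entropy` are hypothetical.

```python
import math
from collections import Counter

FIRST_THRESHOLD = 3.3  # preset first threshold given in the description


def entropy_bits(s: str) -> float:
    """Per-character Shannon entropy in bits (base 2 assumed)."""
    n = len(s)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def filter_by_entropy(second_set, threshold=FIRST_THRESHOLD):
    """Keep only strings whose entropy is larger than the threshold."""
    # Assumption: entropy exactly equal to the threshold is treated as filtered.
    return [s for s in second_set if entropy_bits(s) > threshold]


third_set = filter_by_entropy(["aaaaaaaaaaaa", "q7Zp2LmX9vRt4KbN"])
```

In this example only the second string survives: its 16 characters are all distinct, giving an entropy of 4.0 bits, while the first string has entropy 0.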
Step S6, detecting each third character string data in the third character string set, and judging whether the third character string data is a hard-coded credential, to obtain a first detection result.
It may be understood that step S6 further includes step S61 and step S62, where specific steps are:
step S61, calculating the password complexity of each third character string data in the third character string set to obtain a second calculation result;
it can be understood that the third character string data is identified to obtain an identification result, wherein the identification result comprises the character string length and the numbers of digit characters, special characters, uppercase characters and lowercase characters of the third character string. The password complexity score is then accumulated as follows: if the third character string data comprises any one of digit characters, special characters, uppercase characters and lowercase characters, one point is added; if the character string length is greater than 4 and the character string comprises at least two of digit characters, special characters, uppercase characters and lowercase characters, one point is added; if the character string length is greater than 6 and the character string comprises at least three of digit characters, special characters, uppercase characters and lowercase characters, two points are added; if the character string length is greater than 8 and the character string comprises digit characters, special characters, uppercase characters and lowercase characters, three points are added. For example, the password complexity of xc 21-! is 4 and that of @ xc234zry is 7.
If the third character string data consists entirely of uppercase characters, entirely of lowercase characters or entirely of digit characters, or if the character string is formed by repeating a substring 2 times, the password complexity is reduced by 1 point.
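One possible reading of the scoring rules above is sketched below. Several points are assumptions rather than statements of the application: the first rule is read as a single point for containing any character type at all, any non-alphanumeric character (including a space) counts as a special character, and the one-point deduction is applied at most once. The names `count_character_types` and `password_complexity` are hypothetical.

```python
def count_character_types(s: str) -> int:
    """Number of the four character types (digit, upper, lower, special) present."""
    has_digit = any(c.isdigit() for c in s)
    has_upper = any(c.isupper() for c in s)
    has_lower = any(c.islower() for c in s)
    has_special = any(not c.isalnum() for c in s)  # assumption: non-alphanumeric
    return sum((has_digit, has_upper, has_lower, has_special))


def password_complexity(s: str) -> int:
    """Score a string under one literal reading of the rules (a sketch)."""
    n = len(s)
    kinds = count_character_types(s)
    score = 0
    if kinds >= 1:
        score += 1
    if n > 4 and kinds >= 2:
        score += 1
    if n > 6 and kinds >= 3:
        score += 2
    if n > 8 and kinds == 4:
        score += 3
    # Deduction: a single character class throughout, or one substring
    # repeated exactly twice.
    uniform = n > 0 and (
        all(c.isdigit() for c in s)
        or all(c.isupper() for c in s)
        or all(c.islower() for c in s)
    )
    half = n // 2
    repeated = n >= 2 and n % 2 == 0 and s[:half] == s[half:]
    if uniform or repeated:
        score -= 1
    return score
```

Under this reading the maximum score is 7, reached by a string longer than 8 characters that contains all four character types and triggers no deduction. A different reading of the first rule (one point per character type present) would give different scores.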
Step S62, judging the magnitude relation between the second calculation result and a preset second threshold, wherein if the second calculation result is larger than the preset second threshold, the third character string data is judged to be a hard-coded credential, a first detection result is obtained, and the third character string data is filtered out of the third character string set; if the second calculation result is smaller than the preset second threshold, the third character string data is retained in the third character string set to obtain a fourth character string set, and the fourth character string data in the fourth character string set is detected to judge whether it is a hard-coded credential, obtaining a second detection result.
It can be understood that the step S62 further includes a step S621, a step S622, a step S623, and a step S624, where specifically:
step S621, calculating ASCII code values of each character in the fourth character string data to obtain a third calculation result;
it is understood that the calculation of the ASCII code value of each character in the fourth string data is a technical scheme well known to those skilled in the art, and will not be described herein.
Step S622, determining whether the adjacent characters in the fourth character string data have inflection points according to the third calculation result in sequence, so as to obtain the number of the inflection points of the fourth character string data;
it can be understood that whether the two adjacent characters have inflection points is sequentially determined from left to right until whether the inflection points exist between the last two adjacent characters of the fourth character string data are determined, so as to obtain the number of the inflection points of the fourth character string data.
Step S623, dividing the number of inflection points by the length of the fourth character string data to obtain a fourth calculation result;
it can be understood that, for example, the character string "abcdefg" rises throughout, so its number of inflection points is 0; the character string "gfedcba" falls throughout, so its number of inflection points is also 0; the fourth calculation result in both cases is 0. For "012435", 0 to 4 rises, 4 to 3 falls and 3 to 5 rises, so the 3 segments contain two inflection points and the fourth calculation result is 2/6 = 0.33. It should be noted that the fourth calculation result represents the continuity value of the character string. For the character string "123456", the ASCII values rise throughout without any inflection point, so the calculated value is 0, indicating that the character string is not a password. Continuity values are used to filter out character strings such as "0123156", "abcdefg." and "1234-5678", which programmers often write into code as simple test data by cycling through letters and digits. Because such data appears so often, the continuity value algorithm is designed to filter out this invalid data during credential detection in code.
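The inflection-point count and continuity value of steps S621 to S623 can be sketched as follows. The treatment of equal adjacent characters is not described in the text; in this sketch they are ignored, which is an assumption, and the function names are hypothetical.

```python
def inflection_count(s: str) -> int:
    """Count direction changes in the sequence of character code values."""
    codes = [ord(c) for c in s]
    # Direction of each adjacent pair: +1 rising, -1 falling, 0 equal.
    steps = [(b > a) - (b < a) for a, b in zip(codes, codes[1:])]
    steps = [d for d in steps if d != 0]  # assumption: equal neighbours ignored
    return sum(1 for prev, cur in zip(steps, steps[1:]) if prev != cur)


def continuity_value(s: str) -> float:
    """Number of inflection points divided by the string length."""
    return inflection_count(s) / len(s) if s else 0.0
```

With this sketch, `continuity_value("abcdefg")` and `continuity_value("gfedcba")` are both 0, and `continuity_value("012435")` is 2/6, matching the worked examples above.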
Step S624, judging the magnitude relation between the fourth calculation result and a preset third threshold, wherein if the fourth calculation result is larger than the preset third threshold, the fourth character string data is judged to be a hard-coded credential, a second detection result is obtained, and the fourth character string data is filtered out of the fourth character string set; if the fourth calculation result is smaller than the preset third threshold, the fourth character string data is retained in the fourth character string set to obtain a fifth character string set, and the fifth character string data in the fifth character string set is detected to obtain a third detection result.
It can be understood that the step S624 further includes a step S6241, a step S6242, a step S6243, and a step S6244, where specifically:
step S6241, screening the long character string data in the fifth character string set to obtain a sixth character string set;
step S6242, segmenting the sixth character string data in the sixth character string set by utilizing a preset regular rule to obtain at least two seventh character string data;
it can be understood that the sixth character string data is cut into a sequence of words, namely the seventh character string data, by using a regular-expression matching pattern, with the following rules: 1. from an uppercase letter to the last consecutive lowercase letter; 2. from a lowercase letter to the last consecutive lowercase letter; 3. from an uppercase letter to the last consecutive uppercase letter.
It should be noted that the sixth character string data is cut into multiple pieces of seventh character string data by using the regular matching pattern. If the number of seventh character string data is greater than six, the character string is considered to be a key; if the number of seventh character string data is less than six, human-readable identification is performed on the seventh character string data to judge whether it is a hard-coded credential. This is because, in the test data, programmers were rarely found to name a variable with more than 6 words. For example, a non-word character string such as miichaiba kbgqdlatrjrcogo 3 wojggghfhylugdway 9iR3fy4arWNA1KoS8kVw would be cut into 17 segments by the regular matching pattern, and such a character string is obviously not a variable name chosen by a programmer.
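The three segmentation rules can be approximated by a single regular expression, as sketched below. The pattern, the decision to skip digits and other non-letter characters, and the handling of exactly six segments (sent on to human-readable identification) are assumptions; `split_words` and `is_key_by_segment_count` are hypothetical names.

```python
import re

# Rule 1: uppercase letter followed by lowercase letters;
# rule 2: run of lowercase letters; rule 3: run of uppercase letters.
# Assumption: digits and symbols act as separators and are discarded.
WORD_PATTERN = re.compile(r"[A-Z][a-z]+|[a-z]+|[A-Z]+")


def split_words(s: str) -> list:
    """Cut a string into word-like segments."""
    return WORD_PATTERN.findall(s)


def is_key_by_segment_count(s: str, max_words: int = 6) -> bool:
    """True when the string splits into more segments than a variable name would."""
    return len(split_words(s)) > max_words
```

For example, `split_words("gitlabPersonalAccessToken")` gives four segments, so the string goes on to human-readable identification, whereas a mixed string such as `aB3dE5gH7jK9mN` splits into ten single-letter segments and is treated as a key.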
Step S6243, performing human readable recognition on the seventh character string data to obtain a recognition result, wherein the recognition result comprises the number of human recognizable words of the seventh character string data;
it can be understood that human-readable recognition is mainly used to filter out false positives, which are mostly names that programmers give to variables, such as SpiderWeibo and gitlabPersonalAccessToken. Specifically: a preset dictionary is acquired, and each seventh character string data is matched against the preset dictionary to obtain the recognition result.
Once the words have been transcribed, those longer than 2 characters are searched for and matched in the dictionary; if a word is 8 or more characters long, it is truncated to 4 characters for matching, so as to avoid lookup failures caused by, for example, long-word suffixes.
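A minimal sketch of the dictionary lookup follows. It assumes that words are lowercased before lookup, that a long word is truncated to its first 4 characters, and that the truncated form is matched as a prefix of dictionary entries; none of these details is spelled out in the text. The tiny dictionary and the name `count_recognized_words` are purely illustrative.

```python
def count_recognized_words(segments, dictionary) -> int:
    """Count segments that look like dictionary words (a sketch)."""
    recognized = 0
    for segment in segments:
        word = segment.lower()  # assumption: case-insensitive lookup
        if len(word) <= 2:
            continue  # only words longer than 2 characters are looked up
        if len(word) >= 8:
            # Assumption: keep the first 4 characters and match them as a
            # prefix, so suffixed forms of long words still count.
            prefix = word[:4]
            hit = any(entry.startswith(prefix) for entry in dictionary)
        else:
            hit = word in dictionary
        if hit:
            recognized += 1
    return recognized


# Hypothetical miniature dictionary, for illustration only.
DEMO_DICTIONARY = {"spider", "access", "token", "personal"}
```

With this dictionary, the segments of `gitlabPersonalAccessToken` yield 3 recognized words out of 4, and those of `SpiderWeibo` yield 1 out of 2.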
Step S6244, judging whether the sixth character string data is a hard-coded credential according to the identification result.
It can be understood that the word recognition rate of the seventh character string data is calculated from the number of human-recognizable words and the number of segments of the seventh character string data, and whether the sixth character string data is a hard-coded credential is judged according to the word recognition rate and a preset word recognition rate threshold. Specifically, different thresholds are set according to the number of seventh character string data obtained after segmentation: when the number is 1, the word recognition rate threshold is 0.9; when the number is 2, the threshold is 0.49; when the number is 3, the threshold is 0.68; when the number is 4, the threshold is 0.76; and when the number is 5, the threshold is 0.7. When the word recognition rate is larger than the preset word recognition rate threshold, the character string is judged to be human readable and not hard-coded credential data; when the word recognition rate is smaller than the preset word recognition rate threshold, the character string is judged to be unreadable by humans and to be hard-coded credential data. Setting different thresholds according to the number of seventh character string data effectively improves the detection precision for hard-coded credential data and reduces its false alarm rate.
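The threshold decision can be sketched as a table lookup. The word recognition rate is assumed here to be the number of recognized words divided by the number of segments; a rate exactly equal to the threshold is treated as human readable; and segment counts for which the text gives no threshold raise an error instead of being guessed. The function name is hypothetical.

```python
# Thresholds per number of segments, as listed in the description.
WORD_RATE_THRESHOLDS = {1: 0.9, 2: 0.49, 3: 0.68, 4: 0.76, 5: 0.7}


def is_hard_coded_credential(recognized_words: int, segment_count: int) -> bool:
    """Decide from the word recognition rate whether a string is a credential."""
    if segment_count > 6:
        return True  # more than six segments: considered a key outright
    threshold = WORD_RATE_THRESHOLDS.get(segment_count)
    if threshold is None:
        raise ValueError("no threshold is given for this segment count")
    # Assumption: rate = recognized words / number of segments.
    rate = recognized_words / segment_count
    return rate < threshold  # below threshold: not human readable
```

For example, a two-segment name with both words recognized has a rate of 1.0, above 0.49, and is treated as an ordinary variable name, while a three-segment string with one recognized word has a rate of about 0.33, below 0.68, and is reported as a hard-coded credential.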
Example 2:
as shown in fig. 2, the present embodiment provides a hard-coded credential detection system, which includes an acquisition module 901, a first processing module 902, a second processing module 903, a third processing module 904, a fourth processing module 905, and a detection module 906, where the specific steps are:
an acquisition module 901, configured to acquire a target file to be detected;
a first processing module 902, configured to determine a first string set according to the target file to be detected, where the first string set includes at least two first string data;
the second processing module 903 is configured to perform coarse screening processing on the first string set to obtain a second string set, where the second string set is a string set after screening out sensitive first string data;
a third processing module 904, configured to calculate the Shannon entropy of each second string data included in the second string set, to obtain a first calculation result;
a fourth processing module 905, configured to screen the second string set according to the first calculation result, to obtain a third string set;
and the detection module 906 is configured to detect each third string data in the third string set, determine whether the third string data is a hard-coded credential, and obtain a first detection result.
In a specific embodiment of the disclosure, the first processing module 902 further includes a first obtaining unit 9021, a generating unit 9022, a first processing unit 9023, a second processing unit 9024, and a third processing unit 9025, where specifically:
a first obtaining unit 9021, configured to obtain a preset key field;
a generating unit 9022, configured to generate key fields of the target file to be detected by using an Aho-Corasick (AC) automaton;
a first processing unit 9023, configured to determine whether the preset key field matches a key field of the target file to be detected generated by the AC automaton, to obtain first substring data;
the second processing unit 9024 is configured to match key fields of the target file using a fuzzy key matching rule to obtain second substring data;
a third processing unit 9025 is configured to determine the first string set based on the first sub-string data and the second sub-string data.
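The combination of exact and fuzzy key-field matching performed by units 9021 to 9025 can be illustrated with the following Python sketch. It does not implement an AC automaton; a plain set lookup and a substring test are swapped in to keep the example short. The preset keys, the assignment pattern, the fuzzy rule (the key name merely contains a preset key), and the function name are all assumptions made for illustration.

```python
import re

# Hypothetical preset key fields.
PRESET_KEYS = ("password", "secret", "token")

# Assumed pattern: key, then ':' or '=', then a quoted value.
ASSIGNMENT = re.compile(
    r"""(?P<key>[A-Za-z_][A-Za-z0-9_]*)\s*[:=]\s*["'](?P<value>[^"']+)["']"""
)


def extract_candidates(text):
    """Collect quoted values whose key matches exactly or fuzzily."""
    exact, fuzzy = [], []
    for match in ASSIGNMENT.finditer(text):
        key = match.group("key").lower()
        if key in PRESET_KEYS:
            # Exact match against a preset key field (first substring data).
            exact.append(match.group("value"))
        elif any(preset in key for preset in PRESET_KEYS):
            # Assumed fuzzy rule: key contains a preset key (second substring data).
            fuzzy.append(match.group("value"))
    # The first string set is built from both groups.
    return exact + fuzzy
```

A production scanner would replace the per-key substring test with a multi-pattern matcher such as an AC automaton so that many key fields can be searched in a single pass over the file.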
In a specific embodiment of the disclosure, the fourth processing module 905 includes a second obtaining unit 9051 and a fourth processing unit 9052, where specifically:
a second obtaining unit 9051, configured to obtain a preset first threshold value;
a fourth processing unit 9052, configured to determine a magnitude relation between the Shannon entropy of each second string data included in the second string set and the preset first threshold, where if the Shannon entropy of a second string data included in the second string set is smaller than the preset first threshold, the second string data is filtered out of the second string set; and if the Shannon entropy of a second string data included in the second string set is larger than the preset first threshold, the second string data is retained in the second string set, to obtain a third string set.
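The entropy-based screening can be sketched in Python as below. The entropy computation is the standard character-level Shannon entropy in bits; the default threshold of 2.5 and the function names are hypothetical, since this section does not state the value of the preset first threshold.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Character-level Shannon entropy of s, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    counts = Counter(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def filter_by_entropy(second_set, first_threshold=2.5):
    """Keep strings whose entropy exceeds the threshold (third string set).

    first_threshold=2.5 is an illustrative assumption, not a value
    given in the description.
    """
    return [s for s in second_set if shannon_entropy(s) > first_threshold]
```

A string made of one repeated character has entropy 0 and is dropped, while a string of eight distinct characters has entropy 3 bits and is retained under the assumed threshold.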
In a specific embodiment of the disclosure, the detection module 906 includes a fifth processing unit 9061 and a sixth processing unit 9062, where specific details are:
a fifth processing unit 9061, configured to calculate a password complexity of each third string data in the third string set, to obtain a second calculation result;
a sixth processing unit 9062, configured to determine a size relationship between the second calculation result and a preset second threshold, where if the second calculation result is greater than the preset second threshold, the third string data is determined to be a hard-coded credential, a first detection result is obtained, and the third string data is filtered out of the third string set; if the second calculation result is smaller than the preset second threshold, the third string data is retained in the third string set to obtain a fourth string set, the fourth string data in the fourth string set is detected, and whether the fourth string data is a hard-coded credential is determined, to obtain a second detection result.
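The split performed by units 9061 and 9062 can be sketched in Python as follows. This section does not define how password complexity is computed, so the sketch assumes a simple score equal to the number of character classes present (lowercase, uppercase, digit, punctuation). That score, the threshold of 3, and the function names are hypothetical.

```python
import string


def password_complexity(s):
    """Assumed score: how many of four character classes appear in s."""
    classes = (
        any(c.islower() for c in s),
        any(c.isupper() for c in s),
        any(c.isdigit() for c in s),
        any(c in string.punctuation for c in s),
    )
    return sum(classes)


def split_by_complexity(third_set, second_threshold=3):
    """Return (detected credentials, fourth string set).

    Strings scoring above the threshold are reported as hard-coded
    credentials; the rest are passed on for further detection.
    """
    detected, fourth_set = [], []
    for s in third_set:
        if password_complexity(s) > second_threshold:
            detected.append(s)
        else:
            fourth_set.append(s)
    return detected, fourth_set
```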
In a specific embodiment of the disclosure, the sixth processing unit 9062 further includes a seventh processing unit 90621, an eighth processing unit 90622, a ninth processing unit 90623, and a tenth processing unit 90624, where specific details are:
a seventh processing unit 90621, configured to calculate an ASCII code value of each character in the fourth string data, to obtain a third calculation result;
an eighth processing unit 90622, configured to sequentially determine, according to the third calculation result, whether there are inflection points in adjacent characters in the fourth string data, to obtain the number of inflection points in the fourth string data;
a ninth processing unit 90623, configured to divide the number of inflection points by the length of the fourth string data to obtain a fourth calculation result;
a tenth processing unit 90624, configured to determine a size relationship between the fourth calculation result and a preset third threshold, and determine that the fourth string data is a hard-coded credential if the fourth calculation result is greater than the preset third threshold, obtain a second detection result, and filter the fourth string data in the fourth string set; if the fourth calculation result is smaller than a preset third threshold value, the fourth character string data are reserved in the fourth character string set to obtain a fifth character string set, and the fifth character string data in the fifth character string set are detected to obtain a third detection result.
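The inflection-point check performed by units 90621 to 90624 can be sketched in Python as below. This section does not define an inflection point precisely; the sketch assumes it is a character at which the ASCII code sequence changes direction (a rise followed by a fall, or the reverse). That definition, the threshold of 0.5, and the function names are hypothetical.

```python
def inflection_ratio(s):
    """Number of assumed inflection points divided by the string length."""
    if not s:
        return 0.0
    codes = [ord(c) for c in s]  # third calculation result
    count = 0
    for i in range(1, len(codes) - 1):
        left = codes[i] - codes[i - 1]
        right = codes[i + 1] - codes[i]
        # A sign change between neighbouring differences marks a turn.
        if left * right < 0:
            count += 1
    return count / len(s)  # fourth calculation result


def split_by_inflection(fourth_set, third_threshold=0.5):
    """Return (detected credentials, fifth string set)."""
    detected, fifth_set = [], []
    for s in fourth_set:
        if inflection_ratio(s) > third_threshold:
            detected.append(s)
        else:
            fifth_set.append(s)
    return detected, fifth_set
```

Under this assumed definition, a monotonic string such as `abcdef` has no inflection points, whereas a zigzag string such as `azbycx` has four in six characters and would be flagged.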
In a specific embodiment of the disclosure, the tenth processing unit 90624 includes an eleventh processing unit 906241, a twelfth processing unit 906242, a thirteenth processing unit 906243, and a fourteenth processing unit 906244, where specific details are:
an eleventh processing unit 906241, configured to filter the long string data in the fifth string set to obtain a sixth string set;
a twelfth processing unit 906242, configured to segment sixth string data in the sixth string set by using a preset rule to obtain at least two seventh string data;
a thirteenth processing unit 906243, configured to perform human-readable recognition on the seventh string data to obtain a recognition result, where the recognition result includes the number of seventh string data that are human-recognizable words;
the fourteenth processing unit 906244 is configured to determine whether the sixth string data is a hard-coded credential according to the identification result.
It should be noted that, regarding the system in the above embodiment, the specific manner in which the respective modules perform the operations has been described in detail in the embodiment regarding the method, and will not be described in detail herein.
Example 3:
Corresponding to the above method embodiment, this embodiment further provides a hard-coded credential detection device. The hard-coded credential detection device described below and the hard-coded credential detection method described above may be referred to in correspondence with each other.
Fig. 3 is a block diagram illustrating a hard-coded credential detection device 800 according to an example embodiment. As shown in fig. 3, the hard-coded credential detection device 800 may include: a processor 801, a memory 802. The hard-coded credential detection device 800 can also include one or more of a multimedia component 803, an I/O interface 804, and a communication component 805.
The processor 801 is configured to control the overall operation of the hard-coded credential detection device 800 to perform all or part of the steps of the hard-coded credential detection method described above. The memory 802 is used to store various types of data to support the operation of the hard-coded credential detection device 800; the data may include, for example, instructions for any application or method operating on the hard-coded credential detection device 800, as well as application-related data such as contact data, messages, pictures, audio, and video. The memory 802 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The multimedia component 803 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals may be further stored in the memory 802 or transmitted through the communication component 805. The audio component further includes at least one speaker for outputting audio signals. The I/O interface 804 provides an interface between the processor 801 and other interface modules, which may be a keyboard, a mouse, buttons, and the like. These buttons may be virtual buttons or physical buttons.
The communication component 805 is configured to perform wired or wireless communication between the hard-coded credential detection device 800 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, or 4G, or a combination of one or more thereof; accordingly, the communication component 805 may include a Wi-Fi module, a Bluetooth module, and an NFC module.
In an exemplary embodiment, the hard-coded credential detection device 800 can be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for performing the hard-coded credential detection method described above.
In another exemplary embodiment, a computer readable storage medium is also provided comprising program instructions which, when executed by a processor, implement the steps of the hard-coded credential detection method described above. For example, the computer readable storage medium may be the memory 802 described above including program instructions executable by the processor 801 of the hard-coded credential detection device 800 to perform the hard-coded credential detection method described above.
Example 4:
Corresponding to the above method embodiment, this embodiment further provides a readable storage medium. The readable storage medium described below and the hard-coded credential detection method described above may be referred to in correspondence with each other.
A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the hard-coded credential detection method of the above-described method embodiments.
The readable storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other readable storage medium capable of storing program code.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (8)

1. A method for hard-coded credential detection, comprising:
acquiring a target file to be detected;
determining a first character string set according to the target file to be detected, wherein the first character string set comprises at least two first character string data;
performing coarse screening processing on the first character string set to obtain a second character string set, wherein the second character string set is a character string set after sensitive first character string data are screened out;
calculating the Shannon entropy of each second character string data included in the second character string set to obtain a first calculation result;
screening the second character string set according to the first calculation result to obtain a third character string set;
detecting each third character string data in the third character string set, and judging whether the third character string data is a hard-coded credential, to obtain a first detection result;
wherein the detecting each third character string data in the third character string set and judging whether the third character string data is a hard-coded credential to obtain a first detection result comprises:
calculating the password complexity of each third character string data in the third character string set to obtain a second calculation result;
judging the size relation between the second calculation result and a preset second threshold value, wherein if the second calculation result is larger than the preset second threshold value, the third character string data is judged to be a hard-coded credential, a first detection result is obtained, and the third character string data is filtered out of the third character string set; if the second calculation result is smaller than the preset second threshold value, the third character string data is retained in the third character string set to obtain a fourth character string set, the fourth character string data in the fourth character string set is detected, and whether the fourth character string data is a hard-coded credential is judged to obtain a second detection result.
2. The method for detecting hard-coded credentials according to claim 1, wherein the screening the second string set according to the first calculation result to obtain a third string set includes:
acquiring a preset first threshold value;
judging the magnitude relation between the Shannon entropy of each second character string data included in the second character string set and the preset first threshold value, wherein if the Shannon entropy of a second character string data included in the second character string set is smaller than the preset first threshold value, the second character string data is filtered out of the second character string set; and if the Shannon entropy of a second character string data included in the second character string set is larger than the preset first threshold value, the second character string data is retained in the second character string set, and a third character string set is obtained.
3. The method of claim 1, wherein judging whether the fourth character string data is a hard-coded credential to obtain a second detection result comprises:
calculating ASCII code values of each character in the fourth character string data to obtain a third calculation result;
sequentially determining whether the adjacent characters in the fourth character string data have inflection points according to the third calculation result to obtain the number of the inflection points of the fourth character string data;
dividing the number of inflection points by the length of the fourth character string data to obtain a fourth calculation result;
judging the size relation between the fourth calculation result and a preset third threshold value, wherein if the fourth calculation result is larger than the preset third threshold value, the fourth character string data is judged to be a hard-coded credential, a second detection result is obtained, and the fourth character string data is filtered out of the fourth character string set; if the fourth calculation result is smaller than the preset third threshold value, the fourth character string data is retained in the fourth character string set to obtain a fifth character string set, and the fifth character string data in the fifth character string set is detected to obtain a third detection result.
4. A hard-coded credential detection system, comprising:
the acquisition module is used for acquiring the target file to be detected;
the first processing module is used for determining a first character string set according to the target file to be detected, wherein the first character string set comprises at least two first character string data;
the second processing module is used for carrying out coarse screening processing on the first character string set to obtain a second character string set, wherein the second character string set is a character string set after sensitive first character string data are screened out;
the third processing module is used for calculating the Shannon entropy of each second character string data included in the second character string set to obtain a first calculation result;
the fourth processing module is used for screening the second character string set according to the first calculation result to obtain a third character string set;
the detection module is used for detecting each third character string data in the third character string set and judging whether the third character string data is a hard-coded credential, to obtain a first detection result;
wherein the detection module includes:
a fifth processing unit, configured to calculate a password complexity of each third string data in the third string set, to obtain a second calculation result;
a sixth processing unit, configured to determine a size relationship between the second calculation result and a preset second threshold, where if the second calculation result is greater than the preset second threshold, the third string data is determined to be a hard-coded credential, a first detection result is obtained, and the third string data is filtered out of the third string set; if the second calculation result is smaller than the preset second threshold, the third character string data is retained in the third character string set to obtain a fourth character string set, the fourth character string data in the fourth character string set is detected, and whether the fourth character string data is a hard-coded credential is judged to obtain a second detection result.
5. The hard-coded credential detection system of claim 4, wherein the fourth processing module comprises:
the second acquisition unit is used for acquiring a preset first threshold value;
a fourth processing unit, configured to determine a magnitude relation between the Shannon entropy of each second string data included in the second string set and the preset first threshold, where if the Shannon entropy of a second string data included in the second string set is smaller than the preset first threshold, the second string data is filtered out of the second string set; and if the Shannon entropy of a second character string data included in the second character string set is larger than the preset first threshold value, the second character string data is retained in the second character string set, and a third character string set is obtained.
6. The hard-coded credential detection system of claim 4, wherein the sixth processing unit comprises:
a seventh processing unit, configured to calculate an ASCII code value of each character in the fourth string data, to obtain a third calculation result;
an eighth processing unit, configured to sequentially determine whether inflection points exist in adjacent characters in the fourth string data according to the third calculation result, to obtain the number of inflection points of the fourth string data;
a ninth processing unit, configured to divide the number of inflection points by the length of the fourth string data to obtain a fourth calculation result;
a tenth processing unit, configured to determine a size relationship between the fourth calculation result and a preset third threshold, where if the fourth calculation result is greater than the preset third threshold, determine that the fourth string data is a hard-coded credential, obtain a second detection result, and filter the fourth string data in the fourth string set; if the fourth calculation result is smaller than a preset third threshold value, the fourth character string data are reserved in the fourth character string set to obtain a fifth character string set, and the fifth character string data in the fifth character string set are detected to obtain a third detection result.
7. A hard-coded credential detection device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the hard-coded credential detection method as claimed in any one of claims 1 to 3 when said computer program is executed.
8. A readable storage medium, characterized by: a computer program stored on a readable storage medium, which when executed by a processor, implements the steps of the hard-coded credential detection method as claimed in any one of claims 1 to 3.
CN202310636381.1A 2023-06-01 2023-06-01 Hard coding certificate detection method, system, equipment and readable storage medium Active CN116346509B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310636381.1A CN116346509B (en) 2023-06-01 2023-06-01 Hard coding certificate detection method, system, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310636381.1A CN116346509B (en) 2023-06-01 2023-06-01 Hard coding certificate detection method, system, equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN116346509A CN116346509A (en) 2023-06-27
CN116346509B true CN116346509B (en) 2023-08-15

Family

ID=86880855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310636381.1A Active CN116346509B (en) 2023-06-01 2023-06-01 Hard coding certificate detection method, system, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN116346509B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111399848A (en) * 2020-03-17 2020-07-10 北京百度网讯科技有限公司 Hard coded data detection method and device, electronic equipment and medium
CN111552640A (en) * 2020-04-24 2020-08-18 北京字节跳动网络技术有限公司 Code detection method, device, equipment and storage medium
CN114547590A (en) * 2020-11-25 2022-05-27 中国电信股份有限公司 Code detection method, device and non-transitory computer readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150178264A1 (en) * 2013-12-24 2015-06-25 Ca, Inc. Reporting the presence of hardcoded strings on a user interface (ui)
WO2020257973A1 (en) * 2019-06-24 2020-12-30 Citrix Systems, Inc. Detecting hard-coded strings in source code

Also Published As

Publication number Publication date
CN116346509A (en) 2023-06-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant