CN108647511B - Password strength evaluation method based on weak password derivation - Google Patents

Info

Publication number
CN108647511B
Authority
CN
China
Prior art keywords
password
weak
grammar
probability
weak password
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810324327.2A
Other languages
Chinese (zh)
Other versions
CN108647511A (en)
Inventor
何道敬
周贝贝
吴宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN201810324327.2A priority Critical patent/CN108647511B/en
Publication of CN108647511A publication Critical patent/CN108647511A/en
Application granted granted Critical
Publication of CN108647511B publication Critical patent/CN108647511B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/45Structures or tools for the administration of authentication
    • G06F21/46Structures or tools for the administration of authentication by designing passwords or checking the strength of passwords

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a password strength evaluation method based on weak password derivation, which comprises the following steps: 1) weak password set generation: ranking the passwords in the password samples in descending order of occurrence frequency and selecting the top-ranked passwords as the weak password set; 2) grammar training: parsing the passwords in the training set based on the weak password set to generate a probabilistic context-free grammar table with weak password labels; 3) password strength evaluation: inputting a password and calculating its probability according to the grammar table generated by grammar training, wherein the higher the probability value, the lower the strength of the password; 4) grammar table updating: dynamically adjusting the probability distribution of the probabilistic context-free grammar with weak password labels according to the input password. The method uses the existing probabilistic context-free grammar to derive passwords similar to those in the weak password set. It inherits the efficiency and robustness of the traditional password strength evaluation method while also screening out potential weak passwords, strengthening resistance to password guessing attacks and improving user security.

Description

Password strength evaluation method based on weak password derivation
Technical Field
The invention belongs to the technical field of information security, and particularly relates to a password strength evaluation method based on weak password derivation.
Background
The rapid development of internet technology has profoundly changed the way people learn, work and live. In recent years, information technology represented by the mobile internet and electronic commerce has made daily life far more convenient. The information security issues closely tied to the internet are also receiving more and more attention. Identity authentication is an important way to protect the security of user information and is widely applied across internet service sites.
Password authentication is the most widely used identity authentication method on the internet because it is convenient to deploy and flexible to use. Password-based authentication systems, however, suffer from a number of security and usability problems. In a password authentication system, the system requires the user to create a printable string (i.e., a password) and uses this string to verify the user's identity. Because human memory is limited, people find it difficult to remember complex, secure passwords and tend to use simple ones. The use of simple passwords makes the password authentication system vulnerable.
A good password strength evaluator should be able to characterize the similarity between weak passwords. For example, the password 123456 is a recognized weak password, and the passwords 123.456 and 123456 are highly similar, so there is good reason to believe that users are likely to construct the new password 123.456 from the password 123456.
However, the leading password strength evaluation method in academia, which is based on the PCFG algorithm, cannot make this judgment. The PCFG model classifies the characters of a user password into three categories, letters (L), digits (D) and special characters (S), and assumes that the user generates the password by concatenating segments of these categories.
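The limitation described above can be illustrated with a minimal sketch of L/D/S segmentation. The function names `char_class` and `pcfg_structure` are illustrative and do not come from the patent; the sketch only shows how a plain PCFG parse assigns 123456 and 123.456 unrelated structures.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to a base PCFG label: L (letter), D (digit), S (special)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def pcfg_structure(password: str) -> str:
    """Collapse runs of same-class characters into labels such as D3S1D3."""
    return "".join(
        f"{label}{len(list(run))}"
        for label, run in groupby(password, key=char_class)
    )


print(pcfg_structure("123456"))   # D6
print(pcfg_structure("123.456"))  # D3S1D3
```

Because D6 and D3S1D3 are separate structures, a plain PCFG has no way of linking 123.456 to the weak password 123456.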
Therefore, combining the traditional probabilistic context-free grammar with the weak passwords commonly used on websites is of great significance for evaluating weak passwords that existing password strength evaluators judge to be "robust" but that are actually unsafe.
Disclosure of Invention
The invention aims to remedy the defects of existing password strength evaluation methods. It combines the traditional probabilistic context-free grammar with a weak password set and provides a password strength evaluation method based on derivation from the weak password set. While inheriting the efficiency and robustness of the traditional password strength evaluation method, it can identify more weak passwords that are misjudged as "robust", thereby strengthening resistance to password guessing attacks and improving password security.
The specific technical scheme for realizing the purpose of the invention is as follows:
a password strength evaluation method based on weak password derivation comprises the following specific steps:
step 1: weak password set generation
Ranking the passwords in the password samples in descending order of occurrence frequency and selecting the top-ranked passwords as the weak password set;
step 2: grammar training
Parsing the passwords in the training set based on the weak password set to generate a probabilistic context-free grammar table with weak password labels;
step 3: password strength evaluation
Inputting a password and calculating its probability according to the grammar table generated by grammar training, wherein the higher the probability value, the lower the strength of the password;
step 4: grammar table update
Dynamically adjusting the probability distribution of the probabilistic context-free grammar with weak password labels according to the input password.
Step 2 of the invention specifically comprises the following steps:
step A1: weak password matching
Performing similarity matching between the passwords in the training set, or their substrings, and the passwords in the weak password set, in preparation for the subsequent password structure parsing;
if a substring of a password in the training set is successfully matched with a password in the weak password set, continuing the matching process on the remaining unmatched parts of the password until all substrings of the password have been matched once, and finally returning an optimal value sequence;
step A2: password structure parsing
First, labeling the optimal value sequence returned in step A1 with weak password labels; the remaining parts that cannot be matched are labeled with the original probabilistic context-free grammar labels until the whole password has been parsed;
step A3: grammar table generation
When all the passwords in the training set have been parsed, generating a probabilistic context-free grammar table with weak password labels;
wherein: the algorithm used for the similarity matching includes, but is not limited to, a BK-tree.
Step A1 specifically comprises the following steps:
step A11: setting an edit distance threshold and a similarity threshold;
step A12: obtaining all pairs consisting of a substring of the password to be parsed and a corresponding weak password whose edit distance is less than or equal to the edit distance threshold and whose similarity is greater than or equal to the similarity threshold;
step A13: from the pairs obtained in A12, keeping all string pairs with the minimum edit distance;
step A14: from the pairs obtained in A13, keeping all string pairs with the maximum similarity;
step A15: from the pairs obtained in A14, keeping all string pairs with the maximum weak password length;
step A16: if the set of string pairs obtained in A15 is empty, the matching between the password to be parsed and the passwords in the weak password set has failed; otherwise the matching has succeeded, and one string pair is randomly selected from the set and returned as the optimal solution.
The original probabilistic context-free grammar labels are divided into: digits, letters and special characters.
The probabilistic context-free grammar with weak password labels in step 2 of the present invention includes, but is not limited to, a non-terminal set, a start variable and a rule set.
The elements of the non-terminal set of the present invention include, but are not limited to: alphabetic characters, numeric characters, special characters, keyboard-continuous characters, insert operations, delete operations, replace operations, and weak password strings.
Step 4 of the present invention specifically includes:
step B1: determining, according to the input password, the structure whose frequency is to be increased by 1;
step B2: adding 1 to the total number of structures in the grammar table;
step B3: updating the probability of the structure in step B1;
step B4: updating the probabilities of the other structures in the grammar table in turn.
The method builds on an existing weak password set and combines it with the probabilistic context-free grammar method. More similar passwords are derived from the passwords in the weak password set, so that the probability of passwords similar to those in the weak password set can be calculated effectively. The method inherits the efficiency and robustness of the traditional password strength evaluation method, strengthens resistance to password guessing attacks, and improves the precision of password strength evaluation.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flowchart illustrating the matching of a password in a training set with a password in a weak password set according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following specific examples and the accompanying drawings. Except for the contents specifically mentioned below, the procedures, conditions, experimental methods and the like for carrying out the present invention are common general knowledge in the art, and the present invention is not particularly limited in these respects.
Examples
The technical terms in this example have the following meanings:
PCFG: Probabilistic Context-Free Grammar
W: weak password set
w: an element of W
Wn: a password of length n in the weak password set (Lmin ≤ n ≤ Lmax)
Lmax: maximum length of a password string that the target system accepts
Lmin: minimum length of a password string that the target system accepts
T: training set
OLCS (Optimal Longest Common Subsequence): optimal longest common subsequence algorithm
pw: password to be parsed
SUB: set of all substrings of pw
SUB × W: Cartesian product of SUB and W
sub: an element of SUB
DT: edit distance threshold
ST: similarity threshold
V = {Start, A, L, U, D, S, K, insert, delete, replace, no, W1, W2, ..., Wn}: the set of non-terminals, whose elements are called non-terminals
Σ = {95 printable ASCII characters}: the set of terminals, disjoint from V, whose elements are called terminals
Start: a subset of V, called the set of initial variables
P: the set of rules, whose elements are called rules and have the form α → β, where α is a non-terminal and β is composed of non-terminals and terminals
A: alphabetic character; An represents n consecutive alphabetic characters
L, U: letter character masks, where L represents lower-case letters and U represents upper-case letters
D: numeric character; Dn represents n consecutive numeric characters
S: special character; Sn represents n consecutive special characters
K: keyboard-continuous character; Kn represents n keyboard-continuous characters (n ≥ 4)
insert: insert operation on a password in the weak password set
delete: delete operation on a password in the weak password set
replace: general replace operation on a password in the weak password set
no: no operation on a password in the weak password set
Referring to fig. 1, the present embodiment includes the following steps:
step 1: weak password set generation
Ranking the passwords in the password samples in descending order of occurrence frequency and selecting the top-ranked passwords as the weak password set;
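Step 1 could be sketched as follows. The sample list and the cutoff `top_k` are hypothetical; the patent does not state how many top-ranked passwords are kept.

```python
from collections import Counter


def build_weak_password_set(samples, top_k):
    """Return the top_k most frequent passwords, most frequent first."""
    counts = Counter(samples)
    return [pw for pw, _ in counts.most_common(top_k)]


# Hypothetical password sample, e.g. drawn from a leaked-password corpus.
samples = ["123456", "password", "123456", "qwerty", "123456", "password"]
weak_set = build_weak_password_set(samples, top_k=2)
print(weak_set)  # ['123456', 'password']
```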
step 2: grammar training
Parsing the passwords in the training set based on the weak password set to generate a probabilistic context-free grammar table with weak password labels; the method specifically comprises the following steps:
step A1: weak password matching
Similarity matching is performed between each password pw in the training set, or its substrings, and the passwords in the weak password set W, in preparation for the subsequent password structure parsing.
If a substring sub of a password in the training set is successfully matched with a password in the weak password set, the matching process continues on the remaining unmatched part (pw - sub) of the password until all substrings of the password have been matched once, and finally an optimal value sequence (opt1, opt2, ..., optn) is returned.
Step A2: password structure resolution
First, each substring of pw is labeled according to the optimal values (opts) returned in step A1: if a substring sub of pw is matched with a password w of length n in W, sub is labeled Wn. The remaining substrings of pw that are not matched with any password in W are labeled with the L, D and S labels of the original probabilistic context-free grammar, until the whole password has been parsed.
Step A3: grammar table generation
When all the passwords in the training set have been parsed, a probabilistic context-free grammar table with weak password labels is generated.
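Step A3 can be sketched as a frequency count. This covers only the top-level structure rules (Start → structure). The list of parsed structures is hypothetical, and a full grammar table would also hold the terminal and edit-operation rules.

```python
from collections import Counter


def build_structure_table(structures):
    """Estimate P(Start -> structure) as relative frequency in the training set."""
    counts = Counter(structures)
    total = sum(counts.values())
    return {structure: count / total for structure, count in counts.items()}


# Hypothetical structures obtained after parsing five training passwords.
parsed = ["W6", "W6L3", "W6", "L8", "W6L3"]
table = build_structure_table(parsed)
for structure, prob in sorted(table.items()):
    print(structure, prob)
```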
The similarity matching algorithm used in step A1 includes, but is not limited to, a bk-tree.
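For readers unfamiliar with the BK-tree, the following is a minimal sketch of one built over Levenshtein edit distance. The class and method names are illustrative, the weak password list is hypothetical, and the patent does not specify how its own tree is built or queried.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


class BKTree:
    """Each node is (word, children), with children keyed by distance to word."""

    def __init__(self, words):
        it = iter(words)
        self.root = (next(it), {})  # assumes a non-empty word list
        for word in it:
            self.add(word)

    def add(self, word):
        node = self.root
        while True:
            d = edit_distance(word, node[0])
            if d == 0:
                return  # word already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, query, max_dist):
        """Return sorted (distance, word) pairs within max_dist of query."""
        results, stack = [], [self.root]
        while stack:
            word, children = stack.pop()
            d = edit_distance(query, word)
            if d <= max_dist:
                results.append((d, word))
            # Triangle inequality: only subtrees keyed in [d - max_dist, d + max_dist]
            # can contain a word within max_dist of the query.
            for key, child in children.items():
                if d - max_dist <= key <= d + max_dist:
                    stack.append(child)
        return sorted(results)


tree = BKTree(["123456", "password", "qwerty", "available"])
print(tree.search("123.456", 1))  # [(1, '123456')]
```

The pruning step is what makes the tree worthwhile for a large weak password set: most subtrees are skipped without computing any distances inside them.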
Referring to fig. 2, a distance function is used to determine the edit distance between two strings, a similarity function is used to determine the similarity between two strings, and a len function is used to determine the length of a string. The specific process of step A1, i.e. matching the password pw against the passwords in the weak password set W, is as follows:
step A11: setting an edit distance threshold DT and a similarity threshold ST;
step A12: obtaining all pairs (sub, w) of a substring of the password to be parsed and a corresponding weak password whose edit distance is less than or equal to DT and whose similarity is greater than or equal to ST, where (sub, w) ∈ SUB × W;
step A13: from the pairs obtained in A12, keeping all string pairs (sub, w) with the minimum edit distance;
step A14: from the pairs obtained in A13, keeping all string pairs (sub, w) with the maximum similarity;
step A15: from the pairs obtained in A14, keeping all string pairs (sub, w) with the maximum weak password length;
step A16: if the set of string pairs obtained in A15 is empty, the matching between the password pw and the passwords in the weak password set W has failed; otherwise the matching has succeeded, and one string pair (sub, w) is randomly selected from the set and returned as the optimal solution opt, where opt = (sub, w) and opt ∈ SUB × W.
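Steps A11 to A16 can be sketched as a cascade of filters. The source does not define the similarity function, so the normalized form `1 - distance / max(len)` used here is an assumption. The thresholds in the usage example and the name `best_match` are likewise hypothetical.

```python
import random


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed definition: 1 - edit distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def substrings(pw: str):
    """SUB: the set of all non-empty substrings of pw."""
    return {pw[i:j] for i in range(len(pw)) for j in range(i + 1, len(pw) + 1)}


def best_match(pw, weak_set, dt, st, rng=random):
    """Return the optimal pair (sub, w), or None if matching fails."""
    # A12: keep pairs within both thresholds, caching distance and similarity.
    scored = []
    for sub in substrings(pw):
        for w in weak_set:
            d, s = edit_distance(sub, w), similarity(sub, w)
            if d <= dt and s >= st:
                scored.append((d, s, sub, w))
    if not scored:
        return None  # A16: empty set means matching failed
    min_d = min(item[0] for item in scored)                  # A13
    scored = [item for item in scored if item[0] == min_d]
    max_s = max(item[1] for item in scored)                  # A14
    scored = [item for item in scored if item[1] == max_s]
    max_len = max(len(item[3]) for item in scored)           # A15
    scored = [item for item in scored if len(item[3]) == max_len]
    _, _, sub, w = rng.choice(scored)                        # A16: random tie-break
    return sub, w


print(best_match("123.456abc", ["123456", "password"], dt=1, st=0.8))
# ('123.456', '123456')
```

This brute-force version compares every substring with every weak password. A BK-tree or similar index would replace the inner loop when the weak password set is large.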
The original probabilistic context-free grammar labels are divided into: digits, letters and special characters.
The probabilistic context-free grammar G with weak password labels includes, but is not limited to, a non-terminal set, a start variable, and a rule set.
Elements of the non-terminal set include, but are not limited to: alphabetic characters, numeric characters, special characters, keyboard-continuous characters, insert operations, delete operations, replace operations, and weak password strings.
For example, suppose the password avai^lable123 ∈ T and available ∈ W. Parsed directly with the original PCFG method, the password structure is L4S1L5D3. With weak password matching, the substring avai^lable is most similar to the weak password available (smallest edit distance, largest similarity, and longest match), so (avai^lable, available) is taken as the optimal value. Because the optimal value sequence contains only this one pair, (avai^lable, available) is returned directly.
Step 3: password strength evaluation
Inputting a password and calculating its probability according to the grammar table generated by grammar training, wherein the higher the probability value, the lower the strength of the password;
For example, suppose the entered password is 123.456 and 123456 ∈ W, the probability of Start → W6 is 0.28, the probability of W6 → 123456 is 0.4, the probability of W6 → insert is 0.3, the probability of insert → S1 is 0.11, and the probability of S1 → . is 0.52. Then 123.456 is identified as 123456 (structure W6) with a special character "." inserted. The probability of the password 123.456 is therefore:
$$P(123.456) = P(\mathrm{Start} \to W_6) \times P(W_6 \to 123456) \times P(W_6 \to \mathrm{insert}) \times P(\mathrm{insert} \to S_1) \times P(S_1 \to \text{.})$$
$$= 0.28 \times 0.4 \times 0.3 \times 0.11 \times 0.52 \approx 0.00192$$
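The arithmetic above can be checked with a short sketch. The rule probabilities are the ones given in the example. Storing the grammar table as a dictionary keyed by (left-hand side, right-hand side) is an assumption made for illustration.

```python
from math import prod

# Rule probabilities taken from the worked example.
rule_probabilities = {
    ("Start", "W6"): 0.28,
    ("W6", "123456"): 0.4,
    ("W6", "insert"): 0.3,
    ("insert", "S1"): 0.11,
    ("S1", "."): 0.52,
}

# Derivation of 123.456: weak password 123456 with one special character inserted.
derivation = [
    ("Start", "W6"),
    ("W6", "123456"),
    ("W6", "insert"),
    ("insert", "S1"),
    ("S1", "."),
]


def derivation_probability(rules, table):
    """Multiply the probabilities of every rule used in the derivation."""
    return prod(table[rule] for rule in rules)


p = derivation_probability(derivation, rule_probabilities)
print(f"{p:.5f}")  # 0.00192
```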
Step 4: grammar table update
Dynamically adjusting the probability distribution of the probabilistic context-free grammar with weak password labels according to the input password; the method specifically comprises the following steps:
step B1: determining, according to the input password, the structure whose frequency is to be increased by 1;
step B2: adding 1 to the total number of structures in the grammar table;
step B3: updating the probability of the structure in step B1;
step B4: updating the probabilities of the other structures in the grammar table in turn.
Let the grammar table contain n structures in total, and let N be the total number of occurrences of all structures. The probability of the i-th structure is $P_i = f_i / N$, where $f_i$ is the frequency of occurrence of the i-th structure. When a new password is registered, the frequency of its corresponding structure i is increased by 1, and the total number N is also increased by 1.
The probability of the i-th structure is updated to
$$P_i' = \frac{f_i + 1}{N + 1} \tag{1}$$
The probabilities of the other structures are updated to
$$P_j' = \frac{f_j}{N + 1}, \quad j \neq i \tag{2}$$
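Equations (1) and (2) amount to incrementing one count and renormalizing. A minimal sketch follows, using hypothetical structure counts; the dictionary representation is an assumption.

```python
def update_structure_counts(counts, structure):
    """Apply steps B1-B4: bump one structure's count and recompute all probabilities."""
    counts[structure] = counts.get(structure, 0) + 1   # B1: f_i + 1
    total = sum(counts.values())                       # B2: N + 1
    # B3 and B4: P_i' = (f_i + 1) / (N + 1), P_j' = f_j / (N + 1) for j != i
    return {s: c / total for s, c in counts.items()}


# Hypothetical counts before the update, so N = 10.
counts = {"W6L3": 3, "L8": 5, "D6": 2}
probs = update_structure_counts(counts, "W6L3")
print(probs["W6L3"])  # 4/11
print(probs["L8"])    # 5/11
```

Since every probability shares the denominator N + 1, registering one password lowers the probability of every structure except the one it matches.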
For example, when the user registers the password 123.456abc, the portion 123.456 is determined to be similar to the weak password 123456 and is labeled W6. The remaining part abc, which has no corresponding weak password to match, is labeled L3 according to the PCFG segmentation method, so the password 123.456abc is identified and labeled W6L3.
The probabilities of the items associated with the password, namely the structures W6L3, L3 and W6, the terminal string 123.456abc, and the rules W6 → insert, insert → . and W6 → 123456, are updated according to equation (1), and the probabilities of the other structures are adjusted accordingly according to equation (2).
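The labeling in this example can be sketched as follows, assuming the matching step has already returned the pair (123.456, 123456). The helper names are illustrative. The sketch handles a single matched substring, whereas the described method repeats matching on the remaining parts.

```python
from itertools import groupby


def base_label(ch: str) -> str:
    """Base PCFG label for a character: L (letter), D (digit), S (special)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def lds_structure(segment: str) -> str:
    """Label an unmatched segment with L/D/S run labels, e.g. abc -> L3."""
    return "".join(
        f"{label}{len(list(run))}" for label, run in groupby(segment, key=base_label)
    )


def label_password(pw: str, matched_sub: str, weak_pw: str) -> str:
    """Label matched_sub as W<len(weak_pw)> and the rest of pw with L/D/S labels."""
    if not matched_sub or matched_sub not in pw:
        return lds_structure(pw)  # no weak-password match: plain PCFG structure
    start = pw.index(matched_sub)
    end = start + len(matched_sub)
    return lds_structure(pw[:start]) + f"W{len(weak_pw)}" + lds_structure(pw[end:])


print(label_password("123.456abc", "123.456", "123456"))  # W6L3
```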
The scope of protection of the present invention is not limited to the above embodiments. Variations and advantages that may occur to those skilled in the art, without departing from the spirit and scope of the inventive concept, are included in the invention, and the scope of protection is defined by the appended claims.

Claims (5)

1. A password strength evaluation method based on weak password derivation, characterized by comprising the following specific steps:
step 1: weak password set generation
Ranking the passwords in the password samples in descending order of occurrence frequency and selecting the top-ranked passwords as a weak password set;
step 2: grammar training
Parsing the passwords in the training set based on the weak password set to generate a probabilistic context-free grammar table with weak password labels;
step 3: password strength evaluation
Inputting a password and calculating its probability according to the grammar table generated by grammar training, wherein the higher the probability value, the lower the strength of the password;
step 4: grammar table update
Dynamically adjusting the probability distribution of the probabilistic context-free grammar with weak password labels according to the input password; wherein:
step 2 specifically comprises:
step A1: weak password matching
Performing similarity matching between the passwords in the training set, or their substrings, and the passwords in the weak password set, in preparation for the subsequent password structure parsing;
if a substring of a password in the training set is successfully matched with a password in the weak password set, continuing the matching process on the remaining unmatched parts of the password until all substrings of the password have been matched once, and finally returning an optimal value sequence;
step A2: password structure parsing
First, labeling the optimal value sequence returned in step A1 with weak password labels; the remaining parts that cannot be matched are labeled with the original probabilistic context-free grammar labels until the whole password has been parsed;
step A3: grammar table generation
When all the passwords in the training set have been parsed, generating a probabilistic context-free grammar table with weak password labels;
wherein: the algorithms used for similarity matching include, but are not limited to, a BK-tree;
step A1 specifically comprises the following steps:
step A11: setting an edit distance threshold and a similarity threshold;
step A12: obtaining all pairs consisting of a substring of the password to be parsed and a corresponding weak password whose edit distance is less than or equal to the edit distance threshold and whose similarity is greater than or equal to the similarity threshold;
step A13: from the pairs obtained in A12, keeping all string pairs with the minimum edit distance;
step A14: from the pairs obtained in A13, keeping all string pairs with the maximum similarity;
step A15: from the pairs obtained in A14, keeping all string pairs with the maximum weak password length;
step A16: if the set of string pairs obtained in A15 is empty, the matching between the password to be parsed and the passwords in the weak password set has failed; otherwise the matching has succeeded, and one string pair is randomly selected from the set and returned as the optimal solution.
2. The password strength evaluation method based on weak password derivation as claimed in claim 1, wherein the original probabilistic context-free grammar labels are divided into: digits, letters and special characters.
3. The password strength evaluation method based on weak password derivation as claimed in claim 1, wherein the probabilistic context-free grammar with weak password labels in step 2 includes, but is not limited to, a non-terminal set, a start variable and a rule set.
4. The password strength evaluation method based on weak password derivation as claimed in claim 3, wherein the elements of the non-terminal set include, but are not limited to: alphabetic characters, numeric characters, special characters, keyboard-continuous characters, insert operations, delete operations, replace operations, and weak password strings.
5. The password strength evaluation method based on weak password derivation as claimed in claim 1, wherein said step 4 specifically comprises:
step B1: determining, according to the input password, the structure whose frequency is to be increased by 1;
step B2: adding 1 to the total number of structures in the grammar table;
step B3: updating the probability of the structure in step B1;
step B4: updating the probabilities of the other structures in the grammar table in turn.
CN201810324327.2A 2018-04-12 2018-04-12 Password strength evaluation method based on weak password derivation Active CN108647511B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810324327.2A CN108647511B (en) 2018-04-12 2018-04-12 Password strength evaluation method based on weak password derivation

Publications (2)

Publication Number Publication Date
CN108647511A CN108647511A (en) 2018-10-12
CN108647511B true CN108647511B (en) 2022-04-05

Family

ID=63746260

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810324327.2A Active CN108647511B (en) 2018-04-12 2018-04-12 Password strength evaluation method based on weak password derivation

Country Status (1)

Country Link
CN (1) CN108647511B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344604B (en) * 2018-10-23 2020-12-25 杭州安恒信息技术股份有限公司 Method and system for judging password risk of user based on user habit
CN110336921B (en) * 2019-07-09 2021-01-15 华中师范大学 Android graph password strength measurement method and system
US11625477B2 (en) * 2020-08-13 2023-04-11 Capital One Services, Llc Automated password generation
CN112199214B (en) * 2020-10-13 2023-12-01 中国科学院信息工程研究所 Candidate password generation and application cracking method on GPU
CN112632526B (en) * 2021-01-07 2022-04-12 复旦大学 User password modeling and strength evaluation method based on comprehensive segmentation

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106570391A (en) * 2016-11-10 2017-04-19 中国科学院信息工程研究所 Memory block based password guessing set generation method and memory block based digital password cracking method
CN106803035A (en) * 2016-11-30 2017-06-06 中国科学院信息工程研究所 A kind of password conjecture set creation method and password cracking method based on username information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
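Password strength evaluation method based on probabilistic context-free grammar; Chen Ying et al.; Internet of Things Technologies (《物联网技术》); 2017-04-30 (No. 4); pp. 59-61 *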

Also Published As

Publication number Publication date
CN108647511A (en) 2018-10-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant