CN113886784A - Password guessing method for improving guessing efficiency of small training set based on corpus - Google Patents


Info

Publication number
CN113886784A
Authority
CN
China
Prior art keywords
password
corpus
guessing
rule
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111478071.9A
Other languages
Chinese (zh)
Other versions
CN113886784B (en)
Inventor
甘晓春
陈猛
陈虎
李东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202111478071.9A
Publication of CN113886784A
Application granted
Publication of CN113886784B
Active legal-status
Anticipated expiration legal-status

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/45 Structures or tools for the administration of authentication

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a corpus-based password guessing method for improving guessing efficiency with a small training set, and relates to the technical field of data processing and prediction. The method comprises the following steps: constructing a corpus Γ; based on the corpus Γ, generating training results from the training password set PWD_TRAIN, namely the probability q(r) of each rule r in the password guessing rule set R and the probability p(w) of each word w in Γ; generating a dictionary D(S) with S guesses according to the training results and the corpus Γ; and measuring the rate at which D(S) cracks the test password set PWD_TEST. By using the corpus Γ to expand the vocabulary found in the training set PWD_TRAIN, the invention can effectively improve the cracking rate on the test password set when the training set is small.

Description

Password guessing method for improving guessing efficiency of small training set based on corpus
Technical Field
The invention relates to the technical field of data processing and prediction, in particular to a password guessing method for improving guessing efficiency of a small training set based on a corpus.
Background
The basic approach to password guessing is to try passwords that the user may have chosen until the correct password is found, or until a predetermined number of guesses is reached and the attempt is abandoned. To improve guessing efficiency, passwords that users are more likely to choose must therefore be guessed first. Existing password guessing methods mainly include brute force, dictionary transformation, Markov models, and probabilistic context-free grammars (PCFG).
Brute force is the most traditional password guessing method; its main drawback is that only short passwords can be guessed. Because the total number of guesses is limited, brute-force guessing over the full keyboard character set rarely exceeds 9 characters, and brute-force guessing over lowercase letters and digits only rarely exceeds 11 characters.
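The length limits above follow from how quickly the search space grows. The sketch below is only an illustration: it assumes the "full keyboard" means the 95 printable ASCII characters (including the space), which the text does not state, and `keyspace` is a hypothetical helper name.

```python
# Illustrative only: size of the brute-force search space for the two
# alphabets mentioned above. The alphabet sizes are assumptions.
FULL_KEYBOARD = 95  # printable ASCII characters, space included (assumed)
LOWER_DIGITS = 36   # 26 lowercase letters + 10 digits


def keyspace(alphabet_size: int, max_len: int) -> int:
    """Number of strings of length 1..max_len over the given alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


full_9 = keyspace(FULL_KEYBOARD, 9)
lower_11 = keyspace(LOWER_DIGITS, 11)
print(f"full keyboard, length <= 9:      {full_9:.2e} candidates")
print(f"lowercase + digits, length <= 11: {lower_11:.2e} candidates")
```

Both totals are of a similar order of magnitude, which is consistent with the two limits being quoted side by side.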
The dictionary transformation method (Emin Islam Tatli, "Cracking more password hashes with patterns", IEEE Trans. on Information Forensics and Security, vol.10, no.8, pp.1656-1665, 2015.) transforms source passwords into candidate passwords according to password transformation rules (e.g., the rockyou-30000 rule base in oclHashcat). This method is very common in practice, but its effectiveness depends on the source password set, and it cannot effectively guess passwords that do not correspond to any entry in the source password set.
The Markov model method (Jerry Ma, Weining Yang, Min Luo, Ninghui Li, "A study of probabilistic password models", in Proc. IEEE Symposium on Security and Privacy, pp.689-704, 2014; Markus Durmuth, Fabian Angelstorf, Claude Castelluccia, Daniele Perito, Abdelberi Chaabane, "OMEN: Faster password guessing using an ordered Markov enumerator", in Proc. 7th Symposium on ESSoS, pp.119-132, 2015.) builds a matrix of transition probabilities between characters from the training password set and uses it to predict the probability of the next character. Its main strengths are that it does not depend on a corpus, can discover common words in passwords on its own, and handles common variants of words effectively. Its drawbacks are that a high-order Markov process is needed to "remember" longer words, and that the semantics of the model are not well defined.
The PCFG method (Matt Weir, Sudhir Aggarwal, Breno de Medeiros, Bill Glodek, "Password cracking using probabilistic context-free grammars", in Proc. 30th IEEE Symposium on Security and Privacy, 2009, pp.391-405.) is divided into a training phase and a guessing phase. In the training phase, the training passwords are segmented by character type, and the probability of each structure and of each word is estimated from frequency counts. For example, the password "spring2021!" corresponds to the structure [A6][D4][S1], where A6 denotes a string of 6 letters, D4 denotes 4 digits, and S1 denotes 1 special symbol. If 10 out of 1000 training passwords have the structure [A6][D4][S1], the probability of this structure is 0.01. The PCFG method assumes that the probability of a string being a password equals the probability of its structure multiplied by the probability of each word in the string. In the guessing phase, a NEXT algorithm generates candidate password strings in order of decreasing probability. The improved PCFG method (Shiva Houshmand, Sudhir Aggarwal, Randy Flood, "Next Gen PCFG password cracking", IEEE Trans. on Information Forensics and Security, vol.10, no.8, pp.1776-1791, 2015.) additionally introduces keyboard string sets and applies Laplace smoothing to the word frequencies of the corpus. This partly compensates for the limitation of segmenting by character type in the original PCFG method and further enriches the corpus. Although the PCFG method generates dictionaries relatively slowly, its guessing efficiency can be estimated effectively with the Monte Carlo sampling method (Dell'Amico, M. & Filippone, M., Monte Carlo Strength Evaluation: Fast and Reliable Password Checking, Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, ACM, 2015, 158-).
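The character-type segmentation used in PCFG training can be sketched as follows. This is a minimal illustration of the structure notation described above, not the cited authors' implementation; `pcfg_structure` and `structure_probabilities` are hypothetical names, and treating every non-alphanumeric character as a special symbol is an assumption.

```python
import re
from collections import Counter

# One alternative per character class; the group name doubles as the tag.
TOKEN = re.compile(r"(?P<A>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")


def pcfg_structure(password: str) -> str:
    """Map a password to its character-type structure, e.g. [A6][D4][S1]."""
    return "".join(
        f"[{m.lastgroup}{len(m.group())}]" for m in TOKEN.finditer(password)
    )


def structure_probabilities(passwords):
    """Relative frequency of each structure in a training set."""
    counts = Counter(pcfg_structure(p) for p in passwords)
    total = len(passwords)
    return {structure: count / total for structure, count in counts.items()}


print(pcfg_structure("spring2021!"))
print(pcfg_structure("ilovemike"))
```

Note that "ilovemike" collapses into a single letter segment, since character types alone cannot separate the words inside it.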
Scholars abroad (Ji, S.; Yang, S.; Hu, X.; Han, W.; Li, Z. & Beyah, R., Zero-Sum Password Cracking Game: A Large-Scale Empirical Study on the Crackability, Correlation, and Security of Passwords, IEEE Transactions on Dependable and Secure Computing, 2017, 14, 550-564; Ur, B.; Segreti, S. M.; Bauer, L.; Christin, N.; Cranor, L. F.; Komanduri, S.; Kurilova, D.; Mazurek, M. L.; Melicher, W. & Shay, R., Measuring Real-World Accuracies and Biases in Modeling Password Guessability, USENIX Security Symposium, 2015.) have compared password guessing methods and report that the PCFG method not only ranks among the best of the evaluated methods but also adapts to different language types. Therefore, the PCFG method has gradually become the mainstream method in academic research on password guessing. Furthermore, the PCFG method can also be used for targeted attacks (Wang, D.; Zhang, Z.; Wang, P.; Yan, J. & Huang, X., Targeted Online Password Guessing: An Underestimated Threat, Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, ACM, 2016.), i.e., to generate a set of candidate passwords from a user's personal information.
The PCFG method combines password information at two levels, structure and corpus, and is comparatively efficient, but it still has significant limitations, mainly in the following respects: 1) The password structure is described using only character types to delimit words, so passwords composed of several words are hard to separate. For example, the password "ilovemike" consists entirely of lowercase letters and is treated as a single word by PCFG, which obscures the inherent structure of the password. 2) Apart from keyboard strings, the words generated by PCFG come only from the training set. 3) The generated structures are likewise only those that appear in the training set. As a direct result, the PCFG method depends heavily on the training set, and the guessing dictionary it generates cannot contain words or structural patterns that do not appear in the training set.
In summary, methods such as PCFG offer high guessing efficiency and adapt to multiple languages. However, existing research on password guessing has mainly been conducted on large-scale real password sets, and research on learning from small training sets is comparatively scarce. The main difficulty is that a small training set contains few passwords, and existing training methods lack the necessary vocabulary and structure generalization, so the words and guessing rules that can be learned are very limited.
Disclosure of Invention
The invention aims to solve the problem that existing password guessing methods perform poorly when the training set is small. Compared with the traditional PCFG password guessing method, the invention makes the following improvements: 1) The traditional PCFG method segments training passwords by character type, which makes it difficult to separate several words made of the same character type within a password. By using a corpus-based word segmentation method, the invention can separate words of the same character type within a password. 2) The learning process of the PCFG method can only discover the words that appear in the training set and their probabilities, so the generated dictionary contains only words from the training set. When the training set is small, the password vocabulary in the generated dictionary is limited, which makes guessing inefficient. The invention draws on existing large-scale natural language corpora to add words of the same type that do not appear in the training set, and uses a smoothing method to compute the probability of every word in the corpus, so the generated dictionary can contain words that do not appear in the training set. This effectively reduces dependence on the training set and expands the dictionary with words of the same type. 3) When estimating the cracking rate, the probability corresponding to the specified number of guesses is estimated first; the maximum probability of each password in the test set is then computed and compared with that probability. Any password whose probability exceeds it is guaranteed to be in the dictionary generated by the method, which makes measuring the cracking rate considerably more efficient.
The purpose of the invention is realized by at least one of the following technical solutions.
A password guessing method for improving guessing efficiency of a small training set based on a corpus comprises the following steps:
S1, constructing a corpus Γ comprising four types of corpus sets, and determining the structure of the password guessing rules;
S2, based on the corpus Γ, generating a password guessing rule r for each training password pwd in the password training set PWD_TRAIN, to obtain a password guessing rule set R consisting of multiple password guessing rules;
S3, based on the corpus Γ and the password guessing rule set R, computing the probability of every word w in the corpus Γ, denoted p(w, PWD_TRAIN), w ∈ Γ, and computing the probability of every password guessing rule r in the rule set R, denoted q(r, PWD_TRAIN), r ∈ R;
S4, generating a dictionary D(S) with S guesses, and using the dictionary D(S) for password guessing.
Further, in step S1, a corpus Γ with the following features is constructed:
Feature 1, the corpus Γ comprises |Γ| corpus sets, Γ = {C_i | 1 ≤ i ≤ |Γ|}, where C_i is the i-th corpus set;
Feature 2, each corpus set contains words of the same type and the same length;
Feature 3, the word types of the corpus sets are language, country and region, general, and brute-force corpora; language corpora include words, surnames, and given names in different languages (e.g., English, Russian); country and region corpora include place names and telephone numbers; general corpora include common keyboard character sequences and year and date formats;
Feature 4, all words in a single non-brute-force corpus set have the same length, which is greater than or equal to 4;
Feature 5, words in brute-force corpus sets have length less than or equal to 3, and the brute-force corpus sets are divided into lowercase letters, uppercase letters, digits, and special symbols; there are 12 brute-force corpus sets in total: ASCII lowercase letters of length 1 to 3, [az_1], [az_2], [az_3] (containing 26, 26^2, and 26^3 words respectively); ASCII uppercase letters of length 1 to 3, [AZ_1], [AZ_2], [AZ_3] (26, 26^2, and 26^3 words respectively); digits of length 1 to 3, [09_1], [09_2], [09_3] (10, 10^2, and 10^3 words respectively); and other printable ASCII characters of length 1 to 3, [SP_1], [SP_2], [SP_3] (33, 33^2, and 33^3 words respectively);
Feature 6, no two corpus sets in the corpus Γ contain the same word;
The number of words in the i-th corpus set C_i is denoted |C_i|, and the length of its words is denoted l(C_i);
A password guessing rule r is formed by concatenating several corpus sets and is written r = [C_1]…[C_s], C_1, …, C_s ∈ Γ, where s is the number of segments of the password guessing rule r, denoted d(r);
∏_{1≤i≤s} |C_i|
is called the corpus space size of the password guessing rule r, denoted S(r);
|R| mutually different password guessing rules r form the password guessing rule set R.
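The corpus structure and the two rule measures d(r) and S(r) can be sketched as below. This is an illustration under stated assumptions: the 33 "other printable characters" are taken to be ASCII punctuation plus the space, and the sets `en_word_4` and `year_4` are tiny hypothetical stand-ins for real language and general corpora.

```python
import string
from itertools import product
from math import prod


def brute_force_sets():
    """Build the 12 brute-force corpus sets: 4 character classes x lengths 1..3."""
    classes = {
        "az": string.ascii_lowercase,
        "AZ": string.ascii_uppercase,
        "09": string.digits,
        "SP": string.punctuation + " ",  # 33 characters (assumed composition)
    }
    return {
        f"{name}_{n}": {"".join(t) for t in product(chars, repeat=n)}
        for name, chars in classes.items()
        for n in (1, 2, 3)
    }


CORPUS = brute_force_sets()
CORPUS["en_word_4"] = {"love", "mike", "blue"}            # toy language corpus
CORPUS["year_4"] = {str(y) for y in range(1990, 2000)}    # toy general corpus


def segment_count(rule):
    """d(r): number of corpus sets concatenated in the rule."""
    return len(rule)


def space_size(rule, corpus=CORPUS):
    """S(r): product of the sizes of the corpus sets in the rule."""
    return prod(len(corpus[name]) for name in rule)


rule = ["en_word_4", "year_4", "SP_1"]
print(segment_count(rule), space_size(rule))
```

Representing a rule as a list of set names keeps d(r) and S(r) one-liners and makes the disjointness requirement of Feature 6 easy to check.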
Further, in step S2, the password training set PWD_TRAIN comprises multiple training passwords pwd, and the password guessing rule r of a specific training password pwd is generated from the corpus Γ as follows:
based on the corpus Γ, a directed acyclic graph G = <V, E> is constructed for a single training password pwd, where each edge of G is labeled with the corpus set in Γ to which the character substring between the edge's start vertex and end vertex belongs;
all paths from the start vertex to the end vertex of the directed acyclic graph G are generated; each path corresponds to one way of segmenting the training password pwd, and each segmentation corresponds to one guessing rule;
from all possible guessing rules, the one with the fewest segments is selected as the password guessing rule r of the training password pwd; if several guessing rules have the minimum number of segments, the one with the smallest corpus space size is selected as the password guessing rule r;
finally, the password guessing rule set R consisting of the resulting password guessing rules r is obtained.
Further, in step S3, based on the corpus Γ and the training password set PWD_TRAIN, the probability of every word w in the corpus Γ and the probability of every password guessing rule r in the rule set R are computed. The specific steps are as follows:
The probability of each password guessing rule r in the password guessing rule set R is denoted q(r, PWD_TRAIN), r ∈ R;
The probabilities of the password guessing rules r in the rule set R have the following characteristics:
1) every password guessing rule r in the rule set R is generated by executing step S2 on a training password pwd in the training password set PWD_TRAIN;
2) the probabilities of all password guessing rules r in the rule set R sum to 1;
3) the probability of each password guessing rule r in the rule set R is proportional to its frequency of occurrence in the training password set PWD_TRAIN;
The probability of each word w in the corpus Γ is denoted p(w, PWD_TRAIN), w ∈ Γ;
The probabilities of the words in the corpus Γ have the following characteristics:
1) the frequency of each word of the corpus Γ in the training set is counted; then, within each corpus set C, the frequency of every word is increased by 1, so that the frequency of words that do not appear in the training set is not 0;
2) the probability of each word in a corpus set C equals the frequency of the word obtained in 1) divided by the sum of the frequencies of all words in that corpus set;
3) the probabilities of the words in each corpus set C sum to 1;
4) if a word in a corpus set C does not appear in the training password set, its probability is inversely proportional to the sum of the word counts in the corpus set C;
5) if a word in a corpus set C appears in the training password set, its probability is proportional to its frequency in the training password set and inversely proportional to the sum of the word counts in the corpus set C.
Further, in step S4, for a password guessing rule r = [C_1]…[C_s] in the password guessing rule set R and s words w_1, …, w_s satisfying w_1 ∈ C_1, w_2 ∈ C_2, …, w_s ∈ C_s, with C_1, …, C_s ∈ Γ, w_1|…|w_s is called a legal vocabulary combination based on the corpus Γ and the password guessing rule set R, where '|' denotes string concatenation;
The probability Prob(w_1|…|w_s) that the legal vocabulary combination w_1|…|w_s is a password is defined as:
Prob(w_1|…|w_s) = ∏_{1≤i≤s} p(w_i, PWD_TRAIN) × q(r, PWD_TRAIN);
Given a number of guesses S, if an ordered sequence D(S) = <cp_1, cp_2, …, cp_S> of S legal vocabulary combinations based on the corpus Γ and the password guessing rule set R satisfies:
Condition 1, Prob(cp_j) ≥ Prob(cp_{j+1}), 1 ≤ j ≤ S-1;
Condition 2, the probability of the last legal vocabulary combination cp_S in D(S) is greater than the probability of every legal vocabulary combination that does not appear in D(S);
then D(S) is called the ordered dictionary with S guesses, and the probability Prob(cp_S) of the S-th legal vocabulary combination in D(S) is denoted α(S).
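The definitions of Prob, D(S), and α(S) can be illustrated on a toy model. The sketch below enumerates every legal vocabulary combination and keeps the S most probable; this exhaustive enumeration is a simplification that is only feasible for tiny corpora and is not the NEXT algorithm. The corpus, the probabilities `q` and `p`, and the name `ordered_dictionary` are all hypothetical.

```python
import heapq
from itertools import product
from math import prod

# Toy model (all values assumed for illustration).
corpus = {"en_4": {"love", "mike", "blue"}, "09_2": {"12", "34"}}
q = {("en_4", "09_2"): 0.6, ("en_4",): 0.4}               # rule probabilities
p = {("en_4", "love"): 0.5, ("en_4", "mike"): 0.3, ("en_4", "blue"): 0.2,
     ("09_2", "12"): 0.7, ("09_2", "34"): 0.3}            # word probabilities


def ordered_dictionary(q, p, corpus, S):
    """Return the S most probable legal combinations as (guess, probability)."""
    def candidates():
        for rule, q_r in q.items():
            word_lists = [sorted(corpus[name]) for name in rule]
            for words in product(*word_lists):
                # Prob = q(r) times the product of the word probabilities.
                prob = q_r * prod(p[(n, w)] for n, w in zip(rule, words))
                yield prob, "".join(words)
    top = heapq.nlargest(S, candidates())  # sorted by decreasing probability
    return [(guess, prob) for prob, guess in top]


D = ordered_dictionary(q, p, corpus, 3)
alpha = D[-1][1]  # alpha(S): probability of the S-th combination
print(D, alpha)
```

For realistic corpus sizes the combinations would have to be generated lazily in probability order rather than enumerated in full.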
Further, a character string str may be describable as several legal vocabulary combinations, each with a different probability of being a password;
The probability Prob(str) that the character string str is a password is defined as the maximum of the probabilities of all legal vocabulary combinations corresponding to the string; if a string cannot be described as any legal vocabulary combination, its probability of being a password is 0.
An ordered dictionary D(S) containing S legal vocabulary combinations has the following properties:
Property 1: if the probability Prob(cp) of a legal vocabulary combination cp is greater than α(S), then cp must belong to D(S).
Property 2: if the probability Prob(str) of a character string str is greater than α(S), then str must belong to D(S).
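Property 2 allows membership in D(S) to be tested without generating the dictionary: compute Prob(str) and compare it with α(S). The sketch below relies on the fact that every corpus set has a fixed word length, so each rule splits a string at fixed positions. The toy probabilities, the `length` table, and the function names are assumptions; the resulting fraction is a lower bound, because strings whose probability equals α(S) are not counted.

```python
from math import prod

# Toy model (all values assumed for illustration).
length = {"en_4": 4, "09_2": 2}                           # l(C) per corpus set
q = {("en_4", "09_2"): 0.6, ("en_4",): 0.4}
p = {("en_4", "love"): 0.5, ("en_4", "mike"): 0.3, ("en_4", "blue"): 0.2,
     ("09_2", "12"): 0.7, ("09_2", "34"): 0.3}


def string_probability(s, q, p, length):
    """Prob(str): maximum probability over all rules that can spell s."""
    best = 0.0
    for rule, q_r in q.items():
        if sum(length[name] for name in rule) != len(s):
            continue
        pieces, pos = [], 0
        for name in rule:  # fixed word lengths fix the split points
            pieces.append((name, s[pos:pos + length[name]]))
            pos += length[name]
        if all(piece in p for piece in pieces):
            best = max(best, q_r * prod(p[piece] for piece in pieces))
    return best


def cracked_fraction(test_set, q, p, length, alpha):
    """Share of test passwords with Prob(str) > alpha, hence inside D(S)."""
    hits = sum(1 for s in test_set if string_probability(s, q, p, length) > alpha)
    return hits / len(test_set)


print(cracked_fraction(["love12", "mike", "blue34", "zzzz"], q, p, length, 0.1))
```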
Compared with the prior art, the invention has the advantages that:
1) The training password set is segmented based on a natural language corpus, and corpus-based password guessing rules are generated. Compared with the PCFG method, the generated guessing rules better reflect the underlying meaning behind how passwords are chosen.
2) The vocabulary of the dictionary is expanded using the natural language corpus, which effectively addresses the problem that the words discovered by the PCFG method depend entirely on the training set, and overcomes that method's poor guessing performance with a small training set.
Drawings
FIG. 1 is a flowchart illustrating steps of a password guessing method for improving guessing efficiency of a small training set based on a corpus according to the present invention.
FIG. 2 is a simplified directed acyclic graph generated by the training password "lovelain" in an embodiment of the present invention.
FIG. 3 is a schematic diagram comparing the cracking rate with that of the PCFG method on the password set rockyou in an embodiment of the present invention.
FIG. 4 is a schematic diagram comparing the cracking rate with that of the PCFG method on the password set CSDN in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Example 1:
A password guessing method for improving guessing efficiency of a small training set based on a corpus comprises the following steps, as shown in FIG. 1:
S1, constructing a corpus Γ comprising four types of corpus sets, and determining the structure of the password guessing rules;
A corpus Γ with the following features is constructed:
Feature 1, the corpus Γ comprises |Γ| corpus sets, Γ = {C_i | 1 ≤ i ≤ |Γ|}, where C_i is the i-th corpus set;
Feature 2, each corpus set contains words of the same type and the same length;
Feature 3, the word types of the corpus sets are language, country and region, general, and brute-force corpora; language corpora include words, surnames, and given names in different languages (e.g., English, Russian); country and region corpora include place names and telephone numbers; general corpora include common keyboard character sequences and year and date formats;
Feature 4, all words in a single non-brute-force corpus set have the same length, which is greater than or equal to 4;
Feature 5, words in brute-force corpus sets have length less than or equal to 3, and the brute-force corpus sets are divided into lowercase letters, uppercase letters, digits, and special symbols; there are 12 brute-force corpus sets in total: ASCII lowercase letters of length 1 to 3, [az_1], [az_2], [az_3] (containing 26, 26^2, and 26^3 words respectively); ASCII uppercase letters of length 1 to 3, [AZ_1], [AZ_2], [AZ_3] (26, 26^2, and 26^3 words respectively); digits of length 1 to 3, [09_1], [09_2], [09_3] (10, 10^2, and 10^3 words respectively); and other printable ASCII characters of length 1 to 3, [SP_1], [SP_2], [SP_3] (33, 33^2, and 33^3 words respectively);
Feature 6, no two corpus sets in the corpus Γ contain the same word;
The number of words in the i-th corpus set C_i is denoted |C_i|, and the length of its words is denoted l(C_i);
A password guessing rule r is formed by concatenating several corpus sets and is written r = [C_1]…[C_s], C_1, …, C_s ∈ Γ, where s is the number of segments of the password guessing rule r, denoted d(r);
∏_{1≤i≤s} |C_i|
is called the corpus space size of the password guessing rule r, denoted S(r);
|R| mutually different password guessing rules r form the password guessing rule set R.
S2, based on the corpus Γ, generating a password guessing rule r for each training password pwd in the password training set PWD_TRAIN, to obtain a password guessing rule set R consisting of multiple password guessing rules;
The password training set PWD_TRAIN comprises multiple training passwords pwd, and the password guessing rule r of a specific training password pwd is generated from the corpus Γ as follows:
based on the corpus Γ, a directed acyclic graph G = <V, E> is constructed for a single training password pwd, where each edge of G is labeled with the corpus set in Γ to which the character substring between the edge's start vertex and end vertex belongs;
all paths from the start vertex to the end vertex of the directed acyclic graph G are generated; each path corresponds to one way of segmenting the training password pwd, and each segmentation corresponds to one guessing rule;
from all possible guessing rules, the one with the fewest segments is selected as the password guessing rule r of the training password pwd; if several guessing rules have the minimum number of segments, the one with the smallest corpus space size is selected as the password guessing rule r;
finally, the password guessing rule set R consisting of the resulting password guessing rules r is obtained.
In this embodiment, rule generation for a training password is shown in Algorithm-1;
Algorithm-1: rule generation for a training password
Input: (1) a training password pwd = c_1…c_n consisting of n characters
(2) the corpus Γ = {C_i}
Output: the rule r corresponding to pwd
Intermediate variables: (1) the vertex set V and edge set E of the directed acyclic graph G
(2) temporary rule sets R_0 and R_1
1. Build the vertex set of the graph V = {v_1, …, v_n, v_{n+1}} and set E = ∅
2. Loop over every substring c_i…c_j of c_1…c_n, where 1 ≤ i ≤ j ≤ n
2.1 If there is a corpus set C such that c_i…c_j ∈ C, C ∈ Γ, then
2.1.1 E = E ∪ {<v_i, v_{j+1}, C>}
3. R_0 = ∅;
4. Loop over every path from v_1 to v_{n+1}
4.1 The sequence of edges traversed by path is <<v_1, v_2', C_1>, …, <v_k', v_{n+1}, C_s>>
4.2 R_0 = R_0 ∪ {[C_1]…[C_s]}
5. d_min = min{d(r) | r ∈ R_0}
6. R_1 = {r | r ∈ R_0, d(r) = d_min}
7. Return the rule r in R_1 with the smallest corpus space size
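A runnable sketch of the rule selection in Algorithm-1 follows. Instead of enumerating every path of the graph explicitly, it uses dynamic programming over prefixes and keeps, for each prefix, the segmentation with the fewest segments and then the smallest corpus space size; this swapped-in technique selects a rule that meets the same two criteria, but ties on both are broken arbitrarily. The toy corpus and the name `generate_rule` are assumptions.

```python
import string
from itertools import product


def strings(chars, n):
    """All strings of length n over chars (a brute-force corpus set)."""
    return {"".join(t) for t in product(chars, repeat=n)}


# Toy corpus: one language set plus a few brute-force sets (assumed).
CORPUS = {
    "en_4": {"love", "mike", "blue"},
    "az_1": strings(string.ascii_lowercase, 1),
    "az_2": strings(string.ascii_lowercase, 2),
    "az_3": strings(string.ascii_lowercase, 3),
    "09_1": strings(string.digits, 1),
    "09_2": strings(string.digits, 2),
    "09_3": strings(string.digits, 3),
}


def generate_rule(pwd, corpus):
    """Return the rule with minimal (segments, space size), or None."""
    # Corpus sets are disjoint (Feature 6), so each word has one owner set.
    owner = {w: name for name, words in corpus.items() for w in words}
    max_len = max(len(w) for w in owner)
    n = len(pwd)
    best = [None] * (n + 1)   # best[j]: (segments, space, rule) for pwd[:j]
    best[0] = (0, 1, [])
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            name = owner.get(pwd[i:j])   # edge <v_i, v_j, C> of the graph
            if name is None or best[i] is None:
                continue
            d, space, rule = best[i]
            cand = (d + 1, space * len(corpus[name]), rule + [name])
            if best[j] is None or cand[:2] < best[j][:2]:
                best[j] = cand
    return None if best[n] is None else best[n][2]


print(generate_rule("ilovemike", CORPUS))
print(generate_rule("love12", CORPUS))
```

The prefix recursion is valid because appending the same edge to two prefixes preserves their order under (segments, space size), so only the best prefix per vertex needs to be kept.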
S3, based on the corpus Γ and the password guessing rule set R, computing the probability of every word w in the corpus Γ, denoted p(w, PWD_TRAIN), w ∈ Γ, and computing the probability of every password guessing rule r in the rule set R, denoted q(r, PWD_TRAIN), r ∈ R;
Based on the corpus Γ and the training password set PWD_TRAIN, the probability of every word w in the corpus Γ and the probability of every password guessing rule r in the rule set R are computed. The specific steps are as follows:
The probability of each password guessing rule r in the password guessing rule set R is denoted q(r, PWD_TRAIN), r ∈ R;
The probabilities of the password guessing rules r in the rule set R have the following characteristics:
1) every password guessing rule r in the rule set R is generated by executing step S2 on a training password pwd in the training password set PWD_TRAIN;
2) the probabilities of all password guessing rules r in the rule set R sum to 1;
3) the probability of each password guessing rule r in the rule set R is proportional to its frequency of occurrence in the training password set PWD_TRAIN;
The probability of each word w in the corpus Γ is denoted p(w, PWD_TRAIN), w ∈ Γ;
The probabilities of the words in the corpus Γ have the following characteristics:
1) the frequency of each word of the corpus Γ in the training set is counted; then, within each corpus set C, the frequency of every word is increased by 1, so that the frequency of words that do not appear in the training set is not 0;
2) the probability of each word in a corpus set C equals the frequency of the word obtained in 1) divided by the sum of the frequencies of all words in that corpus set;
3) the probabilities of the words in each corpus set C sum to 1;
4) if a word in a corpus set C does not appear in the training password set, its probability is inversely proportional to the sum of the word counts in the corpus set C;
5) if a word in a corpus set C appears in the training password set, its probability is proportional to its frequency in the training password set and inversely proportional to the sum of the word counts in the corpus set C.
In this embodiment, the calculation of the rule probabilities and word probabilities is shown in Algorithm-2;
Algorithm-2: calculation of rule probabilities and word probabilities
Input: (1) the training password set PWD_TRAIN
(2) the corpus Γ = {C_i}
Output: (1) the password guessing rule set R
(2) the probability q(r, PWD_TRAIN) of each rule r in R, r ∈ R
(3) the probability p(w, PWD_TRAIN) of each word w in Γ, w ∈ Γ
Intermediate variables: (1) the frequency f(w, PWD_TRAIN) of each word w in Γ, w ∈ Γ
1. R = ∅
2. f(w, PWD_TRAIN) = 0, w ∈ Γ
3. Loop over all pwd ∈ PWD_TRAIN
3.1 Use Algorithm-1 to compute the rule r = [C_1]…[C_s] corresponding to pwd = c_1…c_n
3.2 If r ∈ R then
3.2.1 q(r, PWD_TRAIN) = q(r, PWD_TRAIN) + 1/|PWD_TRAIN|
3.3 Otherwise
3.3.1 R = R ∪ {r}
3.3.2 q(r, PWD_TRAIN) = 1/|PWD_TRAIN|
3.4 t = 1
3.5 Loop over i from 1 to s
3.5.1 w = c_t…c_{t+l(C_i)-1}
3.5.2 f(w, PWD_TRAIN) = f(w, PWD_TRAIN) + 1
3.5.3 t = t + l(C_i)
4. Loop over all C_i ∈ Γ
4.1 fsum = Σ_{w∈C_i} (f(w, PWD_TRAIN) + 1)
4.2 Loop over every word w in C_i
4.2.1 If f(w, PWD_TRAIN) ≠ 0 then
4.2.1.1 p(w, PWD_TRAIN) = (f(w, PWD_TRAIN) + 1)/fsum
4.2.2 Otherwise
4.2.2.1 p(w, PWD_TRAIN) = 1/fsum
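The probability calculation in Algorithm-2 can be sketched as follows. To stay self-contained, the sketch assumes the training passwords have already been segmented (step 3.1), so each one is given as a list of (corpus set name, word) pairs. The normaliser `fsum` is taken to be the add-one smoothed total of the set, which is the value that makes the probabilities of each corpus set sum to 1; the function name `train` and the toy data are hypothetical.

```python
from collections import Counter


def train(segmented, corpus):
    """Compute rule probabilities q and smoothed word probabilities p.

    segmented: one entry per training password, each a list of
               (corpus_set_name, word) pairs produced by the segmentation.
    """
    total = len(segmented)
    # q(r): relative frequency of each rule in the training set.
    rule_counts = Counter(tuple(name for name, _ in seg) for seg in segmented)
    q = {rule: count / total for rule, count in rule_counts.items()}

    # f(w): how often each word occurs in the segmented training passwords.
    freq = Counter((name, w) for seg in segmented for name, w in seg)

    p = {}
    for name, words in corpus.items():
        # Add-one smoothing: unseen words get frequency 1 instead of 0.
        fsum = sum(freq[(name, w)] + 1 for w in words)
        for w in words:
            p[(name, w)] = (freq[(name, w)] + 1) / fsum
    return q, p


corpus = {"en_4": {"love", "mike", "blue"}, "09_2": {"12", "34"}}
segmented = [
    [("en_4", "love"), ("09_2", "12")],
    [("en_4", "love"), ("09_2", "34")],
    [("en_4", "mike")],
]
q, p = train(segmented, corpus)
print(q)
print(p)
```

In this toy run the word "blue" never occurs in training but still receives a non-zero probability, which is how the corpus extends the dictionary beyond the training vocabulary.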
S4, generating a dictionary D(S) with S guesses, and using the dictionary D(S) to perform password guessing;
for a password guessing rule r = [C_1]…[C_s] in the password guessing rule set R and s words w_1, …, w_s that satisfy w_1 ∈ C_1, w_2 ∈ C_2, …, w_s ∈ C_s, with C_1, …, C_s ∈ Γ, w_1|…|w_s is called a legal vocabulary combination based on the corpus Γ and the password guessing rule set R, where '|' is the string concatenation operation;
the probability Prob(w_1|…|w_s) that the legal vocabulary combination w_1|…|w_s becomes a password is defined as:
$\mathrm{Prob}(w_1|\dots|w_s) = \prod_{1 \le i \le s} p(w_i, \mathit{PWD\_TRAIN}) \times q(r, \mathit{PWD\_TRAIN})$;
given a number of guesses S, if an ordered sequence D(S) = <cp_1, cp_2, …, cp_S> of S legal vocabulary combinations based on the corpus Γ and the password guessing rule set R satisfies the following conditions:
condition 1, Prob(cp_j) ≥ Prob(cp_{j+1}), 1 ≤ j ≤ S-1;
condition 2, the probability of the last legal vocabulary combination cp_S in the ordered sequence D(S) is greater than the probability that any legal vocabulary combination not appearing in D(S) becomes a password;
then D(S) is called the ordered dictionary with S guesses, and the probability Prob(cp_S) that the S-th legal vocabulary combination in D(S) becomes a password is denoted α(S).
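For a very small model, the ordered dictionary D(S) and α(S) can be obtained by plain enumeration, which makes the definition concrete. The sketch below is only an illustration under that assumption (it does not scale to realistic corpora); the function name `ordered_dictionary` and the toy sets "A" and "B" are hypothetical and do not come from the patent.

```python
import heapq
from fractions import Fraction
from itertools import product


def ordered_dictionary(q, p_sets, S):
    """Return the S most probable (probability, combination) pairs, best first.

    q:      dict mapping a rule (tuple of corpus-set names) to q(r).
    p_sets: dict mapping a corpus-set name to {word: p(w)}.
    """
    combos = []
    for rule, q_r in q.items():
        # One legal combination per choice of one word from each set of the rule.
        for choice in product(*(p_sets[name].items() for name in rule)):
            prob = q_r
            for _, p_w in choice:
                prob *= p_w
            combos.append((prob, "|".join(w for w, _ in choice)))
    # nlargest returns its result sorted from largest to smallest.
    return heapq.nlargest(S, combos)


F = Fraction
# Hypothetical toy model: two rules over two tiny corpus sets.
q = {("A", "B"): F(3, 4), ("A",): F(1, 4)}
p_sets = {"A": {"x": F(2, 3), "y": F(1, 3)}, "B": {"1": F(1, 2), "2": F(1, 2)}}
D = ordered_dictionary(q, p_sets, 3)
alpha = D[-1][0]  # alpha(S): probability of the S-th combination
print([c for _, c in D], alpha)  # ['x|2', 'x|1', 'x'] 1/6
```

Combinations with equal probability are ordered here by their string value, which is an arbitrary tie-breaking choice of the sketch.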
A character string str may be described by several legal vocabulary combinations, each of which has a different probability of becoming a password;
the probability Prob(str) that the character string str becomes a password is defined as the maximum of the probabilities of all the legal vocabulary combinations corresponding to the character string; if a character string cannot be described by any legal vocabulary combination, its probability of becoming a password is 0.
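The definition of Prob(str) as a maximum over legal vocabulary combinations can be sketched as below. The sketch relies on the corpus feature that all words in one corpus set have the same length, so each rule admits at most one way of cutting a given string. The function name and the toy sets "A2" and "A4" are hypothetical.

```python
from fractions import Fraction


def string_probability(s, q, p_sets):
    """Maximum probability over all legal vocabulary combinations that spell s."""
    best = Fraction(0)
    for rule, q_r in q.items():
        pos, prob = 0, q_r
        for name in rule:
            words = p_sets[name]
            length = len(next(iter(words)))  # every word in a set has the same length
            piece = s[pos:pos + length]
            if piece not in words:
                prob = Fraction(0)  # this rule cannot describe s
                break
            prob *= words[piece]
            pos += length
        # The rule must consume the whole string, not just a prefix.
        if pos == len(s) and prob > best:
            best = prob
    return best


F = Fraction
# Hypothetical toy model in which "abcd" has two legal descriptions.
q = {("A2", "A2"): F(3, 5), ("A4",): F(2, 5)}
p_sets = {"A2": {"ab": F(1, 2), "cd": F(1, 2)}, "A4": {"abcd": F(1)}}
print(string_probability("abcd", q, p_sets))  # 2/5
print(string_probability("cdab", q, p_sets))  # 3/20
print(string_probability("abc", q, p_sets))   # 0
```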
The ordered dictionary D(S) containing S legal vocabulary combinations has the following properties:
property 1, if the probability Prob(cp) that a legal vocabulary combination cp becomes a password is greater than α(S), then the legal vocabulary combination cp must belong to D(S).
Property 2, if the probability Prob(str) that a character string str becomes a password is greater than α(S), then the character string str must belong to D(S).
In this embodiment, the ordered dictionary D(S) may be generated using the "next" algorithm of the reference (Matt Weir, Sudhir Aggarwal, Breno de Medeiros, Bill Glodek, "Password Cracking Using Probabilistic Context-Free Grammars", in Proc. 30th IEEE Symposium on Security and Privacy, 2009, pp. 391-405).
S5, estimating, from the number of guesses, the probability of the last legal vocabulary combination of the dictionary D(S);
in this embodiment, because generating the dictionary D(S) is slow and, when the number of guesses S is large, the dictionary requires a large amount of storage, it is difficult to evaluate the cracking rate by actually generating the dictionary; based on the password guessing rule set R, {q(r, PWD_TRAIN) | r ∈ R}, Γ and {p(w, PWD_TRAIN) | w ∈ Γ}, for a given probability β, the Monte Carlo sampling method introduced in the literature (Dell'Amico, M. & Filippone, M., "Monte Carlo Strength Evaluation: Fast and Reliable Password Checking", Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, ACM, 2015, pp. 158-169) is used to estimate the number of legal vocabulary combinations whose probability of becoming a password is greater than β; this process is denoted N(β);
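A sketch of how N(β) might be estimated by sampling is given below. It draws legal vocabulary combinations from the model (a rule according to q, then one word per segment according to p) and averages 1/probability over the samples whose probability exceeds β, which is the estimator described by Dell'Amico and Filippone. The function names, the sample size and the seed are assumptions of this sketch, and the one-set uniform model is a hypothetical demo.

```python
import random


def sample_probability(q, p_sets, rng):
    """Draw one legal vocabulary combination from the model; return its probability."""
    rules = list(q)
    rule = rng.choices(rules, weights=[q[r] for r in rules])[0]
    prob = q[rule]
    for name in rule:
        words = list(p_sets[name])
        word = rng.choices(words, weights=[p_sets[name][w] for w in words])[0]
        prob *= p_sets[name][word]
    return prob


def make_guess_counter(q, p_sets, n=10000, seed=0):
    """Return a function N(beta) estimating how many combinations have Prob > beta."""
    rng = random.Random(seed)
    sample = [sample_probability(q, p_sets, rng) for _ in range(n)]

    def N(beta):
        # Each sampled combination with probability pr stands for 1/(n*pr) guesses.
        return sum(1.0 / pr for pr in sample if pr > beta) / n

    return N


# Hypothetical uniform model: one rule over a set of 50 equally likely words.
q = {("A",): 1.0}
p_sets = {"A": {f"w{i:02d}": 1 / 50 for i in range(50)}}
N = make_guess_counter(q, p_sets)
print(round(N(0.01)), N(0.5))  # 50 0.0
```

In the uniform demo every sample contributes exactly 50, so the estimate equals the true count of 50 combinations; for realistic models the estimate carries sampling error that shrinks as n grows.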
The estimated value of α(S) is computed as follows:
first, a first probability value α_0 and a second probability value α_1 are initialized such that S lies between the numbers of guesses N(α_0) and N(α_1) that the Monte Carlo sampling method estimates for α_0 and α_1; then the first probability value α_0 and the second probability value α_1 are adjusted repeatedly so that N((α_0+α_1)/2) approaches S; when |N((α_0+α_1)/2) - S| < 0.1S, (α_0+α_1)/2 is taken as the estimated value of α(S).
In the present embodiment, the estimation of the probability that the last legal vocabulary combination of the dictionary D(S) becomes a password is shown in Algorithm-3;
Algorithm-3: estimating the probability that the last legal vocabulary combination of the dictionary D(S) becomes a password
Input: (1) guessing rule set R
(2) probability of each rule in R, {q(r, PWD_TRAIN) | r ∈ R}
(3) corpus Γ
(4) probability of each word in Γ, {p(w, PWD_TRAIN) | w ∈ Γ}
(5) number of guesses S
Output: the estimated value of α(S)
1. Select α_0 and α_1 satisfying N(α_0) < S < N(α_1)
2. While |N((α_0+α_1)/2) - S| ≥ 0.1S, loop:
2.1 If N((α_0+α_1)/2) < S, then
2.1.1 α_0 = (α_0+α_1)/2
2.2 Otherwise
2.2.1 α_1 = (α_0+α_1)/2
3. Return (α_0+α_1)/2
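The bisection of Algorithm-3 can be sketched as follows. The branch direction is an assumption chosen so that N(α_0) < S < N(α_1) stays true after every update; `max_iter` is a safety bound added by this sketch and is not part of the described method. The demo replaces the Monte Carlo estimator with the hypothetical stand-in N(β) = 1/β.

```python
def estimate_alpha(N, S, alpha0, alpha1, tol=0.1, max_iter=200):
    """Bisect between alpha0 and alpha1 until N(midpoint) is within tol*S of S.

    Requires N(alpha0) < S < N(alpha1), where N(beta) is the (estimated)
    number of legal combinations with probability greater than beta.
    """
    for _ in range(max_iter):
        mid = (alpha0 + alpha1) / 2
        n_mid = N(mid)
        if abs(n_mid - S) < tol * S:
            return mid
        if n_mid < S:
            alpha0 = mid  # too few guesses above mid: lower the upper threshold
        else:
            alpha1 = mid  # too many guesses above mid: raise the lower threshold
    return (alpha0 + alpha1) / 2


def stand_in_N(beta):
    """Hypothetical guess counter used only for the demo."""
    return 1.0 / beta


alpha_hat = estimate_alpha(stand_in_N, S=1000, alpha0=1.0, alpha1=1e-6)
print(abs(stand_in_N(alpha_hat) - 1000) < 100)  # True
```

In practice `N` would be the sampling-based estimator, and the 0.1S tolerance keeps the number of estimator calls small.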
S6, estimating the cracking rate of the dictionary D(S) on the test password set without actually generating the dictionary D(S);
based on the estimated value of α(S) obtained in step S5, that is, the estimated probability of the last legal vocabulary combination of the dictionary D(S), the probability that each character string in the test password set becomes a password is calculated in turn; if the probability that a character string becomes a password is greater than the estimated value of α(S), the character string belongs to the dictionary D(S);
the number of character strings in the test password set that belong to D(S), divided by the total number of character strings in the test password set, is the cracking rate of the invention on the test password set when the number of guesses is S.
In this embodiment, the computation of the cracking rate of the test password set PWD_TEST is shown in Algorithm-4;
Algorithm-4: cracking rate of the test password set PWD_TEST
Input: (1) guessing rule set R
(2) probability of each rule in R, {q(r, PWD_TRAIN) | r ∈ R}
(3) corpus Γ
(4) probability of each word in Γ, {p(w, PWD_TRAIN) | w ∈ Γ}
(5) number of guesses S
(6) test password set PWD_TEST
Output: the cracking rate γ(PWD_TRAIN, Γ, PWD_TEST, S) on the test password set PWD_TEST of the dictionary D(S) generated with S guesses based on the training set PWD_TRAIN and the corpus Γ
1. g = 0
2. Based on R, {q(r, PWD_TRAIN) | r ∈ R}, Γ, {p(w, PWD_TRAIN) | w ∈ Γ} and S, use Algorithm-3 to compute the estimated value of α(S)
3. For each pwd ∈ PWD_TEST, loop:
3.1 If Prob(pwd) > the estimated value of α(S), then
3.1.1 g = g + 1
4. Return g/|PWD_TEST|
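Steps 1 to 4 of Algorithm-4 amount to counting the test passwords whose probability exceeds the estimated threshold. A minimal sketch is shown below; `cracking_rate` and its parameters are hypothetical names, and the probabilities in the demo are invented values used only to exercise the function.

```python
def cracking_rate(test_passwords, prob_of_string, alpha_hat):
    """Fraction of test passwords whose probability exceeds the threshold.

    prob_of_string: callable returning Prob(str) for a character string.
    alpha_hat:      estimated value of alpha(S) from Algorithm-3.
    """
    hits = sum(1 for pwd in test_passwords if prob_of_string(pwd) > alpha_hat)
    return hits / len(test_passwords)


# Invented demo values; unknown strings get probability 0.
demo_probs = {"loverain": 0.05, "loverlove": 0.037, "zzzz": 0.0}
test_set = ["loverain", "loverlove", "zzzz", "qwerty"]
rate = cracking_rate(test_set, lambda s: demo_probs.get(s, 0.0), alpha_hat=0.04)
print(rate)  # 0.25
```

Duplicate passwords in the test set are counted once per occurrence, which matches dividing by |PWD_TEST|.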
As shown in fig. 1, the implementation of the invention consists of two parts, data and software. The data comprise the corpus Γ, the training password set PWD_TRAIN and the test password set PWD_TEST. The software comprises two parts, the training software and the cracking rate detection software. Algorithm-1 and Algorithm-2 are carried out by the training software, and Algorithm-3 and Algorithm-4 are carried out by the cracking rate detection software.
Example 2: generating the password guessing rule for the training password "loverain";
in addition to the brute-force corpus sets, the corpus Γ_sample contains the English vocabulary set EN_4 = {love, rain, blue} of 4-character words and the English vocabulary set EN_5 = {lover, green} of 5-character words. The simplified directed acyclic graph generated by Algorithm-1 is shown in fig. 2. From this graph, the following rules can be generated:
r_1-loverain: [EN_4][EN_4], the number of segments is 2, and the corpus space size is 3 × 3 = 9;
r_2-loverain: [EN_5][az_3], the number of segments is 2, and the corpus space size is 2 × 26^3 = 35152;
r_3-loverain: [EN_4][az_2][az_2], the number of segments is 3, and the corpus space size is 3 × 26^2 × 26^2 = 1370928;
Of all the password guessing rules that loverain can produce, r_1-loverain has the smallest number of segments and, among the guessing rules with the smallest number of segments, the smallest corpus space size. Thus, the password guessing rule obtained for this training password is r_1-loverain: [EN_4][EN_4].
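The selection rule of this example (fewest segments first, then smallest corpus space size) can be sketched as below. Instead of enumerating every path of the directed acyclic graph, the sketch uses dynamic programming over string positions, which gives the same minimum; that substitution, the function names, and the omission of the special-symbol brute-force sets are assumptions of this sketch.

```python
import string

# Word sets of Example 2; brute-force sets are generated from character classes.
WORD_SETS = {"EN_4": {"love", "rain", "blue"}, "EN_5": {"lover", "green"}}
BRUTE = {"az": string.ascii_lowercase, "AZ": string.ascii_uppercase, "09": string.digits}


def edges_from(pwd, start):
    """Yield (set_name, set_size, end) for every corpus set matching pwd[start:end]."""
    for name, words in WORD_SETS.items():
        length = len(next(iter(words)))  # all words in a set share one length
        if pwd[start:start + length] in words:
            yield name, len(words), start + length
    for tag, alphabet in BRUTE.items():
        for length in (1, 2, 3):  # brute-force sets have length <= 3
            piece = pwd[start:start + length]
            if len(piece) == length and all(ch in alphabet for ch in piece):
                yield f"{tag}_{length}", len(alphabet) ** length, start + length


def best_rule(pwd):
    """Return (segments, corpus_space_size, rule) of the preferred rule, or None."""
    # best[i] describes the preferred way to segment the suffix pwd[i:].
    best = {len(pwd): (0, 1, ())}
    for i in range(len(pwd) - 1, -1, -1):
        candidates = []
        for name, size, end in edges_from(pwd, i):
            if end in best:
                segs, space, rule = best[end]
                candidates.append((segs + 1, space * size, (name,) + rule))
        if candidates:
            # Tuple order: fewest segments first, then smallest corpus space.
            best[i] = min(candidates)
    return best.get(0)


print(best_rule("loverain"))  # (2, 9, ('EN_4', 'EN_4'))
print(best_rule("love3"))     # (2, 30, ('EN_4', '09_1'))
```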
Example 3: the training result of the training password set PWD_TRAIN_SAMPLE and several legal vocabulary combinations;
the training password set PWD_TRAIN_SAMPLE = {loverain, loveblue, greenblue, love3}.
After Algorithm-2 is applied, the training results obtained are:
1) the password guessing rule set R contains 3 password guessing rules: r_1 = [EN_4][EN_4], r_2 = [EN_5][EN_4], r_3 = [EN_4][09_1];
2) probabilities of the password guessing rules: q(r_1) = 0.5, q(r_2) = 0.25, q(r_3) = 0.25;
3) probabilities of the vocabulary in EN_4: p("love") = 4/9, p("blue") = 3/9, p("rain") = 2/9; probabilities of the vocabulary in EN_5: p("green") = 2/3, p("lover") = 1/3; in 09_1, p("3") = 2/27 and all other probabilities are 1/27;
two legal vocabulary combinations that can be generated from the above training results are given below, together with their probabilities of becoming a password (rounded to 4 decimal places).
Prob(lover|love) = q(r_2) × p("lover") × p("love") = 0.0370;
Prob(blue|4) = q(r_3) × p("blue") × p("4") = 0.0031.
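Written out with the values listed in item 3) above, the two products are as follows (a worked check of the arithmetic only, using the probabilities exactly as stated in this example):

```latex
\begin{align*}
\mathrm{Prob}(\text{lover} \mid \text{love})
  &= q(r_2)\, p(\text{lover})\, p(\text{love})
   = \tfrac{1}{4} \cdot \tfrac{1}{3} \cdot \tfrac{4}{9}
   = \tfrac{1}{27} \approx 0.0370, \\
\mathrm{Prob}(\text{blue} \mid 4)
  &= q(r_3)\, p(\text{blue})\, p(4)
   = \tfrac{1}{4} \cdot \tfrac{3}{9} \cdot \tfrac{1}{27}
   = \tfrac{1}{324} \approx 0.0031.
\end{align*}
```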
Example 4: comparison with the PCFG method;
two large-scale password sets, Rockyou and CSDN, are used. Passwords are randomly assigned to the training set and the test set in the ratios 3:1, 1:3 and 1:27 respectively. The training set is trained with the classical PCFG method and with the method provided by the invention respectively, and the cracking rate on the test set is tested, as shown in figs. 3 and 4.
The following characteristics can be seen from the above tests:
1) The cracking rates of both the PCFG method and the present invention depend on the size of the training set: the larger the training set, the higher the cracking rate. With the small training set (1:27) and 10^11 guesses, the cracking rates of the present invention and of the PCFG method are 85.59% and 51.68% respectively on the Rockyou password set, and 74.10% and 30.03% respectively on the CSDN password set, relative improvements of 65.6% and 146% respectively. The invention is a clear improvement over the classical PCFG method.
2) As the number of guesses increases, the cracking rate of the PCFG method does not increase significantly. With the small training set (1:27), when the number of guesses is raised from 10^11 to 10^14, the cracking rate of the PCFG method on Rockyou rises only from 51.68% to 52.97%, an increase of only 1.29 percentage points. Under the same conditions, the cracking rate of the present invention on Rockyou rises from 85.59% to 92.53%, an increase of 6.94 percentage points. This shows that, when the number of guesses increases, the cracking rate of the invention grows more markedly than that of the PCFG method.
3) At the same number of guesses, the present invention is insensitive to the size of the training set. For example, for the Rockyou password set with 10^11 guesses, going from the large 3:1 training set to the small 1:27 training set, the cracking rate of the PCFG method drops sharply from 70.79% to 51.68%, a decrease of 19.11 percentage points, while the cracking rate of the present invention drops from 87.70% to 85.59%, a decrease of only 2.11 percentage points.
The tests show that the invention, which expands the vocabulary in the training set, has a higher cracking rate than the existing PCFG method at the same number of guesses, and still keeps a relatively stable cracking rate when the training set is reduced.
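The relative improvements quoted in item 1) can be reproduced from the two pairs of cracking rates given there (85.59% against 51.68%, and 74.10% against 30.03%). The helper name `relative_gain` is hypothetical.

```python
def relative_gain(new_rate, old_rate):
    """Relative improvement of new_rate over old_rate, in percent."""
    return (new_rate - old_rate) / old_rate * 100


first_pair = relative_gain(85.59, 51.68)
second_pair = relative_gain(74.10, 30.03)
print(round(first_pair, 1), round(second_pair, 1))  # 65.6 146.8
```

The second value rounds to 146.8%, which the text reports as 146%.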

Claims (10)

1. A password guessing method for improving guessing efficiency of a small training set based on a corpus, characterized by comprising the following steps:
S1, constructing a corpus Γ comprising four types of corpus sets, and determining the structure of the password guessing rules; the constructed corpus Γ has the following features:
feature 1, the corpus Γ comprises |Γ| corpus sets, Γ = {C_i | 1 ≤ i ≤ |Γ|}, where C_i is the i-th corpus set;
feature 2, each corpus set contains words of the same type and the same length;
feature 3, the vocabulary types of the corpus sets comprise language, country-and-region, general and brute-force corpora;
feature 4, all words in a single non-brute-force corpus set have the same length, and this length is greater than or equal to 4;
feature 5, the word length of a brute-force corpus set is less than or equal to 3, and the brute-force corpus sets are divided into lowercase letters, uppercase letters, digits and special symbols;
feature 6, no two corpus sets in the corpus Γ contain the same word;
the number of words in the i-th corpus set C_i is defined as |C_i|, and its length is defined as l(C_i);
S2, based on the corpus Γ, generating a guessing rule r for each training password pwd in the password training set PWD_TRAIN, to obtain a password guessing rule set R composed of a plurality of password guessing rules;
S3, based on the corpus Γ and the password guessing rule set R, computing the probability of each word w in the corpus Γ, and computing the probability of each password guessing rule r in the password guessing rule set R;
S4, generating a dictionary D(S) with S guesses, and using the dictionary D(S) to perform password guessing.
2. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 1, characterized in that
a password guessing rule r is formed by concatenating a plurality of corpus sets and is written as r = [C_1]…[C_s], C_1, …, C_s ∈ Γ, where s is the number of segments of the password guessing rule r, denoted d(r);
$\prod_{1 \le i \le s} |C_i|$ is called the corpus space size of the password guessing rule r, denoted S(r);
|R| mutually different password guessing rules r form the password guessing rule set R.
3. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 1, characterized in that, in step S2, the password training set PWD_TRAIN comprises a plurality of training passwords pwd, and a password guessing rule r is generated for each specific training password pwd based on the corpus Γ.
4. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 1, characterized in that, in step S2, a directed acyclic graph G = <V, E> of a single training password pwd is constructed based on the corpus Γ, wherein each edge in the directed acyclic graph G is the corpus set in the corpus Γ to which the character substring from the start point to the end point of the edge belongs;
all paths from the start point to the end point of the directed acyclic graph G are generated, each path corresponding to one word segmentation of the training password pwd, and each word segmentation corresponding to one guessing rule;
from all possible guessing rules, the guessing rule with the smallest number of segments is selected as the password guessing rule r of the training password pwd; if several guessing rules have the smallest number of segments, the one among them with the smallest corpus space size is selected as the password guessing rule r;
finally, the password guessing rule set R composed of a plurality of password guessing rules r is obtained.
5. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 1, characterized in that, in step S3, the probability of each password guessing rule r in the password guessing rule set R is denoted q(r, PWD_TRAIN), r ∈ R;
the probabilities of the password guessing rules r in the password guessing rule set R have the following characteristics:
1) each password guessing rule r in the password guessing rule set R is generated by executing step S2 on a training password pwd in the training password set PWD_TRAIN;
2) the sum of the probabilities of all password guessing rules r in the password guessing rule set R equals 1;
3) the probability of each password guessing rule r in the password guessing rule set R is proportional to its frequency of occurrence in the training password set PWD_TRAIN.
6. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 1, characterized in that, in step S3, the probability of each word w in the corpus Γ is denoted p(w, PWD_TRAIN), w ∈ Γ;
the probabilities of the words in the corpus Γ have the following characteristics:
1) the frequency of each word of the corpus Γ in the training password set is counted, and then 1 is added to the frequency of every word in each corpus set C, so that the frequency of a word that does not appear in the training password set is not 0;
2) the probability of each word in a corpus set C equals the frequency of the word obtained in step 1) divided by the sum of the frequencies of all words in that corpus set;
3) the sum of the probabilities of the words in each corpus set C equals 1;
4) if a particular word in corpus set C does not appear in the training password set, its probability is inversely proportional to the total number of words in corpus set C;
5) if a particular word in corpus set C occurs in the training password set, its probability is proportional to the frequency of the word in the training password set and inversely proportional to the total number of words in corpus set C.
7. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 1, characterized in that, in step S4, for a password guessing rule r = [C_1]…[C_s] in the password guessing rule set R and s words w_1, …, w_s that satisfy w_1 ∈ C_1, w_2 ∈ C_2, …, w_s ∈ C_s, with C_1, …, C_s ∈ Γ, w_1|…|w_s is called a legal vocabulary combination based on the corpus Γ and the password guessing rule set R, where '|' is the string concatenation operation;
the probability Prob(w_1|…|w_s) that the legal vocabulary combination w_1|…|w_s becomes a password is defined as:
$\mathrm{Prob}(w_1|\dots|w_s) = \prod_{1 \le i \le s} p(w_i, \mathit{PWD\_TRAIN}) \times q(r, \mathit{PWD\_TRAIN})$.
8. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 7, characterized in that, given a number of guesses S, if an ordered sequence D(S) = <cp_1, cp_2, …, cp_S> of S legal vocabulary combinations based on the corpus Γ and the password guessing rule set R satisfies the following conditions:
condition 1, Prob(cp_j) ≥ Prob(cp_{j+1}), 1 ≤ j ≤ S-1;
condition 2, the probability of the last legal vocabulary combination cp_S in the ordered sequence D(S) is greater than the probability that any legal vocabulary combination not appearing in D(S) becomes a password;
then D(S) is called the ordered dictionary with S guesses, and the probability Prob(cp_S) that the S-th legal vocabulary combination in D(S) becomes a password is denoted α(S).
9. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 8, characterized in that a character string str may be described by several legal vocabulary combinations, each of which has a different probability of becoming a password;
the probability Prob(str) that the character string str becomes a password is defined as the maximum of the probabilities of all the legal vocabulary combinations corresponding to the character string; if a character string cannot be described by any legal vocabulary combination, its probability of becoming a password is 0.
10. The corpus-based password guessing method for improving guessing efficiency of a small training set according to claim 9, characterized in that the ordered dictionary D(S) containing S legal vocabulary combinations has the following properties:
property 1, if the probability Prob(cp) that a legal vocabulary combination cp becomes a password is greater than α(S), then the legal vocabulary combination cp must belong to D(S);
property 2, if the probability Prob(str) that a character string str becomes a password is greater than α(S), then the character string str must belong to D(S).
CN202111478071.9A 2021-12-06 2021-12-06 Password guessing method for improving guessing efficiency of small training set based on corpus Active CN113886784B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111478071.9A CN113886784B (en) 2021-12-06 2021-12-06 Password guessing method for improving guessing efficiency of small training set based on corpus

Publications (2)

Publication Number Publication Date
CN113886784A true CN113886784A (en) 2022-01-04
CN113886784B CN113886784B (en) 2022-04-22

Family

ID=79015643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111478071.9A Active CN113886784B (en) 2021-12-06 2021-12-06 Password guessing method for improving guessing efficiency of small training set based on corpus

Country Status (1)

Country Link
CN (1) CN113886784B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107947921A (en) * 2017-11-22 2018-04-20 上海交通大学 Based on recurrent neural network and the password of probability context-free grammar generation system
CN109829289A (en) * 2019-01-09 2019-05-31 中国电子科技集团公司电子科学研究院 Password guess method
CN110555140A (en) * 2019-08-29 2019-12-10 华南理工大学 Description, generation and detection method of corpus product rule oriented to password guess
US20200074073A1 (en) * 2018-08-31 2020-03-05 Briland Hitaj System and process for generating passwords or password guesses
CN111191008A (en) * 2019-12-31 2020-05-22 华东师范大学 Password guessing method based on numerical factor reverse order
CN111241534A (en) * 2020-01-13 2020-06-05 西安电子科技大学 Password guess set generation system and method
CN112149388A (en) * 2020-09-25 2020-12-29 华南理工大学 Method for identifying vocabulary deformation in password and generating guessing rule
CN112861113A (en) * 2021-01-08 2021-05-28 复旦大学 Password guessing method of parameterized hybrid model

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
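BILL GLODEK et al.: "Password Cracking Using Probabilistic Context-Free Grammars", 30th IEEE Symposium on Security and Privacy *
SHIVA HOUSHMAND et al.: "Next Gen PCFG Password Cracking", IEEE Transactions on Information Forensics and Security *
Wang Ping et al.: "Research Progress on Password Security", Journal of Computer Research and Development *
Wang Cong: "Research and Application of Integrated Optimization and Adaptive Optimization of Password Guessing Methods", China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology *
Yan Ruirong et al.: "Discovery and Analysis of Hot Words in Passwords", Journal of Cryptologic Research *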

Also Published As

Publication number Publication date
CN113886784B (en) 2022-04-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant