CN111241534A - Password guess set generation system and method - Google Patents
- Publication number
- CN111241534A (application CN202010033647.XA)
- Authority
- CN
- China
- Prior art keywords
- probability
- password
- neural network
- long
- term memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/45—Structures or tools for the administration of authentication
- G06F21/46—Structures or tools for the administration of authentication by designing passwords or checking the strength of passwords
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The invention belongs to the technical field of information security and discloses a password guess set generation system and method. The method generates a probabilistic context-free grammar (PCFG) from personal information and a password database; classifies the character strings in the grammar, according to a classification rule, as suitable or unsuitable for training a long short-term memory (LSTM) neural network; trains an LSTM neural network model to convergence; uses the converged model to generate password segments and their corresponding probabilities; maps the probability of each password segment to a new probability and sorts the segments in descending order of the new probability; and generates passwords sorted in descending order of probability. The invention compensates for the LSTM neural network's inability to identify the compositional structure and semantic information of a password and for its poor interpretability; it compensates for the poor generalization ability of the probabilistic context-free grammar; and it addresses the low probabilities that the LSTM neural network assigns to the password segments it generates.
Description
Technical Field
The invention belongs to the technical field of information security, and particularly relates to a password guess set generation system and method.
Background
Currently, the closest prior art is as follows. With the continuous development of networks and portable electronic devices, daily life has become closely tied to the network, and identity authentication has gradually become a basic means of protecting user information. Because passwords are simple and easy to deploy, they are widely used for identity authentication and have become one of the most important means of protecting user information on the Internet. The security and usability of passwords have therefore long been a significant concern.
Password guessing algorithms are an important tool for studying password security. Their basic idea is to generate a set of password guesses with which to guess a user's password. The most primitive approach enumerates all passwords that satisfy the constraints, but because the password search space is extremely large, this is barely feasible and extremely inefficient. Later approaches selected frequently used words or numbers from a dictionary and combined or slightly modified them to form the guess set. These approaches are relatively simple, do not accurately capture how people construct passwords, and assign no probabilities to the guesses, so guessing is inefficient. Probabilistic models such as the probabilistic context-free grammar (PCFG) and the Markov model were then used to generate guess sets. The Markov model, however, only computes the probability of a character given the preceding characters and cannot identify the compositional structure or semantic information of a password. The PCFG can identify the structure and part of the semantic information of a password, but it cannot generalize: it can only combine password segments that appear in the training set. More recently, guess sets have been generated with long short-term memory (LSTM) neural networks or generative adversarial networks. These exploit the ability of neural networks to fit high-dimensional spaces, achieve better guessing results, and generalize better, but, like the Markov model, they cannot identify the compositional structure or semantic information of a password.
In summary, the problem with the prior art is that existing password guess set generation techniques cannot take the compositional structure of a password, its semantic information, and generalization ability into account at the same time, so the generated guess sets do not accurately describe how people construct passwords.
The difficulty of solving this technical problem is as follows:
existing password guess set generation techniques use different probability models, so the probabilities of the generated passwords follow different scales. For the same password, different techniques give different probabilities, sometimes differing by several orders of magnitude. Passwords generated by different models therefore cannot simply be merged and sorted in descending order of the probabilities that the respective models assign.
The significance of solving this technical problem is as follows:
it furthers the understanding of how people construct passwords and thereby reduces the use of weak passwords; it furthers the understanding of the capabilities of password-guessing attackers, so that countermeasures can be devised; it deepens the understanding of password security and usability and helps system administrators prevent the use of weak passwords; and it speeds up password guessing, providing better technical support for digital forensics.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a password guess set generation system and method.
The invention is realized as a password guess set generation method comprising the following steps:
firstly, generating a probabilistic context-free grammar based on personal information and a password database; classifying the character strings in the probabilistic context-free grammar, according to a classification rule, as suitable or unsuitable for training a long short-term memory (LSTM) neural network;
secondly, training an LSTM neural network model to convergence; generating password segments and their corresponding probabilities with the converged LSTM neural network model;
thirdly, mapping the probability of each password segment to a new probability and sorting the password segments in descending order of the new probability; generating passwords sorted in descending order of probability.
Further, in the first step, according to the personal information patterns, a longest prefix matching algorithm is used to match the personal information against the passwords in the password database and to divide and mark the passwords by personal information pattern; the parts that do not match any personal information are marked L, D, or S according to whether they consist of letters, digits, or special symbols, with a subscript indicating their length; the probability of each character string in each pattern is then calculated; finally, the basic terminals and the character strings of each pattern are sorted in descending order of probability, generating the probabilistic context-free grammar based on personal information and the password database.
Further, the personal information mode includes;
the matching, dividing and marking of patterns means that a longest prefix matching algorithm is used for matching, the password is then divided into segments, and each segment is marked according to the pattern it matched;
the probability of a string in a personal information pattern is 1, because the personal information is fixed;
for the L, D, and S patterns, the probability of each character string in each pattern is calculated as:
$$p = \frac{n}{m}$$
where n is the number of occurrences of the character string and m is the number of occurrences of the pattern.
Further, in the first step, the character strings in the generated probabilistic context-free grammar are classified according to the following classification rules: if the pattern of a character string is a personal information pattern, the string is unsuitable for training the long short-term memory (LSTM) neural network; if the pattern is L, D, or S and the length is less than or equal to 4, the string is unsuitable for training the LSTM neural network; if the pattern is L, D, or S and the length is greater than 4, the string is suitable for training the LSTM neural network.
Further, in the second step, for each character string suitable for training the LSTM neural network, the characters of the string are fed into the input layer of the LSTM neural network one time step at a time, and the output of the network is a prediction of the character at the next time step;
the loss function used to train the LSTM neural network is the cross-entropy loss;
the optimization algorithm used to train the LSTM neural network is Adam or another adaptive learning rate method;
convergence means that the parameters of the LSTM neural network are such that the value of the loss function has converged to a stable value.
Further, in the third step, the probability of each generated password segment whose probability is greater than a threshold is mapped to a new probability according to a mapping function, and the password segments are sorted in descending order of the new probability, wherein the mapping function is:
$$p_{\text{new}} = \frac{p_{\text{old}}}{\sum_i p_i}$$
where $p_{\text{old}}$ is the probability of a generated password segment whose probability is greater than $p_{\text{threshold}}$, $\sum_i p_i$ is the sum of the probabilities of all generated password segments whose probability is greater than $p_{\text{threshold}}$, and $p_{\text{threshold}}$ may be chosen freely according to the desired size of the password guess set.
Further, passwords and their probabilities are generated from the basic terminals and their probabilities in the generated probabilistic context-free grammar, the character strings classified as unsuitable for training the long short-term memory (LSTM) neural network and their probabilities, and the mapped password segments and their new probabilities, and a next function is used to output the passwords in descending order of probability. A password is generated by replacing, one by one, each personal information pattern in a basic terminal with the guessing target's personal information for that pattern, each L, D, or S pattern with a subscript less than or equal to 4 with a corresponding character string from the probabilistic context-free grammar, and each L, D, or S pattern with a subscript greater than 4 with a password segment mapped by the probability mapping module. The probability of a password is the probability of the basic terminal multiplied by the probability of each substituted character string unsuitable for training the LSTM neural network and by the new probability of each substituted password segment mapped by the probability mapping module.
Another object of the present invention is to provide a password guess set generation system implementing the password guess set generation method, the system comprising:
a grammar generation module, used for generating a probabilistic context-free grammar based on personal information and a password database;
a character string classification module, used for classifying the character strings in the probabilistic context-free grammar generated by the grammar generation module, according to classification rules, into character strings suitable for training a long short-term memory (LSTM) neural network and character strings unsuitable for training it;
a model training module, used for training the LSTM neural network on the character strings that the character string classification module classified as suitable, until a converged LSTM neural network model is obtained;
a password segment generation module, used for generating password segments and their corresponding probabilities with the converged LSTM neural network model trained by the model training module;
a probability mapping module, used for mapping the probability of each password segment to a new probability according to a mapping function and sorting the password segments in descending order of the new probability;
and a password generation module, used for generating passwords sorted in descending order of probability from the basic terminals and their probabilities in the probabilistic context-free grammar generated by the grammar generation module, the character strings that the character string classification module classified as unsuitable for training the LSTM neural network and their probabilities, and the password segments and new probabilities produced by the probability mapping module.
In summary, the advantages and positive effects of the invention are as follows. The invention uses a probabilistic context-free grammar to identify the compositional structure and semantic information of a password, and uses a long short-term memory (LSTM) neural network to model some of the password segments so that segments that do not appear in the training set can be generated. Compared with the prior art, the probabilistic context-free grammar compensates for the LSTM neural network's inability to identify the compositional structure and semantic information of a password and for its poor interpretability; the LSTM neural network compensates for the poor generalization ability of the probabilistic context-free grammar; and the probability mapping addresses the low probabilities that the LSTM neural network assigns to generated password segments, improving the quality of the password guess set.
Drawings
FIG. 1 is a schematic structural diagram of a password guess set generating system according to an embodiment of the present invention;
in the figure: 1. grammar generation module; 2. character string classification module; 3. model training module; 4. password segment generation module; 5. probability mapping module; 6. password generation module.
Fig. 2 is a flowchart of a password guess set generating method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a grammar generation module provided in an embodiment of the present invention.
Fig. 4 is a schematic diagram of a password generation module according to an embodiment of the present invention.
FIG. 5 is a diagram comparing the guessing results of the prior art and of the present invention, provided by an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In view of the problems in the prior art, the present invention provides a password guess set generation system and method, which are described in detail below with reference to the accompanying drawings.
As shown in FIG. 1, the input of the password guess set generation system provided by an embodiment of the present invention is users' personal information and the corresponding passwords, where the personal information includes the user name, email address, name, date of birth, identity card number, mobile phone number, and the like. The system specifically comprises:
the grammar generation module 1, used for generating a probabilistic context-free grammar based on personal information and a password database;
the character string classification module 2, used for classifying the character strings in the probabilistic context-free grammar generated by the grammar generation module 1, according to classification rules, into character strings suitable for training a long short-term memory (LSTM) neural network and character strings unsuitable for training it;
the model training module 3, used for training the LSTM neural network on the character strings that the character string classification module 2 classified as suitable, until a converged LSTM neural network model is obtained;
the password segment generation module 4, used for generating password segments and their corresponding probabilities with the converged LSTM neural network model trained by the model training module 3;
the probability mapping module 5, used for mapping the probability of each password segment to a new probability according to a mapping function and sorting the password segments in descending order of the new probability;
and the password generation module 6, used for generating passwords sorted in descending order of probability from the basic terminals and their probabilities in the probabilistic context-free grammar generated by the grammar generation module 1, the character strings that the character string classification module 2 classified as unsuitable for training the LSTM neural network and their probabilities, and the password segments and new probabilities produced by the probability mapping module 5.
As shown in fig. 2, the password guess set generating method provided by the embodiment of the present invention includes the following steps:
S201: the grammar generation module generates a probabilistic context-free grammar based on personal information and a password database; the character string classification module classifies the character strings in the probabilistic context-free grammar, according to classification rules, as suitable or unsuitable for training a long short-term memory (LSTM) neural network.
S202: the model training module trains an LSTM neural network model to convergence; the password segment generation module generates password segments and their corresponding probabilities with the converged LSTM neural network model.
S203: the probability mapping module maps the probability of each password segment to a new probability and sorts the password segments in descending order of the new probability; the password generation module generates passwords sorted in descending order of probability.
The technical solution of the present invention is further described below with reference to the accompanying drawings.
The password guess set generation method provided by the embodiment of the invention is based on a probabilistic context-free grammar and a long short-term memory (LSTM) neural network, and comprises the following steps:
(1) As shown in FIG. 3, the grammar generation module 1 uses a longest prefix matching algorithm to match the personal information against the passwords in the password database according to the personal information patterns, and divides and marks the passwords accordingly. The parts that do not match any personal information are marked L, D, or S according to whether they consist of letters, digits, or special symbols, with a subscript indicating their length. The module then calculates the probability of each character string in each pattern, and finally sorts the basic terminals and the character strings of each pattern in descending order of probability to generate the probabilistic context-free grammar based on personal information and the password database.
The personal information patterns include all patterns shown in Table 1.
Matching, dividing and marking of patterns means that the longest prefix matching algorithm is used for matching, the password is then divided into segments, and each segment is marked according to the pattern it matched. For example, the 2nd password in FIG. 3, "ls19850102", can be matched as "A₁B₄" and can also be matched as "N₁₅B₁"; but because "ls1985" is longer than "ls", "ls1985" is matched to "A₁" rather than "ls" to "N₁₅".
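The segmentation just described can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the tags A1, N15, B1 and B4 are taken from the example above, and the values assigned to them are invented here because Table 1 is not reproduced in this text.

```python
def char_class(ch: str) -> str:
    """Return L for letters, D for digits, S for anything else."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def segment(password: str, info: dict[str, str]) -> list[tuple[str, str]]:
    """Split a password into (pattern, text) segments.

    `info` maps a personal-information pattern tag to the target's value
    for that pattern (hypothetical tags and values, for illustration only).
    """
    raw = []  # entries are [tag_or_class, text, is_personal]
    i = 0
    while i < len(password):
        # Pick the longest personal-information value starting at position i.
        best_tag, best_val = None, ""
        for tag, val in info.items():
            if password.startswith(val, i) and len(val) > len(best_val):
                best_tag, best_val = tag, val
        if best_tag is not None:
            raw.append([best_tag, best_val, True])
            i += len(best_val)
            continue
        # No personal information matches here: extend or start an L/D/S run.
        cls = char_class(password[i])
        if raw and not raw[-1][2] and raw[-1][0] == cls:
            raw[-1][1] += password[i]
        else:
            raw.append([cls, password[i], False])
        i += 1
    # Unmatched runs get their length appended (e.g. "D4"); personal tags stay as-is.
    return [(tag if personal else f"{tag}{len(text)}", text)
            for tag, text, personal in raw]


# Assumed values for one target; only the tag names come from the text.
info = {"A1": "ls1985", "N15": "ls", "B1": "19850102", "B4": "0102"}
print(segment("ls19850102", info))   # [('A1', 'ls1985'), ('B4', '0102')]
```

Because "ls1985" is the longest value that prefixes the password, it wins over "ls", which reproduces the A₁B₄ outcome of the example.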
The probability of a string in a personal information pattern is 1, because for a given person the personal information in that pattern is fixed and cannot be any other character string.
For the L, D, and S patterns, the probability of each character string in each pattern is calculated as:
$$p = \frac{n}{m}$$
where n is the number of occurrences of the character string and m is the number of occurrences of the pattern.
Any sorting algorithm may be used for the descending sort.
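As a small illustration of the counting and sorting described above, the sketch below computes, for each pattern, the relative frequency n/m of every string and sorts the strings in descending order of probability. The function name and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def string_probabilities(observations):
    """Compute per-pattern string probabilities.

    observations: iterable of (pattern, string) pairs such as ("D4", "1234"),
    one pair per segment found in the training passwords.
    Returns {pattern: [(string, probability), ...]} sorted by descending probability.
    """
    pattern_counts = Counter()            # m: occurrences of each pattern
    string_counts = defaultdict(Counter)  # n: occurrences of each string within a pattern
    for pattern, s in observations:
        pattern_counts[pattern] += 1
        string_counts[pattern][s] += 1

    table = {}
    for pattern, m in pattern_counts.items():
        ranked = [(s, n / m) for s, n in string_counts[pattern].items()]
        # Descending probability; ties are broken alphabetically for determinism.
        ranked.sort(key=lambda item: (-item[1], item[0]))
        table[pattern] = ranked
    return table


table = string_probabilities(
    [("D4", "1234"), ("D4", "1234"), ("D4", "2020"), ("L3", "abc")]
)
print(table["D4"][0][0])  # 1234
```

Here "1234" occurs twice among three D4 segments, so its probability is 2/3 and it ranks first.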
(2) The character string classification module 2 classifies the character strings in the probabilistic context-free grammar generated in step (1) according to the following classification rules: if the pattern of a character string is a personal information pattern, the string is unsuitable for training the long short-term memory (LSTM) neural network; if the pattern is L, D, or S and the length is less than or equal to 4, the string is unsuitable for training the LSTM neural network; if the pattern is L, D, or S and the length is greater than 4, the string is suitable for training the LSTM neural network.
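The classification rule can be written as a single predicate. In this sketch the function name and the `"personal"` label for personal information patterns are assumptions made for illustration; the length cutoff of 4 comes from the rule above.

```python
def is_lstm_trainable(kind: str, length: int) -> bool:
    """Apply the classification rule to one character string.

    kind: "personal" for a personal information pattern (hypothetical label),
          or "L", "D", "S" for letter, digit, and special-symbol patterns.
    length: the subscript of the pattern, i.e. the length of the string.
    """
    if kind == "personal":
        return False          # personal information is never used for training
    if kind not in ("L", "D", "S"):
        raise ValueError(f"unknown pattern kind: {kind!r}")
    return length > 4         # only L/D/S strings longer than 4 are suitable


print(is_lstm_trainable("D", 4))  # False
print(is_lstm_trainable("L", 6))  # True
```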
(3) The model training module 3 trains a long short-term memory (LSTM) neural network model: the LSTM neural network is trained on the character strings classified in step (2) as suitable for training, until a converged LSTM neural network model is obtained.
For each character string suitable for training the LSTM neural network, the characters of the string are fed into the input layer of the network one time step at a time, and the output of the network is a prediction of the character at the next time step.
The loss function used to train the LSTM neural network is the cross-entropy loss.
The optimization algorithm used to train the LSTM neural network is Adam or another adaptive learning rate method.
Convergence means that the parameters of the LSTM neural network are such that the value of the loss function has converged to a stable value.
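To make the training objective concrete, the sketch below builds next-character training examples from one string and evaluates the cross-entropy loss for a given predicted distribution. It is an illustration under stated assumptions: the LSTM itself and the Adam update are omitted and replaced by a placeholder uniform prediction, and the end-of-string marker and function names are hypothetical.

```python
import math

END = "\n"  # hypothetical end-of-string marker, so the model can learn where to stop


def next_char_examples(s: str):
    """Return (prefix, next_character) pairs for one training string.

    At each time step the model has seen `prefix` and must predict the
    character that follows it; the last target is the end marker.
    """
    seq = s + END
    return [(seq[:t], seq[t]) for t in range(len(seq))]


def cross_entropy(predicted: dict, target: str) -> float:
    """Cross-entropy loss for one prediction: -log of the probability of the target."""
    return -math.log(max(predicted.get(target, 0.0), 1e-12))


examples = next_char_examples("abc")
# Placeholder for the network output: a uniform distribution over four symbols.
uniform = {ch: 0.25 for ch in "abc" + END}
mean_loss = sum(cross_entropy(uniform, target) for _, target in examples) / len(examples)
print(examples)
print(mean_loss)
```

With a uniform prediction over four symbols the mean loss equals log 4; training would adjust the network parameters to push this value down until it stops changing.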
(4) The password segment generation module 4 generates password segments and their corresponding probabilities with the converged long short-term memory (LSTM) neural network model obtained in step (3).
(5) The probability mapping module 5 maps the probability of each password segment generated in step (4) whose probability is greater than a threshold to a new probability according to a mapping function, and sorts the password segments in descending order of the new probability, wherein the mapping function is:
$$p_{\text{new}} = \frac{p_{\text{old}}}{\sum_i p_i}$$
where $p_{\text{old}}$ is the probability of a password segment generated in step (4) whose probability is greater than $p_{\text{threshold}}$, $\sum_i p_i$ is the sum of the probabilities of all password segments generated in step (4) whose probability is greater than $p_{\text{threshold}}$, and $p_{\text{threshold}}$ may be chosen freely according to the desired size of the password guess set.
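The mapping step can be sketched as below, assuming the mapping function is the normalization p_new = p_old / Σ p_i suggested by the symbol definitions (the formula itself is not legible in this text, so this is an inference). The function name and the sample probabilities are hypothetical.

```python
def map_probabilities(segments: dict, threshold: float):
    """Keep segments whose probability exceeds the threshold and renormalize.

    segments: {password_segment: probability assigned by the trained model}.
    Returns [(segment, new_probability), ...] in descending order of new probability.
    Assumption: new probability = old probability / sum of all kept probabilities.
    """
    kept = {seg: p for seg, p in segments.items() if p > threshold}
    total = sum(kept.values())
    if total == 0:
        return []
    mapped = [(seg, p / total) for seg, p in kept.items()]
    # Descending new probability; ties are broken alphabetically for determinism.
    mapped.sort(key=lambda item: (-item[1], item[0]))
    return mapped


mapped = map_probabilities(
    {"qwerty": 0.002, "monkey": 0.001, "zzzzzz": 1e-9}, threshold=1e-6
)
print([seg for seg, _ in mapped])  # ['qwerty', 'monkey']
```

After mapping, the kept segments' probabilities sum to 1, which puts them on a scale comparable to the per-pattern string probabilities of the grammar, where each pattern's strings also sum to 1.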
(6) As shown in FIG. 4, the password generation module 6 generates passwords and their probabilities from the basic terminals and their probabilities (31 in FIG. 4) in the probabilistic context-free grammar generated in step (1), the character strings classified in step (2) as unsuitable for training the long short-term memory (LSTM) neural network and their probabilities (33 in FIG. 4), and the mapped password segments and new probabilities from step (5), and uses a next function to output the passwords in descending order of probability. A password is generated by replacing, one by one, each personal information pattern in a basic terminal with the guessing target's personal information for that pattern (34 in FIG. 4), each L, D, or S pattern with a subscript less than or equal to 4 with a corresponding character string from the probabilistic context-free grammar, and each L, D, or S pattern with a subscript greater than 4 with a password segment mapped by the probability mapping module. The probability of a password is the probability of the basic terminal multiplied by the probability of each substituted character string unsuitable for training the LSTM neural network and by the new probability of each substituted password segment mapped by the probability mapping module; for example, this is how the probability of the last password in block 35 of FIG. 4 is calculated.
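One way to realize descending-order generation is a priority queue over partially explored substitutions, in the spirit of the next function used in PCFG-based password guessing. The sketch below is an assumption about how such a function could look, not the patent's own code: basic terminals are treated as sequences of pattern tags, and all names and sample data are hypothetical.

```python
import heapq
from math import prod


def generate(base_structures, candidates, limit):
    """Yield up to `limit` (password, probability) pairs in descending probability.

    base_structures: [(tuple_of_pattern_tags, probability), ...]
    candidates: {tag: [(string, probability), ...]} sorted by descending probability.
        Personal information tags map to [(target_value, 1.0)]; short L/D/S tags map
        to grammar strings; long L/D/S tags map to the mapped password segments.
    """
    def prob_of(b, idx):
        structure, p_base = base_structures[b]
        return p_base * prod(candidates[t][i][1] for t, i in zip(structure, idx))

    heap = []
    for b, (structure, _) in enumerate(base_structures):
        if all(candidates.get(tag) for tag in structure):
            idx = (0,) * len(structure)  # most probable string for every tag
            heapq.heappush(heap, (-prob_of(b, idx), b, idx, 0))

    out = []
    while heap and len(out) < limit:
        neg_p, b, idx, pivot = heapq.heappop(heap)
        structure, _ = base_structures[b]
        out.append(("".join(candidates[t][i][0] for t, i in zip(structure, idx)), -neg_p))
        # Advance one position at or after the pivot; this visits every
        # combination exactly once and never raises the probability.
        for pos in range(pivot, len(structure)):
            if idx[pos] + 1 < len(candidates[structure[pos]]):
                child = idx[:pos] + (idx[pos] + 1,) + idx[pos + 1:]
                heapq.heappush(heap, (-prob_of(b, child), b, child, pos))
    return out


base_structures = [(("N15", "D4"), 0.6), (("L6",), 0.4)]
candidates = {
    "N15": [("ls", 1.0)],                        # personal information: probability 1
    "D4": [("1234", 0.5), ("2020", 0.3)],        # short strings from the grammar
    "L6": [("qwerty", 0.7), ("monkey", 0.3)],    # mapped segments with new probabilities
}
print([pw for pw, _ in generate(base_structures, candidates, limit=10)])
# ['ls1234', 'qwerty', 'ls2020', 'monkey']
```

The heap avoids materializing every combination up front, which matters when the guess set is only a small prefix of the full search space.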
TABLE 1
The invention is further described below in conjunction with specific experiments/data.
Take the personal information and password data set leaked from 12306 as an example. The data set contains 131653 records; 65825 records are randomly selected as the training set and the remaining 65828 records are used as the test set. Using the personal information and passwords in the training set, existing password guess set generation techniques and the system and method provided by the invention are each trained to produce corresponding password guess sets. The number of test-set passwords covered by each guess set is shown in FIG. 5.
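A coverage count of this kind could be computed as in the sketch below. The text does not spell out the counting procedure, so this assumes that each test account has its own ordered guess list (because personal information is substituted per target) and that an account counts as covered when its real password appears among the first k guesses. The function name and sample data are hypothetical.

```python
def covered_accounts(accounts, sizes):
    """Count test accounts whose password is among the first k guesses.

    accounts: [(true_password, ordered_guess_list), ...], one entry per test account.
    sizes: guess-set sizes k to evaluate.
    Returns {k: number_of_covered_accounts}.
    """
    return {
        k: sum(1 for password, guesses in accounts if password in guesses[:k])
        for k in sizes
    }


accounts = [
    ("ls19850102", ["ls1985", "ls19850102", "123456"]),
    ("qwerty", ["123456", "password", "qwerty"]),
    ("x9!k", ["123456", "password", "qwerty"]),
]
print(covered_accounts(accounts, sizes=[1, 2, 3]))  # {1: 0, 2: 1, 3: 2}
```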
As can be seen from FIG. 5, for the same guess set size, the password guess set generated by the system and method of the invention (Hybrid in FIG. 5) covers more passwords in the test set. This indicates that the guess set generated by the invention better matches how people construct passwords. It thereby helps people reduce the use of weak passwords; helps people better understand the capabilities of password-guessing attackers so that countermeasures can be devised; helps system administrators better understand password security and usability and prevent the use of weak passwords; and speeds up password guessing, providing better technical support for digital forensics.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (8)
1. A password guess set generating method, characterized in that the password guess set generating method comprises the steps of:
firstly, generating a probabilistic context-free grammar based on a personal information and password database; dividing the character strings in the probabilistic context-free grammar, according to a classification rule, into character strings suitable for training a long short-term memory (LSTM) neural network and character strings not suitable for training it;
secondly, training the long short-term memory neural network until a converged model is obtained; generating password segments and the probability corresponding to each password segment by using the converged long short-term memory neural network model;
thirdly, mapping the probability corresponding to each password segment to a new probability, and sorting the password segments in descending order of the new probability; generating passwords sorted in descending order of probability.
2. The password guess set generating method as claimed in claim 1, wherein in the first step, according to the personal information patterns, a longest prefix matching algorithm is used to match, divide and mark the passwords in the personal information and password database against the personal information patterns; the parts that do not match any personal information are marked L, D or S according to whether they consist of letters, digits or special symbols, with the length of each part given as a subscript; the probability of each character string in each pattern is then calculated; and finally the basic terminals and the character strings of the same pattern are sorted in descending order of probability to generate the probabilistic context-free grammar based on the personal information and password database.
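The matching, dividing and marking step can be sketched as below. This is a hedged illustration under stated assumptions: the tags `NAME` and `BIRTHDAY` are hypothetical stand-ins for personal information patterns, the function names are invented, and a run of letters, digits or symbols is cut short wherever a personal information match begins.

```python
def char_class(ch):
    """Map a character to L (letter), D (digit) or S (special symbol)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def longest_info_match(password, start, personal_info):
    """Return the (tag, value) whose value is the longest prefix at `start`."""
    best = None
    for tag, value in personal_info.items():
        if value and password.startswith(value, start):
            if best is None or len(value) > len(best[1]):
                best = (tag, value)
    return best


def segment(password, personal_info):
    """Split a password into (pattern, text) segments."""
    segments = []
    i = 0
    while i < len(password):
        match = longest_info_match(password, i, personal_info)
        if match is not None:
            segments.append(match)
            i += len(match[1])
            continue
        # No personal information here: consume a run of one character class,
        # stopping early if a personal information match starts inside it.
        cls = char_class(password[i])
        j = i + 1
        while (j < len(password) and char_class(password[j]) == cls
               and longest_info_match(password, j, personal_info) is None):
            j += 1
        segments.append((f"{cls}{j - i}", password[i:j]))
        i = j
    return segments


# Hypothetical target; the tags and values are illustrative only.
info = {"NAME": "zhangsan", "BIRTHDAY": "19900101"}
print(segment("zhangsan456$ainiya", info))
```

Joining the pattern tags of the segments (here `NAME D3 S1 L6`) gives the structure that the grammar counts.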
3. The password guess set generating method as in claim 2, wherein said personal information pattern includes;
the matching, dividing and marking of the patterns means that a longest prefix matching algorithm is used for matching, the password is then divided into segments, and each segment is marked according to the matched pattern;
the probability of a personal information pattern is 1, since the personal information is fixed;
for the patterns L, D and S, the probability of each character string in each pattern is calculated as:
$$p = \frac{n}{m}$$
where n is the number of occurrences of the character string and m is the number of occurrences of the pattern.
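Reading the definitions of n and m as a relative frequency n/m, the per-pattern probabilities can be sketched as follows. The function name and the toy observations are assumptions made for the example.

```python
from collections import Counter, defaultdict


def string_probabilities(segments):
    """Estimate p(string | pattern) = n / m from (pattern, text) pairs."""
    segments = list(segments)
    pattern_counts = Counter(pattern for pattern, _ in segments)  # m per pattern
    string_counts = Counter(segments)                             # n per string
    table = defaultdict(dict)
    for (pattern, text), n in string_counts.items():
        table[pattern][text] = n / pattern_counts[pattern]
    # Sort the strings of each pattern in descending order of probability.
    return {
        pattern: dict(sorted(strings.items(), key=lambda kv: kv[1], reverse=True))
        for pattern, strings in table.items()
    }


# Toy observations, invented for illustration only.
observed = [("D3", "456"), ("D3", "123"), ("D3", "123"), ("S1", "$")]
table = string_probabilities(observed)
print(table)
```

In this toy input the pattern D3 occurs three times and "123" twice, so "123" gets probability 2/3 and is listed before "456".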
4. The password guess set generating method of claim 1, wherein the first step classifies the character strings in the generated probabilistic context-free grammar according to the following classification rules: if the pattern of the character string is a personal information pattern, the character string is not suitable for training the long short-term memory neural network; if the pattern of the character string is an L, D or S pattern and its length is less than or equal to 4, the character string is not suitable for training the long short-term memory neural network; if the pattern of the character string is an L, D or S pattern and its length is greater than 4, the character string is suitable for training the long short-term memory neural network.
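The classification rule above is simple enough to state as a predicate. The sketch assumes patterns are written as a letter followed by a length (for example `L6`), and that any other tag (for example a hypothetical `NAME`) denotes a personal information pattern; neither notation is fixed by the text.

```python
import re

# Assumed notation: L, D or S followed by the segment length, e.g. "L6".
LDS_PATTERN = re.compile(r"^([LDS])(\d+)$")


def suitable_for_lstm(pattern, max_short_len=4):
    """Return True if strings of this pattern should be used to train the LSTM."""
    match = LDS_PATTERN.match(pattern)
    if match is None:
        return False  # personal information pattern: never used for training
    return int(match.group(2)) > max_short_len


for pattern in ["NAME", "D4", "L6", "S5"]:
    print(pattern, suitable_for_lstm(pattern))
```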
5. The password guess set generating method as recited in claim 1, wherein in the second step the characters at successive time steps of each character string suitable for training the long short-term memory neural network are fed in sequence into the input layer of the long short-term memory neural network, and the output of the long short-term memory neural network is a prediction of the character at the next time step in the character string;
the loss function used in training the long short-term memory neural network is the cross-entropy loss function;
the optimization algorithm used in training the long short-term memory neural network is Adam or another adaptive learning rate method;
convergence means that the parameters of the long short-term memory neural network are such that the value of the loss function mathematically converges to a certain value.
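The training setup in this claim amounts to next-character prediction scored with cross-entropy. The sketch below shows only how the training pairs and the per-step loss could be formed; it does not implement the LSTM or the Adam optimizer. The end-of-string marker `END` and the empty starting prefix are assumptions, not details given in the text.

```python
import math

END = "\n"  # hypothetical end-of-string marker, not named in the source


def next_char_examples(string):
    """Return (prefix, next_char) pairs: the model sees prefix, predicts next_char."""
    padded = string + END
    return [(padded[:t], padded[t]) for t in range(len(padded))]


def cross_entropy(predicted, target):
    """Cross-entropy for one step: -log of the probability given to the true character."""
    return -math.log(predicted[target])


examples = next_char_examples("ainiya")
for prefix, nxt in examples:
    print(repr(prefix), "->", repr(nxt))

# A made-up predicted distribution over the next character.
loss = cross_entropy({"a": 0.5, "b": 0.5}, "a")
print(loss)
```

In a real model the predicted distribution would come from a softmax over the network output, and the per-step losses would be averaged over all pairs before each optimizer update.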
6. The password guess set generating method of claim 1, wherein in the third step the probabilities of the password segments whose generated probability is greater than a certain threshold are mapped to new probabilities by a mapping function, and the password segments are sorted in descending order of the new probabilities, the mapping function being:
$$p_{\text{new}} = \frac{p_{\text{old}}}{\sum_i p_i}$$
wherein $p_{\text{old}}$ is the probability of a generated password segment whose probability is greater than $p_{\text{threshold}}$, $\sum_i p_i$ is the sum of the probabilities of all generated password segments whose probability is greater than $p_{\text{threshold}}$, and $p_{\text{threshold}}$ may be chosen freely according to the desired size of the password guess set.
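Assuming the mapping renormalizes the retained probabilities, that is, each new probability is the old probability divided by the sum of the probabilities above the threshold, the step can be sketched as follows. The function name and the toy segment probabilities are invented for illustration.

```python
def map_probabilities(segment_probs, threshold):
    """Keep segments with probability > threshold and renormalize them to sum to 1."""
    kept = {seg: p for seg, p in segment_probs.items() if p > threshold}
    total = sum(kept.values())
    if total == 0:
        return []
    mapped = [(seg, p / total) for seg, p in kept.items()]
    # Sort in descending order of the new probability.
    mapped.sort(key=lambda item: item[1], reverse=True)
    return mapped


# Toy LSTM outputs for one pattern, invented for illustration only.
raw = {"ainiya": 0.04, "woaini": 0.06, "zzzzzz": 0.001}
mapped = map_probabilities(raw, threshold=0.01)
print(mapped)
```

A lower threshold keeps more segments and therefore produces a larger guess set, which matches the statement that the threshold is chosen according to the desired guess set size.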
7. The password guess set generating method of claim 1, wherein in the third step the passwords are generated in descending order of probability using: the basic terminals and their probabilities in the generated probabilistic context-free grammar; the character strings classified as not suitable for training the long short-term memory neural network and their probabilities; and the mapped password segments and their new probabilities. The passwords sorted in descending order of probability are generated using a next function. A password is generated by replacing each personal information pattern in a basic terminal with the corresponding personal information of the guessing target; replacing each L, D or S pattern whose subscript is less than or equal to 4 with a corresponding character string in the probabilistic context-free grammar; and replacing each L, D or S pattern whose subscript is greater than 4 with a password segment mapped by the probability mapping module. The probability of a password is calculated by multiplying the probability of the basic terminal by the probabilities of the corresponding character strings not suitable for training the long short-term memory neural network and by the new probabilities of the password segments mapped by the probability mapping module; for example, the final password probability is calculated as $p(\text{“456\$ainiya”}) = p(D_3S_1L_6) \cdot p(D_3 \to \text{“456”}) \cdot p(S_1 \to \text{“\$”}) \cdot p(L_6 \to \text{“ainiya”}) = 0.17 \times 0.33 \times 0.33 \times 0.08 = 0.00148104$.
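The substitution and probability product in this claim can be sketched as below. This is a brute-force stand-in that enumerates every combination and sorts it; it is not the next function the claim refers to, which would produce guesses in order without full enumeration. The "basic terminal" is read here as a structure such as D3 S1 L6. The values 0.17, 0.33, 0.33 and 0.08 come from the worked example in the text; the `NAME` tag, the second structure and the remaining probabilities are invented for illustration.

```python
import itertools
import math


def generate_guesses(base_structures, terminals, target_info):
    """Expand each structure into passwords and sort by descending probability."""
    guesses = []
    for structure, p_structure in base_structures.items():
        options = []
        for tag in structure:
            if tag in target_info:
                # Personal information pattern: fixed value, probability 1.
                options.append([(target_info[tag], 1.0)])
            else:
                # L/D/S pattern: grammar strings or mapped LSTM segments.
                options.append(list(terminals[tag].items()))
        for combo in itertools.product(*options):
            text = "".join(part for part, _ in combo)
            prob = p_structure * math.prod(p for _, p in combo)
            guesses.append((text, prob))
    guesses.sort(key=lambda g: g[1], reverse=True)
    return guesses


base_structures = {("D3", "S1", "L6"): 0.17, ("NAME", "D3"): 0.25}
terminals = {
    "D3": {"456": 0.33, "123": 0.67},
    "S1": {"$": 0.33},
    "L6": {"ainiya": 0.08},  # would come from the probability mapping step
}
guesses = generate_guesses(base_structures, terminals, {"NAME": "zhangsan"})
for text, prob in guesses:
    print(text, prob)
```

For large grammars full enumeration is impractical, which is presumably why the claim relies on a next function; a priority queue over partially expanded structures is one common way to realize that idea.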
8. A password guess set generating system for implementing the password guess set generating method as recited in any one of claims 1 to 7, said password guess set generating system comprising:
a grammar generation module, used for generating a probabilistic context-free grammar based on a personal information and password database;
a character string classification module, used for classifying the character strings in the probabilistic context-free grammar generated by the grammar generation module, according to classification rules, into character strings suitable for training the long short-term memory neural network and character strings not suitable for training the long short-term memory neural network;
a model training module, used for training the long short-term memory neural network on the character strings that the character string classification module has classified as suitable for training, to obtain a converged long short-term memory neural network model;
a password segment generation module, used for generating password segments and the probability corresponding to each password segment by using the converged long short-term memory neural network model trained by the model training module;
a probability mapping module, used for mapping the probability corresponding to each password segment to a new probability according to a mapping function and sorting the password segments in descending order of the new probability;
and a password generation module, used for generating passwords sorted in descending order of probability by using the basic terminals and probabilities in the probabilistic context-free grammar generated by the grammar generation module, the character strings and probabilities that the character string classification module has classified as not suitable for training the long short-term memory neural network, and the password segments and new probabilities mapped by the probability mapping module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010033647.XA CN111241534A (en) | 2020-01-13 | 2020-01-13 | Password guess set generation system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111241534A (en) | 2020-06-05 |
Family
ID=70877689
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010033647.XA Pending CN111241534A (en) | 2020-01-13 | 2020-01-13 | Password guess set generation system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111241534A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112257433A (en) * | 2020-12-23 | 2021-01-22 | 四川大学 | Password dictionary generation method and system based on Markov chain and neural network |
CN112613325A (en) * | 2021-01-04 | 2021-04-06 | 上海交通大学 | Password semantic structuralization realization method based on deep learning |
CN112861528A (en) * | 2021-01-19 | 2021-05-28 | 复旦大学 | Markov password recovery method based on password internal semantic driving |
CN113886784A (en) * | 2021-12-06 | 2022-01-04 | 华南理工大学 | Password guessing method for improving guessing efficiency of small training set based on corpus |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180075348A1 (en) * | 2016-09-09 | 2018-03-15 | Cylance Inc. | Machine learning model for analysis of instruction sequences |
CN107947921A (en) * | 2017-11-22 | 2018-04-20 | 上海交通大学 | Password generation system based on recurrent neural network and probabilistic context-free grammar |
CN108763920A (en) * | 2018-05-23 | 2018-11-06 | 四川大学 | Password strength assessment model based on ensemble learning |
CN110334488A (en) * | 2019-06-14 | 2019-10-15 | 北京大学 | User authentication password security evaluation method and device based on random forest model |
Non-Patent Citations (3)
Title |
---|
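Matt Weir et al.: "Password Cracking Using Probabilistic Context-Free Grammars", 《IEEE》 * |
宋创创 (Song Chuangchuang) et al.: "基于集成学习的口令强度评估模型" [Password strength assessment model based on ensemble learning], 《计算机应用》 [Journal of Computer Applications] * |
王星星 (Wang Xingxing): "基于个人信息的口令猜测技术研究与系统实现" [Research and system implementation of password guessing technology based on personal information], 《硕士电子期刊》 [Master's Theses Electronic Journal] * |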
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112257433A (en) * | 2020-12-23 | 2021-01-22 | 四川大学 | Password dictionary generation method and system based on Markov chain and neural network |
CN112257433B (en) * | 2020-12-23 | 2021-05-14 | 四川大学 | Password dictionary generation method and system based on Markov chain and neural network |
CN112613325A (en) * | 2021-01-04 | 2021-04-06 | 上海交通大学 | Password semantic structuralization realization method based on deep learning |
CN112861528A (en) * | 2021-01-19 | 2021-05-28 | 复旦大学 | Markov password recovery method based on password internal semantic driving |
CN113886784A (en) * | 2021-12-06 | 2022-01-04 | 华南理工大学 | Password guessing method for improving guessing efficiency of small training set based on corpus |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111241534A (en) | Password guess set generation system and method | |
CN109977416B (en) | Multi-level natural language anti-spam text method and system | |
CN110347835B (en) | Text clustering method, electronic device and storage medium | |
CN109887484A (en) | A kind of speech recognition based on paired-associate learning and phoneme synthesizing method and device | |
CN113722483B (en) | Topic classification method, device, equipment and storage medium | |
CN107908642B (en) | Industry text entity extraction method based on distributed platform | |
CN113297366B (en) | Emotion recognition model training method, device, equipment and medium for multi-round dialogue | |
CN111680161B (en) | Text processing method, equipment and computer readable storage medium | |
CN112052331A (en) | Method and terminal for processing text information | |
CN111709223B (en) | Sentence vector generation method and device based on bert and electronic equipment | |
CN113821587A (en) | Text relevance determination method, model training method, device and storage medium | |
Jami et al. | Biometric template protection through adversarial learning | |
CN115730597A (en) | Multi-level semantic intention recognition method and related equipment thereof | |
CN113220828B (en) | Method, device, computer equipment and storage medium for processing intention recognition model | |
Wang et al. | Sin: Semantic inference network for few-shot streaming label learning | |
CN111291078B (en) | Domain name matching detection method and device | |
CN115730237B (en) | Junk mail detection method, device, computer equipment and storage medium | |
CN116739067A (en) | Method, device, equipment and storage medium for learning few-sample model | |
CN115512693B (en) | Audio recognition method, acoustic model training method, device and storage medium | |
US11475069B2 (en) | Corpus processing method, apparatus and storage medium | |
CN112069392B (en) | Method and device for preventing and controlling network-related crime, computer equipment and storage medium | |
CN113918696A (en) | Question-answer matching method, device, equipment and medium based on K-means clustering algorithm | |
Du et al. | Combating word-level adversarial text with robust adversarial training | |
Zhang et al. | Filtering algorithm of spam short messages based on artificial immune system | |
CN110705275A (en) | Theme word extraction method and device, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
AD01 | Patent right deemed abandoned | Effective date of abandoning: 20230707 |