CN111191008A - Password guessing method based on numerical factor reverse order - Google Patents
- Publication number
- CN111191008A
- Authority
- CN
- China
- Prior art keywords
- password
- factor
- reverse order
- guessing
- method based
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/374—Thesaurus
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The invention discloses a password guessing method based on the reverse order of numerical factors, comprising a data cleaning stage, a password factor splitting stage and a password guessing stage. Password data set preprocessing: different password sets are collected, and their data is cleaned and encoded. Password factor splitting: each password is split into a number factor, a letter factor and a special character factor according to its digits, letters and special characters, and the frequency of the content contained in each factor is counted. Password guess set generation: the number of passwords to generate is set, and a password guess set is generated based on the occurrence frequency of each password factor obtained in the password factor splitting stage. The invention splits the password into factors and analyzes the number factor in reverse order. Experimental results show that passwords guessed in reverse order are more accurate than passwords guessed in forward order, and the method can be used to generate an offline password dictionary for offline password guessing.
Description
Technical Field
The invention belongs to the technical field of information security, and particularly relates to a password guessing method based on the reverse order of numerical factors.
Background
Although many user identity authentication modes have been developed, password authentication remains the mainstream mode of network user identity authentication because it is simple and easy to use, and it will remain so for the foreseeable future. As data leakage events continue to occur, research on password guessing is increasing. According to the environment in which it takes place, password guessing is generally divided into online password guessing and offline password guessing.
Online password guessing requires interaction with a network authentication server: access to the service system is obtained by submitting guessed user passwords to the server. To prevent brute-force guessing, a typical authentication server limits the number of authentication attempts for a user's password within a period of time, and it also sets verification codes to prevent a machine from authenticating successfully through constant attempts. Offline password guessing first obtains the encrypted password file, then hashes the passwords in an existing dictionary one by one and compares each result with the hash of the target password.
Compared with online password guessing, offline password guessing has a less restrictive computing environment. In recent years, offline password guessing has evolved from the original random, undirected algorithms based on simple transformations into systematic methods based on password frequency models, which has improved guessing efficiency, and more offline password guessing methods have been proposed as a result. However, traditional password guessing methods all analyze passwords from front to back, and none analyze them from back to front. For the number factors in passwords, analysis from back to front is more accurate than analysis from front to back.
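The offline guessing loop described in the background can be sketched as follows. This is a minimal illustration only: the choice of unsalted SHA-256, the function name `offline_guess` and the sample dictionary are assumptions and do not come from the patent text.

```python
import hashlib
from typing import List, Optional


def offline_guess(target_hash: str, dictionary: List[str]) -> Optional[str]:
    """Hash each dictionary entry and compare it with the target hash.

    Assumes the target was stored as an unsalted SHA-256 hex digest,
    which is a simplification chosen for this sketch.
    """
    for candidate in dictionary:
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        if digest == target_hash:
            return candidate
    return None


# Hypothetical target hash and guess dictionary.
leaked = hashlib.sha256(b"password123").hexdigest()
guesses = ["123456", "qwerty", "password123"]
found = offline_guess(leaked, guesses)
```

The order of the dictionary determines how many hash operations are needed before a hit, which is why the method focuses on ranking guesses by frequency.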
Disclosure of Invention
The invention aims to provide a password guessing method based on the numerical factor reverse order, which can increase the accuracy and efficiency of password guessing and guide the generation of an off-line password dictionary.
The specific technical scheme for realizing the purpose of the invention is as follows:
A password guessing method based on numerical factor reverse order comprises the following specific steps:
Step 1: password data set preprocessing
Collecting different password sets, and cleaning the data of the password sets to complete the data preprocessing of the password sets;
Step 2: password factor splitting
Splitting each password into a number factor, a letter factor and a special character factor according to its digits, letters and special characters, and counting the occurrence frequency of the content contained in each factor and the frequency of combinations among the factors;
Step 3: password guess set generation
Setting the number of passwords to generate, and generating a password guess set based on the occurrence frequency of each password factor and the combination frequency among the factors obtained in the password factor splitting step.
The password set data preprocessing comprises: eliminating invalid passwords and one-hot encoding the password data set.
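The preprocessing step can be illustrated with a short sketch. The patent does not define what makes a password invalid or which alphabet is encoded, so the length bounds, the printable-ASCII alphabet and the names `clean` and `one_hot` below are assumptions made for illustration.

```python
import string

# Assumed alphabet: digits, ASCII letters and ASCII punctuation (94 symbols).
ALPHABET = string.digits + string.ascii_letters + string.punctuation
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def clean(passwords, min_len=4, max_len=16):
    """Drop entries that are too short, too long, or contain characters
    outside the assumed alphabet (including inner whitespace)."""
    kept = []
    for pw in passwords:
        pw = pw.strip()
        if min_len <= len(pw) <= max_len and all(ch in CHAR_INDEX for ch in pw):
            kept.append(pw)
    return kept


def one_hot(password):
    """Encode a password as one row per character, with a single 1 per row."""
    rows = []
    for ch in password:
        row = [0] * len(ALPHABET)
        row[CHAR_INDEX[ch]] = 1
        rows.append(row)
    return rows


# Hypothetical raw data set.
raw = ["password123123!", "", "abc", "pass word", "密码123456", "a1!b2"]
cleaned = clean(raw)
encoded = one_hot(cleaned[1])
```

A real pipeline would likely pad the encoded rows to a fixed length before feeding them to a model; that step is omitted here.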
Step 2 specifically comprises:
Step A1: splitting the password into a number factor, a letter factor and a special character factor;
Step A2: guessing the numbers in reverse order, and counting the occurrence frequency of the numbers;
Step A3: extracting semantically matching letter factors from Chinese and English corpora, and counting the occurrence frequency of the letter factors;
Step A4: counting the occurrence frequency of the special characters.
The reverse-order number guessing adopts a PCFG, Markov or neural network method.
The English corpus is the COCA corpus, and the Chinese corpus is the souguo corpus.
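Steps A1 and A4, together with the frequency counting, can be sketched as below. This is a minimal sketch under stated assumptions: factors are taken to be maximal runs of ASCII digits, ASCII letters, or anything else, and the names `split_factors` and `count_factors` are hypothetical. The corpus-based semantic matching of step A3 is not modeled; letter factors are simply counted.

```python
import re
from collections import Counter

# D = run of digits, L = run of ASCII letters, S = run of anything else.
TOKEN = re.compile(r"(?P<D>[0-9]+)|(?P<L>[A-Za-z]+)|(?P<S>[^0-9A-Za-z]+)")


def split_factors(password):
    """Return the password as a list of (factor kind, factor text) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(password)]


def count_factors(passwords):
    """Count factor contents per kind and the factor combinations (structures)."""
    counts = {"D": Counter(), "L": Counter(), "S": Counter()}
    structures = Counter()
    for pw in passwords:
        factors = split_factors(pw)
        for kind, text in factors:
            counts[kind][text] += 1
        structures["".join(kind for kind, _ in factors)] += 1
    return counts, structures


# Hypothetical training passwords.
sample = ["password123123!", "password1", "abc!123"]
counts, structures = count_factors(sample)
top_letters = counts["L"].most_common(1)  # the n = 1 most frequent letter factors
```

Here `structures` records the combination frequency among factors, for example how often a letter factor is followed by a number factor and then a special character factor.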
The invention aims to remedy the defects of existing password guessing techniques. It divides the password into a number factor, a letter factor and a special character factor, analyzes the number factor in reverse order, matches the letter factor against the semantics in Chinese and English corpora, and counts the special characters by occurrence frequency, thereby increasing the accuracy and efficiency of password guessing and guiding the generation of an offline password dictionary.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The present invention is described in further detail below with reference to a specific example and the accompanying drawing. Except for the content specifically mentioned below, the procedures, conditions and experimental methods for carrying out the invention are common general knowledge in the art, and the invention is not particularly limited in these respects.
Examples
The technical terms in this example have the following meanings:
D: the number factor contained in a password
L: the letter factor contained in a password
S: the special character factor contained in a password
L_n: the n letter factors with the highest occurrence frequency in the passwords
n: the number of password guesses to generate in the password guess set.
As shown in FIG. 1, the present embodiment includes the following three stages:
First stage: password data set preprocessing. The collected 12306 password data set is cleaned, all invalid passwords are removed, and the password data is then encoded;
Second stage: password factor splitting. Taking the password "password123123!" as an example, the password is split according to digits (D), letters (L) and special characters (S) into the number factor "123123", the letter factor "password" and the special character factor "!". The factors are then processed as follows:
1) The numbers are guessed in reverse order. In forward order, if the first three digits are "123", then according to users' password-setting habits the last three digits may be "123" or "456". In reverse order, if the last three digits are "123", then "123" is the most frequent value of the first three digits;
2) The letter factor is matched against a corpus. Because 12306 is a Chinese password library, the souguo corpus is selected for matching. The counted occurrence frequency of the word "password" is incremented by 1, and L_n, the n letter factors with the highest occurrence frequency, is updated;
3) The special character "!" is counted, and the occurrence frequency and the frequency ranking of each special character are updated.
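The difference between forward and reverse analysis of the number factor in item 1) can be illustrated with a simple conditional frequency table. This is a stand-in for the PCFG, Markov or neural network methods named in the patent, not an implementation of them. The function name `conditional_counts`, the fixed context length `k=3` and the toy data are assumptions, and the toy data is constructed to mirror the "123" example rather than to demonstrate the claimed accuracy.

```python
from collections import Counter, defaultdict


def conditional_counts(digit_factors, k=3, reverse=False):
    """Map a k-digit context to a Counter of the remaining digits.

    Forward: context is the first k digits, outcome is what follows.
    Reverse: context is the last k digits, outcome is what precedes.
    """
    table = defaultdict(Counter)
    for digits in digit_factors:
        if len(digits) <= k:
            continue  # nothing left to predict
        if reverse:
            context, outcome = digits[-k:], digits[:-k]
        else:
            context, outcome = digits[:k], digits[k:]
        table[context][outcome] += 1
    return table


# Hypothetical number factors extracted from a password set.
digits = ["123123", "123456", "123123", "123456", "321123", "123789"]
forward = conditional_counts(digits)
backward = conditional_counts(digits, reverse=True)
```

In this toy set, the forward table for context "123" is split evenly between "123" and "456", while the reverse table for context "123" has "123" as its single most frequent outcome.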
Third stage: password guess set generation. The number n of passwords to generate is set, and a password guess set is generated based on the password factor occurrence frequencies and the password factor combination frequencies obtained in the password factor splitting stage.
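One way to turn factor frequencies and combination frequencies into a ranked guess set is sketched below. The scoring rule (structure probability multiplied by the probability of each factor, as in a PCFG-style model), the `top_m` cutoff and the function name `generate_guesses` are assumptions; the patent only states that the guess set is generated from these frequencies.

```python
import heapq
from collections import Counter
from itertools import product


def generate_guesses(counts, structures, n, top_m=10):
    """Return the n highest-scoring guesses.

    counts: factor kind ("D", "L", "S") -> Counter of factor texts.
    structures: Counter of factor-kind sequences such as "LD" or "LDS".
    """
    total_structures = sum(structures.values())
    scored = []
    for structure, s_count in structures.items():
        options = []
        for kind in structure:
            total = sum(counts[kind].values())
            options.append(
                [(text, c / total) for text, c in counts[kind].most_common(top_m)]
            )
        for combo in product(*options):
            prob = s_count / total_structures
            for _, p in combo:
                prob *= p
            scored.append((prob, "".join(text for text, _ in combo)))
    # Ties in probability are broken by string comparison.
    return [guess for _, guess in heapq.nlargest(n, scored)]


# Hypothetical frequencies, as if produced by the splitting stage.
counts = {
    "D": Counter({"123": 3, "123123": 1}),
    "L": Counter({"password": 3, "abc": 1}),
    "S": Counter({"!": 2}),
}
structures = Counter({"LD": 3, "LDS": 1})
guesses = generate_guesses(counts, structures, n=3)
```

Exhaustive enumeration is only practical for small `top_m`; a larger model would need a priority-queue expansion instead of scoring every combination.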
The protection of the present invention is not limited to the above embodiment. Variations and advantages that may occur to those skilled in the art without departing from the spirit and scope of the inventive concept are included in the invention, and the scope of protection is defined by the appended claims.
Claims (5)
1. A password guessing method based on numerical factor reverse order, characterized by comprising the following specific steps:
Step 1: password data set preprocessing
Collecting different password sets, and cleaning the data of the password sets to complete the data preprocessing of the password sets;
Step 2: password factor splitting
Splitting each password into a number factor, a letter factor and a special character factor according to its digits, letters and special characters, and counting the occurrence frequency of the content contained in each factor and the frequency of combinations among the factors;
Step 3: password guess set generation
Setting the number of passwords to generate, and generating a password guess set based on the occurrence frequency of each password factor and the combination frequency among the factors obtained in the password factor splitting step.
2. The password guessing method based on numerical factor reverse order according to claim 1, wherein the password set data preprocessing comprises: eliminating invalid passwords and one-hot encoding the password data set.
3. The password guessing method based on numerical factor reverse order according to claim 1, wherein step 2 specifically comprises:
Step A1: splitting the password into a number factor, a letter factor and a special character factor;
Step A2: guessing the numbers in reverse order, and counting the occurrence frequency of the numbers;
Step A3: extracting semantically matching letter factors from Chinese and English corpora, and counting the occurrence frequency of the letter factors;
Step A4: counting the occurrence frequency of the special characters.
4. The password guessing method based on numerical factor reverse order according to claim 3, wherein the reverse-order number guessing adopts a PCFG, Markov or neural network method.
5. The password guessing method based on numerical factor reverse order according to claim 3, wherein the English corpus is the COCA corpus and the Chinese corpus is the souguo corpus.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911407189.5A CN111191008A (en) | 2019-12-31 | 2019-12-31 | Password guessing method based on numerical factor reverse order |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111191008A true CN111191008A (en) | 2020-05-22 |
Family
ID=70706389
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911407189.5A Pending CN111191008A (en) | 2019-12-31 | 2019-12-31 | Password guessing method based on numerical factor reverse order |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111191008A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112257433A (en) * | 2020-12-23 | 2021-01-22 | 四川大学 | Password dictionary generation method and system based on Markov chain and neural network |
CN112667979A (en) * | 2020-12-30 | 2021-04-16 | 网神信息技术(北京)股份有限公司 | Password generation method and device, password identification method and device, and electronic device |
CN113051873A (en) * | 2021-03-22 | 2021-06-29 | 中国人民解放军战略支援部队信息工程大学 | Lightweight password guessing dictionary generation method and device based on variational self-encoder |
CN113886784A (en) * | 2021-12-06 | 2022-01-04 | 华南理工大学 | Password guessing method for improving guessing efficiency of small training set based on corpus |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013109330A2 (en) * | 2011-10-31 | 2013-07-25 | The Florida State University Research Foundation, Inc. | System and methods for analyzing and modifying passwords |
CN106570391A (en) * | 2016-11-10 | 2017-04-19 | 中国科学院信息工程研究所 | Memory block based password guessing set generation method and memory block based digital password cracking method |
CN109829289A (en) * | 2019-01-09 | 2019-05-31 | 中国电子科技集团公司电子科学研究院 | Password guess method |
CN110472385A (en) * | 2018-05-10 | 2019-11-19 | 深圳市格瑞信息科技有限公司 | A kind of password cracking method and device |
-
2019
- 2019-12-31 CN CN201911407189.5A patent/CN111191008A/en active Pending
Non-Patent Citations (2)
Title |
---|
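周环; 刘奇旭; 崔翔; 张方娇: "基于神经网络的定向口令猜测研究" (Research on targeted password guessing based on neural networks) * |
高强; 李啸; 胡勇; 吴少华: "基于社工信息的口令生成与安全性分析" (Password generation and security analysis based on social engineering information) * |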
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111191008A (en) | Password guessing method based on numerical factor reverse order | |
CN108984530B (en) | Detection method and detection system for network sensitive content | |
CN107391486B (en) | Method for identifying new words in field based on statistical information and sequence labels | |
CN109241523B (en) | Method, device and equipment for identifying variant cheating fields | |
CN110297988A (en) | Hot topic detection method based on weighting LDA and improvement Single-Pass clustering algorithm | |
EP0849688A2 (en) | System and method for natural language determination | |
CN111079412A (en) | Text error correction method and device | |
WO2005064490A1 (en) | System for recognising and classifying named entities | |
CN109993216B (en) | Text classification method and device based on K nearest neighbor KNN | |
CN110489997A (en) | A kind of sensitive information desensitization method based on pattern matching algorithm | |
CN101308512B (en) | Mutual translation pair extraction method and device based on web page | |
CN110597844A (en) | Heterogeneous database data unified access method and related equipment | |
CN112883734A (en) | Block chain security event public opinion monitoring method and system | |
CN112270191A (en) | Method and device for extracting work order text theme | |
CN111797217A (en) | Information query method based on FAQ matching model and related equipment thereof | |
CN110457707B (en) | Method and device for extracting real word keywords, electronic equipment and readable storage medium | |
Cheng et al. | Improved probabilistic context-free grammars for passwords using word extraction | |
Gueddah et al. | The impact of Arabic inter-character proximity and similarity on spell-checking | |
CN110991169A (en) | Method and device for identifying risk content variety and electronic equipment | |
Hassanat et al. | Rule-and dictionary-based solution for variations in written Arabic names in social networks, big data, accounting systems and large databases | |
CN109885829A (en) | A kind of word-based password intensity evaluation method | |
CN115659017A (en) | Sensitive word matching method, device, equipment, storage medium and product | |
CN115114614A (en) | Password guessing method based on special characters | |
CN112559694B (en) | Method and device for discovering new words, computer storage medium and electronic equipment | |
CN112632526B (en) | User password modeling and strength evaluation method based on comprehensive segmentation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20200522 |