CN111191008A - Password guessing method based on numerical factor reverse order - Google Patents

Password guessing method based on numerical factor reverse order

Info

Publication number
CN111191008A
Authority
CN
China
Prior art keywords
password
factor
reverse order
guessing
method based
Prior art date
Legal status
Pending
Application number
CN201911407189.5A
Other languages
Chinese (zh)
Inventor
何道敬
周贝贝
陆城
Current Assignee
East China Normal University
Original Assignee
East China Normal University
Priority date
Filing date
Publication date
Application filed by East China Normal University
Priority to CN201911407189.5A
Publication of CN111191008A
Current legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374Thesaurus

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a password guessing method based on the reverse order of numerical factors, comprising a data cleaning stage, a password factor splitting stage, and a password guessing stage. Password data set preprocessing: different password sets are collected, and their data are cleaned and encoded. Password factor splitting: each password is split into a digit factor, a letter factor, and a special character factor according to its digits, letters, and special characters, and the frequency of the content contained in each factor is counted. Password guess set generation: the number of passwords to generate is set, and a password guess set is generated based on the occurrence frequency of each password factor obtained in the password factor splitting stage. The invention analyzes password factors by splitting the password and analyzing the digit factor in reverse order. Experimental results show that guessing in reverse order is more accurate than guessing in forward order, and the method can be used to generate an offline password dictionary for offline password guessing.

Description

Password guessing method based on numerical factor reverse order
Technical Field
The invention belongs to the technical field of information security, and particularly relates to a password guessing method based on the reverse order of digit factors.
Background
Although many user authentication methods have been developed, password authentication remains the mainstream method of authenticating network users because it is simple and easy to use, and it will remain so for the foreseeable future. As data leakage incidents continue to occur, research on password guessing is increasing. Depending on the environment, password guessing is generally divided into online password guessing and offline password guessing.
Online password guessing requires interaction with a network authentication server: access to the service system is obtained by submitting to the server a password set by a user. To prevent brute-force guessing, a typical authentication server limits the number of authentication attempts on a user's password within a period of time, and also uses verification codes to prevent a machine from authenticating successfully through repeated attempts.
Offline password guessing first obtains the encrypted password file, then hashes the passwords in an existing dictionary one by one and compares each hash with the hash of the target password. Compared with online password guessing, offline password guessing has a less restrictive computing environment. In recent years, offline password guessing has evolved from the original random, undirected algorithms based on simple transformations to systematic methods based on password frequency models, which has improved guessing efficiency, and more offline password guessing methods have been proposed. However, traditional password guessing methods all analyze passwords from front to back, and none analyze from back to front. Yet when the digit factors in passwords are analyzed, analyzing them from back to front is more accurate than analyzing them from front to back.
Disclosure of Invention
The invention aims to provide a password guessing method based on the numerical factor reverse order, which can increase the accuracy and efficiency of password guessing and guide the generation of an off-line password dictionary.
The specific technical scheme for realizing the purpose of the invention is as follows:
A password guessing method based on numerical factor reverse order comprises the following specific steps:
Step 1: password data set preprocessing
Collect different password sets and clean their data to complete the preprocessing of the password sets.
Step 2: password factor splitting
Split each password into a digit factor, a letter factor, and a special character factor according to its digits, letters, and special characters, and count the occurrence frequency of the content of each factor and the frequency of combinations among the factors.
Step 3: password guess set generation
Set the number of passwords to generate, and generate a password guess set based on the occurrence frequency of each password factor and the combination frequency among the factors obtained in the password factor splitting step.
The password set data preprocessing comprises: removing invalid passwords and one-hot encoding the password data set.
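The preprocessing step can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not define what makes a password "invalid" or how the one-hot encoding is laid out, so the validity rule (non-empty, printable ASCII without whitespace, at most MAX_LEN characters), the alphabet, and the names clean and one_hot are all hypothetical.

```python
import string

# Assumption: a password is "valid" if it is non-empty, at most MAX_LEN
# characters long, and made only of printable ASCII without whitespace.
ALPHABET = string.digits + string.ascii_letters + string.punctuation
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
MAX_LEN = 16  # hypothetical length limit, not taken from the source


def clean(passwords):
    """Strip line endings and drop passwords that fail the validity rule."""
    kept = []
    for pw in passwords:
        pw = pw.rstrip("\r\n")
        if 0 < len(pw) <= MAX_LEN and all(ch in CHAR_INDEX for ch in pw):
            kept.append(pw)
    return kept


def one_hot(password):
    """Return a MAX_LEN x len(ALPHABET) matrix; unused rows stay all-zero."""
    matrix = [[0] * len(ALPHABET) for _ in range(MAX_LEN)]
    for pos, ch in enumerate(password):
        matrix[pos][CHAR_INDEX[ch]] = 1
    return matrix


# Made-up sample lines, not drawn from any real data set.
raw = ["password123123!\n", "", "abc 123", "qwerty2019", "密码123"]
cleaned = clean(raw)
encoded = [one_hot(pw) for pw in cleaned]
print(cleaned)
```

Each row of the matrix marks one character position, so a password of length k has exactly k ones. A fixed MAX_LEN keeps every encoded password the same shape, which is convenient if a neural network is used later.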
Step 2 specifically comprises:
Step A1: splitting the password into a digit factor, a letter factor, and a special character factor;
Step A2: guessing the digits in reverse order, and counting the occurrence frequency of the digits;
Step A3: extracting semantically matching letter factors from Chinese and English corpora, and counting the occurrence frequency of the letter factors;
Step A4: counting the occurrence frequency of the special characters.
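The splitting and counting in these steps can be sketched with a regular expression and counters. This is a minimal illustration, not the patented implementation: the names TOKEN, split_factors, and count_factors are hypothetical, and "combination frequency among the factors" is interpreted here, as an assumption, as the frequency of each password's sequence of factor types (for example "LDS").

```python
import re
from collections import Counter

# One alternative per factor type: D = digits, L = letters, S = everything else.
TOKEN = re.compile(r"(?P<D>[0-9]+)|(?P<L>[A-Za-z]+)|(?P<S>[^0-9A-Za-z]+)")


def split_factors(password):
    """Split a password into (type, text) runs, e.g. ('L', 'password')."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(password)]


def count_factors(passwords):
    """Count each factor's content and each password's factor-type structure."""
    content = {"D": Counter(), "L": Counter(), "S": Counter()}
    structure = Counter()
    for pw in passwords:
        parts = split_factors(pw)
        for kind, text in parts:
            content[kind][text] += 1
        # Assumption: "combination among factors" = sequence of factor types.
        structure["".join(kind for kind, _ in parts)] += 1
    return content, structure


# Made-up sample passwords for illustration only.
sample = ["password123123!", "abc123", "password2019", "hello!123"]
content, structure = count_factors(sample)
print(split_factors("password123123!"))
print(content["L"].most_common(1), structure.most_common(1))
```

Because exactly one named group matches per token, m.lastgroup gives the factor type directly without a second classification pass.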
The reverse-order digit guessing uses a PCFG, Markov, or neural network method.
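The source names the model families but gives no further detail, so the following is only one possible reading: a first-order Markov chain trained on digit factors read from the last digit to the first. The start and end markers, the function names, and the training strings are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical boundary markers


def train_reverse_markov(digit_factors):
    """Count transitions while reading each digit factor from its last digit."""
    transitions = defaultdict(Counter)
    for digits in digit_factors:
        prev = START
        for ch in reversed(digits):
            transitions[prev][ch] += 1
            prev = ch
        transitions[prev][END] += 1
    return transitions


def probability(transitions, digits):
    """Probability of a digit factor under the reverse-order model."""
    prob, prev = 1.0, START
    for ch in list(reversed(digits)) + [END]:
        total = sum(transitions[prev].values())
        if total == 0 or transitions[prev][ch] == 0:
            return 0.0  # unseen transition; no smoothing in this sketch
        prob *= transitions[prev][ch] / total
        prev = ch
    return prob


# Made-up digit factors, not drawn from any real data set.
training = ["123", "123", "123123", "456123", "2019"]
model = train_reverse_markov(training)
ranked = sorted(set(training), key=lambda d: probability(model, d), reverse=True)
print(ranked)
```

A forward-order model is obtained by dropping the two reversed calls, which makes it straightforward to compare the two directions on the same data. A practical model would also need smoothing so that unseen digit factors do not receive zero probability.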
The English corpus is the COCA corpus, and the Chinese corpus is the Sogou corpus.
The invention aims to remedy the shortcomings of existing password guessing techniques. It splits the password into a digit factor, a letter factor, and a special character factor, analyzes the digit factor in reverse order, matches the letter factor semantically against Chinese and English corpora, and counts the special characters by occurrence frequency, thereby increasing the accuracy and efficiency of password guessing and guiding the generation of an offline password dictionary.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The present invention is described in further detail below with reference to specific examples and the accompanying drawings. Except for the contents specifically mentioned below, the procedures, conditions, experimental methods, and the like for carrying out the present invention are common general knowledge in the art, and the present invention is not particularly limited in these respects.
Examples
The technical terms in this example have the following meanings:
D: the digit factor contained in the password
L: the letter factor contained in the password
S: the special character factor contained in the password
L_n: the n letter factors with the highest occurrence frequency in passwords
n: the number of password guesses to generate for the password guess set.
As shown in FIG. 1, the present embodiment includes the following three stages:
The first stage, password data set preprocessing: clean the collected 12306 password data set, remove all invalid passwords, and then encode the password data.
The second stage, password factor splitting, taking the password "password123123!" as an example: according to digits D, letters L, and special characters S, the password is split into a digit factor "123123", a letter factor "password", and a special character factor "!", respectively:
1) The digits are guessed in reverse order. In forward order, if the first three digits are "123", then judging by digit occurrence frequency and users' password-setting habits, the last three digits may be "123" or "456". In reverse order, however, if the last three digits are "123", then "123" has the highest occurrence frequency for the first three digits.
2) The letter factor is matched against a corpus. Because 12306 is a Chinese password data set, the Sogou corpus is selected for matching; the counted occurrence frequency of the word "password" is incremented by 1, and L_n, the n letter factors with the highest occurrence frequency, is updated.
3) The special character "!" is counted, and the occurrence frequency and frequency ranking of each special character are updated.
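Items 1) and 2) above can be illustrated with a small Python sketch. The digit factors below are made-up toy data chosen to mirror the "123"/"456" example, so the output only shows how the two conditioning directions are computed and is not evidence for the accuracy claim. The corpus lookup is simplified to an exact word match against a hypothetical word set, whereas the source speaks of semantic matching against the Sogou corpus.

```python
from collections import Counter, defaultdict


def conditional_counts(digit_factors, reverse):
    """Map a known three-digit half to counts of the other half.

    reverse=False: condition on the first half, count the last half.
    reverse=True:  condition on the last half, count the first half.
    """
    table = defaultdict(Counter)
    for digits in digit_factors:
        if len(digits) != 6:
            continue  # this toy example only looks at six-digit factors
        first, last = digits[:3], digits[3:]
        if reverse:
            table[last][first] += 1
        else:
            table[first][last] += 1
    return table


# Made-up digit factors, not drawn from the 12306 data set.
digit_factors = ["123123"] * 3 + ["123456"] * 3 + ["789123"]
forward = conditional_counts(digit_factors, reverse=False)
backward = conditional_counts(digit_factors, reverse=True)
print(dict(forward["123"]))   # candidates for the last half, given first half "123"
print(dict(backward["123"]))  # candidates for the first half, given last half "123"

# Item 2): count letter factors found in a corpus word list and keep the top n.
corpus_words = {"password", "hello", "love"}  # hypothetical stand-in for a corpus
letter_counts = Counter()
for factor in ["password", "hello", "password", "xqzt"]:
    word = factor.lower()
    if word in corpus_words:
        letter_counts[word] += 1
n = 2
top_letters = [word for word, _ in letter_counts.most_common(n)]
print(top_letters)
```

In this toy data the forward table leaves "123" and "456" tied, while the reverse table has a single most frequent candidate, which is the kind of difference the example in item 1) describes.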
The third stage, password guess set generation: set the number n of passwords to generate, and generate a password guess set based on the password factor occurrence frequencies and the password factor combination frequencies obtained in the password factor splitting stage.
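One way to read this stage is as a simplified PCFG-style generator: score each candidate by the relative frequency of its factor-type structure multiplied by the relative frequencies of the factors that fill it, then keep the n best. The sketch below assumes that interpretation. The names train and generate_guesses, the scoring rule, and the sample passwords are hypothetical, and the exhaustive enumeration is only practical for toy-sized counts.

```python
import re
from collections import Counter
from itertools import product

TOKEN = re.compile(r"(?P<D>[0-9]+)|(?P<L>[A-Za-z]+)|(?P<S>[^0-9A-Za-z]+)")


def train(passwords):
    """Count factor contents per type and the factor-type structure of each password."""
    content = {"D": Counter(), "L": Counter(), "S": Counter()}
    structure = Counter()
    for pw in passwords:
        parts = [(m.lastgroup, m.group()) for m in TOKEN.finditer(pw)]
        for kind, text in parts:
            content[kind][text] += 1
        structure["".join(kind for kind, _ in parts)] += 1
    return content, structure


def generate_guesses(content, structure, n):
    """Return the n highest-scoring guesses (score = product of relative frequencies)."""
    total_structures = sum(structure.values())
    scored = {}
    for shape, shape_count in structure.items():
        # For each slot, list (text, relative frequency) pairs of that factor type.
        slots = [
            [(text, c / sum(content[kind].values())) for text, c in content[kind].items()]
            for kind in shape
        ]
        for combo in product(*slots):
            score = shape_count / total_structures
            for _, p in combo:
                score *= p
            guess = "".join(text for text, _ in combo)
            scored[guess] = max(score, scored.get(guess, 0.0))
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scored.items(), key=lambda item: (-item[1], item[0]))
    return [guess for guess, _ in ranked[:n]]


# Made-up training passwords for illustration only.
training = ["password123", "password123", "hello123", "abc2019!", "password2019"]
content, structure = train(training)
guesses = generate_guesses(content, structure, n=3)
print(guesses)
```

Unlike a full PCFG, this sketch groups factors by type only and ignores their length, so it can combine factors in ways never seen together in the training data.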
The protection of the present invention is not limited to the above embodiments. Variations and advantages that may occur to those skilled in the art may be incorporated into the invention without departing from the spirit and scope of the inventive concept, and the scope of the appended claims is intended to be protected.

Claims (5)

1. A password guessing method based on numerical factor reverse order, characterized by comprising the following specific steps:
Step 1: password data set preprocessing
Collecting different password sets and cleaning their data to complete the preprocessing of the password sets;
Step 2: password factor splitting
Splitting each password into a digit factor, a letter factor, and a special character factor according to its digits, letters, and special characters, and counting the occurrence frequency of the content of each factor and the frequency of combinations among the factors;
Step 3: password guess set generation
Setting the number of passwords to generate, and generating a password guess set based on the occurrence frequency of each password factor and the combination frequency among the factors obtained in the password factor splitting step.
2. The password guessing method based on numerical factor reverse order according to claim 1, wherein the password set data preprocessing comprises: removing invalid passwords and one-hot encoding the password data set.
3. The password guessing method based on numerical factor reverse order according to claim 1, wherein step 2 specifically comprises:
Step A1: splitting the password into a digit factor, a letter factor, and a special character factor;
Step A2: guessing the digits in reverse order, and counting the occurrence frequency of the digits;
Step A3: extracting semantically matching letter factors from Chinese and English corpora, and counting the occurrence frequency of the letter factors;
Step A4: counting the occurrence frequency of the special characters.
4. The password guessing method based on numerical factor reverse order according to claim 3, wherein the reverse-order digit guessing uses a PCFG, Markov, or neural network method.
5. The password guessing method based on numerical factor reverse order according to claim 3, wherein the English corpus is the COCA corpus and the Chinese corpus is the Sogou corpus.
CN201911407189.5A 2019-12-31 2019-12-31 Password guessing method based on numerical factor reverse order Pending CN111191008A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911407189.5A CN111191008A (en) 2019-12-31 2019-12-31 Password guessing method based on numerical factor reverse order

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911407189.5A CN111191008A (en) 2019-12-31 2019-12-31 Password guessing method based on numerical factor reverse order

Publications (1)

Publication Number Publication Date
CN111191008A (en) 2020-05-22

Family

ID=70706389

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911407189.5A Pending CN111191008A (en) 2019-12-31 2019-12-31 Password guessing method based on numerical factor reverse order

Country Status (1)

Country Link
CN (1) CN111191008A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257433A (en) * 2020-12-23 2021-01-22 四川大学 Password dictionary generation method and system based on Markov chain and neural network
CN112667979A (en) * 2020-12-30 2021-04-16 网神信息技术(北京)股份有限公司 Password generation method and device, password identification method and device, and electronic device
CN113051873A (en) * 2021-03-22 2021-06-29 中国人民解放军战略支援部队信息工程大学 Lightweight password guessing dictionary generation method and device based on variational self-encoder
CN113886784A (en) * 2021-12-06 2022-01-04 华南理工大学 Password guessing method for improving guessing efficiency of small training set based on corpus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013109330A2 (en) * 2011-10-31 2013-07-25 The Florida State University Research Foundation, Inc. System and methods for analyzing and modifying passwords
CN106570391A (en) * 2016-11-10 2017-04-19 中国科学院信息工程研究所 Memory block based password guessing set generation method and memory block based digital password cracking method
CN109829289A (en) * 2019-01-09 2019-05-31 中国电子科技集团公司电子科学研究院 Password guess method
CN110472385A (en) * 2018-05-10 2019-11-19 深圳市格瑞信息科技有限公司 A kind of password cracking method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
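周环; 刘奇旭; 崔翔; 张方娇: "Research on targeted password guessing based on neural networks" (基于神经网络的定向口令猜测研究) *
高强; 李啸; 胡勇; 吴少华: "Password generation and security analysis based on social engineering information" (基于社工信息的口令生成与安全性分析) *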

Similar Documents

Publication Publication Date Title
CN111191008A (en) Password guessing method based on numerical factor reverse order
CN108984530B (en) Detection method and detection system for network sensitive content
CN107391486B (en) Method for identifying new words in field based on statistical information and sequence labels
CN109241523B (en) Method, device and equipment for identifying variant cheating fields
CN110297988A (en) Hot topic detection method based on weighting LDA and improvement Single-Pass clustering algorithm
EP0849688A2 (en) System and method for natural language determination
CN111079412A (en) Text error correction method and device
CN109993216B (en) Text classification method and device based on K nearest neighbor KNN
CN110489997A (en) A kind of sensitive information desensitization method based on pattern matching algorithm
CN101308512B (en) Mutual translation pair extraction method and device based on web page
CN110597844A (en) Heterogeneous database data unified access method and related equipment
CN115186654B (en) Method for generating document abstract
CN112883734A (en) Block chain security event public opinion monitoring method and system
CN112270191A (en) Method and device for extracting work order text theme
CN111797217A (en) Information query method based on FAQ matching model and related equipment thereof
CN110457707B (en) Method and device for extracting real word keywords, electronic equipment and readable storage medium
Cheng et al. Improved probabilistic context-free grammars for passwords using word extraction
Gueddah et al. The impact of Arabic inter-character proximity and similarity on spell-checking
CN110991169A (en) Method and device for identifying risk content variety and electronic equipment
Hassanat et al. Rule-and dictionary-based solution for variations in written Arabic names in social networks, big data, accounting systems and large databases
CN109885829A (en) A kind of word-based password intensity evaluation method
CN115659017A (en) Sensitive word matching method, device, equipment, storage medium and product
CN115114614A (en) Password guessing method based on special characters
CN112559694B (en) Method and device for discovering new words, computer storage medium and electronic equipment
CN112632526B (en) User password modeling and strength evaluation method based on comprehensive segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200522