CN117786664B - Sample-induced password test method - Google Patents

Publication number
CN117786664B
Authority
CN
China
Prior art keywords
information
user
sample
password
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410205659.4A
Other languages
Chinese (zh)
Other versions
CN117786664A (en)
Inventor
韩庆良
韩明军
张晓溪
史文征
于志波
刘梦云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dopp Information Technology Co ltd
Original Assignee
Dopp Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dopp Information Technology Co ltd
Priority to CN202410205659.4A
Publication of CN117786664A
Application granted
Publication of CN117786664B
Legal status: Active
Anticipated expiration

Landscapes

  • Storage Device Security (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention relates to the technical field of password testing and discloses a sample-induced password test method comprising the following steps: collecting and preprocessing user personal information, and taking the preprocessed user personal information data as a sample; performing information encoding on the sample to obtain sample vectors; optimizing the constructed password sequence generation model, feeding the sample vectors to the optimized model to generate a password sequence by sample induction, and performing a security test on the user password with the generated password sequence. The method combines information data according to the similarity of the deep semantic features between sample vectors of different information types to generate the password sequence, computes a strength weight from the distribution of characters in the real password, and tests the security of the real password by combining the strength weight with the vector similarity between the real password and the passwords in the password sequence. The higher the security test result, the safer the user's real password.

Description

Sample-induced password test method
Technical Field
The invention relates to the technical field of password testing, in particular to a sample-induced password testing method.
Background
Passwords are one of the important means of protecting personal privacy and sensitive information. However, weak passwords are widely used and password cracking occurs frequently, which poses a great risk to personal and organizational security. To improve password security, researchers have been developing password test methods that evaluate the strength and cracking resistance of passwords. Traditional password test methods are mainly based on mathematical models and statistical analysis, but they ignore the behavioral and psychological factors of users and therefore cannot comprehensively evaluate the security of a real password. To address this problem, the invention provides a sample-induced password test method that analyzes real password samples to better capture the behavior and preferences of users when choosing passwords, thereby improving the efficiency and accuracy of password testing.
Disclosure of Invention
In view of the above, the present invention provides a sample-induced password test method, which aims to: 1) acquire the personal information data of a user and recode each type of information data separately, so that the different fields in the information data are segmented and combined; encode the recoding results into sample vectors that characterize the user information; extract deep semantic features from the sample vectors with a password sequence generation model; and combine the information data according to the similarity of the deep semantic features between sample vectors of different information types to generate a password sequence, thereby realizing password generation based on sample information induction; 2) acquire the real password of the user, compute a strength weight from the distribution of characters in the real password, and test the security of the real password by combining the strength weight with the vector similarity between the real password and the passwords in the password sequence, obtaining the corresponding security test result. The higher the security test result, the safer the user's real password; a lower result indicates a greater possibility that the password can be cracked from the user's personal information. Password testing based on sample induction is thereby realized.
The invention provides a sample-induced password test method, which comprises the following steps:
S1: collecting user personal information and preprocessing it to obtain preprocessed user personal information data, and taking the preprocessed user personal information data as a sample.
S2: performing information encoding on the sample to obtain sample vectors, wherein the information encoding is mainly implemented by an encoding algorithm that combines BERT with an autoencoder (self-encoder).
S3: constructing a password sequence generation model and optimizing the constructed model, wherein the password sequence generation model comprises a generator and a parser; the generator takes a sample vector as input and outputs the deep semantic features of the sample vector, and the parser takes the deep semantic features as input and outputs a password sequence.
S4: receiving the sample vectors with the optimized password sequence generation model, generating a password sequence based on sample induction, and performing a security test on the user password with the generated password sequence.
As a further improvement of the present invention:
Optionally, collecting and preprocessing the user personal information in step S1 and taking the preprocessed user personal information data as a sample includes:
Collecting the personal information of a user, wherein the collected user personal information X consists of three parts:
the identity information of the user himself, which comprises the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and the login account name of a website or software;
the identity information of the user's immediate relatives;
the self-experience information of the user; in the embodiment of the invention, the self-experience information comprises the academic experience information and the professional experience information of the user.
The user personal information X is preprocessed to obtain the preprocessed user personal information data, which is taken as the sample. The preprocessing flow of the user personal information X is as follows:
S11: extracting the identity information of the user himself from the user personal information X, recoding the name, birthday, identity card information, mailbox name and login account name in the identity information, and retaining the other information; the retained information and the recoding results together form the preprocessing result of the identity information.
The entries of the preprocessing result are, in order, the recoding results or the retained information for the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and login account name of a website or software.
The recoding rules of the identity information are as follows:
A1: extracting the name from the identity information and performing information extraction coding, wherein the information extraction coding results comprise the full name, the initials of the name, the surname, the surname plus the initials, the initials plus the surname, and the capitalized surname; the coding results are annotated in pinyin, and the pinyin annotations are taken as the recoding results of the name.
A2: extracting the birthday from the identity information and recoding it, wherein the recoding results of the birthday comprise the date in year-month-day form, the date in month-day-year form, the date in day-month-year form, the year, and the day.
A3: extracting the identity card information from the identity information and segmenting it to obtain the province-city-county code, which is taken as the recoding result of the identity card information.
A4: extracting the mailbox name from the identity information and segmenting it to obtain the mailbox name, the letter prefix of the mailbox name and the number prefix of the mailbox name, which are taken as the recoding results of the mailbox name.
A5: extracting the login account name from the identity information and segmenting it to obtain the login account name, the letter prefix of the login account name and the number prefix of the login account name, which are taken as the recoding results of the login account name.
S12: extracting the identity information of the user's immediate relatives from the user personal information X, recoding the name, birthday, identity card information, mailbox name and login account name in that identity information, and retaining the other information; the retained information and the recoding results together form the preprocessing result of the immediate relatives' identity information.
The entries of this preprocessing result are likewise, in order, the recoding results or the retained information for the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and login account name of a website or software.
S13: performing word segmentation on the user's self-experience information in the user personal information X, and taking the word sequence obtained after segmentation as the preprocessing result of the self-experience information; in the embodiment of the invention, word segmentation is performed with the jieba word segmentation tool.
S14: constructing the preprocessed user personal information data Y from the preprocessing results of steps S11 to S13, and taking the preprocessed user personal information data Y as the sample.
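The recoding rules A1 to A5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names are hypothetical, the name is assumed to be supplied already in pinyin (surname and given-name syllables), and "letter prefix" and "number prefix" are interpreted here as the first run of letters and the first run of digits in the account name. The six-digit region code relies on the layout of mainland China resident identity card numbers, whose first six digits are the administrative division code.

```python
import re

def recode_name(surname: str, given: list[str]) -> list[str]:
    """Rule A1 (assumed reading): pinyin variants of a name."""
    given_initials = "".join(s[0] for s in given)
    return [
        surname + "".join(given),      # full name
        surname[0] + given_initials,   # initials of the whole name
        surname,                       # surname only
        surname + given_initials,      # surname + given-name initials
        given_initials + surname,      # given-name initials + surname
        surname.capitalize(),          # capitalized surname
    ]

def recode_birthday(year: int, month: int, day: int) -> list[str]:
    """Rule A2: three date orders, plus the year and the day."""
    y, m, d = f"{year:04d}", f"{month:02d}", f"{day:02d}"
    return [y + m + d, m + d + y, d + m + y, y, d]

def recode_id_card(id_number: str) -> str:
    """Rule A3: keep the province-city-county code (first six digits)."""
    return id_number[:6]

def recode_account(name: str) -> list[str]:
    """Rules A4/A5: the name itself, its first letter run, its first digit run."""
    local = name.split("@", 1)[0]          # drop the mail domain if present
    letters = re.search(r"[A-Za-z]+", local)
    digits = re.search(r"\d+", local)
    return [
        local,
        letters.group() if letters else "",
        digits.group() if digits else "",
    ]
```

For example, `recode_account("zhangsan1990@example.com")` yields the local part, `zhangsan` and `1990`, which are the kinds of fragments the later steps combine into candidate passwords.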
Optionally, performing information encoding on the sample in step S2 includes:
Performing information encoding on the sample to obtain the sample vector, wherein the information encoding flow of the sample Y is as follows:
S21: binary coding the gender, age and mobile phone number of the identity information part of the sample.
S22: one-hot encoding the recoding results of the name, birthday, identity card information, mailbox name and login account name of the identity information part of the sample, converting the one-hot representations into word vectors with a BERT model, and self-encoding the word vectors to obtain the self-encoding result corresponding to each recoding result according to the self-encoding formula.
The quantities in the self-encoding formula are:
an exponential function with the natural constant as its base;
the self-encoding result of an arbitrary recoding result;
T, which denotes transposition;
the one-hot encoding result of the recoding result;
the convolution operator;
the convolution parameter matrix;
the convolution vector of the recoding result;
a control parameter for the length of the encoded vector;
the self-encoding matrix;
the recoding result set of the identity information type to which the recoding result belongs, wherein the identity information types comprise the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and login account name of a website or software.
S23: extracting the word segmentation sequence of the self-experience information part of the sample, one-hot encoding each word segmentation result, converting the one-hot representations into word vectors with a BERT model, and self-encoding the word vectors to obtain the self-encoding result corresponding to each word segmentation result; the self-encoding results of all word segmentation results form the self-encoding result of the self-experience information part.
S24: encoding the information data in the sample according to steps S21 to S23 to obtain the encoding results of the information data, which form the sample vector; the components of the sample vector are the encoding results corresponding to the respective parts of the sample.
Optionally, constructing the password sequence generation model in step S3 includes:
Constructing a password sequence generation model that comprises a generator and a parser, wherein the generator takes a sample vector as input and outputs the deep semantic features of the sample vector, and the parser takes the deep semantic features as input and outputs a password sequence.
Optionally, optimizing the constructed password sequence generation model in step S3 includes:
Optimizing the constructed password sequence generation model, wherein the optimization flow of the password sequence generation model is as follows:
S31: collecting a training data set data of a U-group sample vector and a user password composition model:
Wherein:
Representing the acquired u-th set of sample vectors, Representing the u-th set of sample vectorsA corresponding user password;
s32: and inputting the sample vectors into a code sequence generation model to generate code sequences corresponding to each group of sample vectors.
S33: constructing and obtaining a training objective function
Wherein:
Representing information encoding processing, wherein the flow of the information encoding processing is step S2;
Representing a model parameter vector to be optimally solved; in the embodiment of the invention, the model parameter vector to be optimally solved is two groups of semantic feature extraction matrixes.
S34: initializing a model parameter vector to be optimally solvedSetting the current iteration number of the model parameter vector as t, the maximum iteration number as Max, the initial value of t as 0, and the t-th iteration result of the model parameter vector as follows
S35: calculating to obtain the iteration direction and the iteration step length of the model parameter vector:
Wherein:
Iteration step Satisfy the following requirementsRepresenting the step length of the t-th iteration;
Represents an L2 norm;
Representation of Is a gradient of (2);
Representing the iteration direction of the t-th iteration;
Is a parameter in the calculation process.
S36: iterating the model parameter vector:
let t=t+1, return to step S35 until reaching the maximum number of iterations, construct the cipher sequence generation model based on the model parameter vector obtained by the final iteration.
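The loop structure of S34 to S36 can be sketched in Python. Because the direction and step-size formulas are not recoverable from the text, this sketch swaps in a plain technique that is not claimed to be the patent's: the negative gradient as the iteration direction and a backtracking (Armijo) search for the step size. The function name, the toy objective and the constant `1e-4` are assumptions.

```python
def optimise(loss, grad, theta, max_iter=50):
    """Iterate a parameter vector for at most max_iter rounds (S34 to S36)."""
    for t in range(max_iter):
        g = grad(theta)
        sq_norm = sum(x * x for x in g)      # squared L2 norm of the gradient
        if sq_norm == 0.0:                   # early exit added for the sketch
            break
        direction = [-x for x in g]          # assumed direction: steepest descent
        step = 1.0
        current = loss(theta)
        while True:
            candidate = [p + step * d for p, d in zip(theta, direction)]
            # Armijo sufficient-decrease test (stand-in step-size condition)
            if loss(candidate) <= current - 1e-4 * step * sq_norm:
                break
            step *= 0.5                      # shrink the step and retry
        theta = candidate
    return theta

# Toy objective standing in for the training objective function.
def toy_loss(p):
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2

def toy_grad(p):
    return [2.0 * (p[0] - 3.0), 2.0 * (p[1] + 1.0)]

solution = optimise(toy_loss, toy_grad, [0.0, 0.0])
```

In the method itself the parameter vector would hold the two semantic feature extraction matrices, and the loop would run for the full Max iterations.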
Optionally, receiving the sample vector with the optimized password sequence generation model and generating a password sequence based on sample induction in step S4 includes:
Receiving the sample vector with the optimized password sequence generation model and generating a password sequence based on sample induction, wherein the password sequence generation flow is as follows:
S41: the generator receives the sample vector and divides it by information type into a sample vector characterizing the user's identity information, a sample vector characterizing the identity information of the user's immediate relatives, and a sample vector characterizing the user's self-experience information; the sample vector characterizing the user's identity information comprises, in order, the sample vectors for the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and login account name of a website or software;
the sample vector characterizing the identity information of the user's immediate relatives likewise comprises, in order, the sample vectors for the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and login account name of a website or software.
S42: deep semantic feature extraction is performed separately on the different types of sample vectors, each type having its own deep semantic feature extraction formula.
The quantities in the extraction formulas are:
the deep semantic feature of the sample vector characterizing the self-experience information, which is computed from the self-encoding result of the h-th word segmentation result in the self-experience information, that is, the sample vector corresponding to the h-th word segmentation result;
the deep semantic features of the sample vectors characterizing the identity information;
the semantic feature extraction matrices.
S43: the parser performs combined coding on the information data corresponding to the sample vectors based on the deep semantic features to generate the password sequence.
The quantities in the combined coding formula are:
the L1 norm;
the combined coding result corresponding to a group of deep semantic features; if the combined coding result is above a preset threshold, the information data corresponding to those deep semantic features are arranged and combined, and a password is obtained by induction;
the combined coding result corresponding to a plurality of deep semantic features.
S44: repeating step S43 to induce N groups of passwords from the sample vector and generate the password sequence, in which the n-th element is the generated n-th group of passwords.
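The thresholded combination of S43 and S44 can be sketched as below. The combined coding formula itself is lost, so cosine similarity is used as a clearly labelled stand-in for it, and the deep semantic features are simply supplied as small hand-written vectors. All names, the threshold of 0.8 and the limit on the number of passwords are hypothetical.

```python
import math
from itertools import permutations

def cosine(a, b):
    """Cosine similarity, used here as a stand-in for the combined coding result."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def induce_passwords(fragments, features, threshold=0.8, limit=10):
    """Combine fragments whose feature vectors are similar enough.

    fragments: field name -> text fragment (a recoding result)
    features:  field name -> feature vector for that fragment
    """
    names = list(fragments)
    passwords = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(features[a], features[b]) > threshold:
                # arrange and combine the matching information data
                for order in permutations((a, b)):
                    passwords.append("".join(fragments[k] for k in order))
    return passwords[:limit]   # keep at most N candidates

fragments = {"name": "zhangsan", "year": "1990", "region": "110101"}
features = {"name": [1.0, 0.0], "year": [0.9, 0.1], "region": [0.0, 1.0]}
candidates = induce_passwords(fragments, features)
```

With these toy vectors only the name and the birth year are similar enough, so the candidates are the two orderings of those fragments.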
Optionally, performing the security test on the user password with the generated password sequence in step S4 includes:
Performing the security test on the user password with the generated password sequence, wherein the security test flow of the user password is as follows:
Receiving the user password pwd.
Calculating the strength weight of the user password.
The quantities in the strength weight calculation are:
the strength weight of the user password pwd;
the number of characters in the user password pwd that are repeated;
the number of character types in the user password pwd, wherein the character types comprise numbers, letters and special symbols;
the total number of characters in the user password pwd;
the character set of the user password pwd, where s denotes any character in that set;
the frequency with which the character s appears in the user personal information X.
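The inputs to the strength weight can be computed as in the following sketch. It stops short of the weight itself, because the formula that combines these quantities is not recoverable from the text. "Repeated characters" is interpreted here as the number of distinct characters that occur more than once, and the personal information X is assumed to be flattened into a single string; both are assumptions, as is the function name.

```python
from collections import Counter

def strength_ingredients(pwd: str, personal_info: str) -> dict:
    """Compute the quantities that feed the strength weight of a password."""
    counts = Counter(pwd)
    repeated = sum(1 for c in counts.values() if c > 1)
    kinds = {
        "digit" if ch.isdigit() else "letter" if ch.isalpha() else "symbol"
        for ch in pwd
    }
    info_counts = Counter(personal_info)
    total_info = len(personal_info) or 1   # avoid division by zero
    frequency = {ch: info_counts[ch] / total_info for ch in counts}
    return {
        "repeated": repeated,      # characters used more than once
        "types": len(kinds),       # digits, letters, special symbols
        "length": len(pwd),        # total number of characters
        "frequency": frequency,    # per-character frequency in X
    }

example = strength_ingredients("aab1!", "ab")
```

A password whose characters are frequent in the personal information would be expected to receive a lower weight, which is consistent with the stated goal of detecting passwords derived from personal data.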
Calculating the security test result of the user password pwd based on the password sequence and the strength weight.
The quantities in this calculation are:
the security test result of the user password pwd; the higher the security test result, the safer the user password pwd, whereas a lower result indicates a greater possibility that the password can be cracked from the user's personal information; if the security test result is higher than a preset threshold, the user password pwd passes the security test;
the information encoding process, whose flow is step S2.
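The pass/fail decision can be sketched as follows. The patent compares encoded vectors with a formula that is not recoverable from the text, so this sketch substitutes the string similarity from Python's `difflib.SequenceMatcher` and an invented scoring rule (strength weight multiplied by one minus the closest similarity). The rule, the function names and the threshold of 0.5 are assumptions chosen only so that a higher score means a safer password, as the text requires.

```python
from difflib import SequenceMatcher

def security_score(pwd: str, generated: list[str], weight: float) -> float:
    """Stand-in score: high when pwd is far from every induced password."""
    closest = max(
        (SequenceMatcher(None, pwd, g).ratio() for g in generated),
        default=0.0,   # no induced passwords: nothing resembles pwd
    )
    return weight * (1.0 - closest)

def passes_test(pwd: str, generated: list[str], weight: float,
                threshold: float = 0.5) -> bool:
    """The password passes when its score exceeds the preset threshold."""
    return security_score(pwd, generated, weight) > threshold

induced = ["zhangsan1990", "1990zhangsan"]
weak = passes_test("zhangsan1990", induced, weight=1.0)
strong = passes_test("q7#Lm2&x", induced, weight=1.0)
```

Here the password that exactly matches an induced candidate fails, while the unrelated one passes.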
In order to solve the above-described problems, the present invention provides an electronic device including:
a memory storing at least one instruction;
a communication interface for realizing the communication of the electronic device; and
a processor that executes the instructions stored in the memory to implement the sample-induced password test method described above.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium storing at least one instruction that is executed by a processor in an electronic device to implement the sample-induced password test method described above.
Compared with the prior art, the invention provides a sample-induced password test method, which has the following advantages:
Firstly, the scheme provides a sample information encoding and password generation method, in which information encoding is performed on the sample to obtain the sample vector. The information encoding flow of the sample Y is as follows: the gender, age and mobile phone number of the identity information part of the sample are binary coded; the recoding results of the name, birthday, identity card information, mailbox name and login account name of the identity information part of the sample are one-hot encoded, the one-hot representations are converted into word vectors with a BERT model, and the word vectors are self-encoded to obtain the self-encoding result corresponding to each recoding result according to the self-encoding formula.
The quantities in the self-encoding formula are: an exponential function with the natural constant as its base; the self-encoding result of an arbitrary recoding result; T, which denotes transposition; the one-hot encoding result of the recoding result; the convolution operator; the convolution parameter matrix; the convolution vector of the recoding result; a control parameter for the length of the encoded vector; the self-encoding matrix; and the recoding result set of the identity information type to which the recoding result belongs, wherein the identity information types comprise the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and login account name of a website or software. The word segmentation sequence of the self-experience information part of the sample is then extracted, each word segmentation result is one-hot encoded, the one-hot representations are converted into word vectors with a BERT model, and the word vectors are self-encoded to obtain the self-encoding result corresponding to each word segmentation result; the self-encoding results of all word segmentation results form the self-encoding result of the self-experience information part. The information data in the sample is encoded in this way to obtain the encoding results of the information data, which form the sample vector.
According to the scheme, the personal information data of a user is acquired and each type of information data is recoded separately, so that the different fields in the information data are segmented and combined; the recoding results are encoded into sample vectors that characterize the user information; deep semantic features are extracted from the sample vectors with the password sequence generation model; and the information data is combined according to the similarity of the deep semantic features between sample vectors of different information types to generate a password sequence, thereby realizing password generation based on sample information induction.
Meanwhile, the scheme provides a password security test method that performs a security test on the user password with the generated password sequence, wherein the security test flow of the user password is as follows:
receiving the user password pwd; calculating the strength weight of the user password.
The quantities in the strength weight calculation are: the strength weight of the user password pwd; the number of characters in the user password pwd that are repeated; the number of character types in the user password pwd, wherein the character types comprise numbers, letters and special symbols; the total number of characters in the user password pwd; the character set of the user password pwd, where s denotes any character in that set; and the frequency with which the character s appears in the user personal information X. The security test result of the user password pwd is then calculated based on the password sequence and the strength weight.
The quantities in this calculation are: the security test result of the user password pwd, where a higher security test result indicates a safer user password pwd, a lower result indicates a greater possibility that the password can be cracked from the user's personal information, and the user password pwd passes the security test if the result is higher than a preset threshold; and the information encoding process. According to the scheme, the real password of the user is acquired, a strength weight is computed from the distribution of characters in the real password, and the security of the real password is tested by combining the strength weight with the vector similarity between the real password and the passwords in the password sequence, obtaining the corresponding security test result. The higher the security test result, the safer the user's real password; a lower result indicates a greater possibility that the password can be cracked from the user's personal information. Password testing based on sample induction is thereby realized.
Drawings
Fig. 1 is a flow chart of a sample-induced password test method according to an embodiment of the invention.
Fig. 2 is a schematic structural diagram of an electronic device for implementing a sample-induced password test method according to an embodiment of the invention.
In the figure: 1 electronic device, 10 processor, 11 memory, 12 program, 13 communication interface.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides a sample-induced password test method. The execution subject of the sample-induced password test method includes, but is not limited to, at least one of a server, a terminal and other electronic devices that can be configured to execute the method provided by the embodiment of the application. In other words, the sample-induced password test method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server or a cloud server cluster.
Example 1:
S1: collecting user personal information and preprocessing it to obtain preprocessed user personal information data, and taking the preprocessed user personal information data as a sample.
Collecting and preprocessing the user personal information in step S1 and taking the preprocessed user personal information data as a sample includes:
Collecting the personal information of a user, wherein the collected user personal information X consists of three parts:
the identity information of the user himself, which comprises the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and the login account name of a website or software;
the identity information of the user's immediate relatives;
the self-experience information of the user.
The user personal information X is preprocessed to obtain the preprocessed user personal information data, which is taken as the sample. The preprocessing flow of the user personal information X is as follows:
S11: extracting the identity information of the user himself from the user personal information X, recoding the name, birthday, identity card information, mailbox name and login account name in the identity information, and retaining the other information; the retained information and the recoding results together form the preprocessing result of the identity information.
The entries of the preprocessing result are, in order, the recoding results or the retained information for the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and login account name of a website or software.
S12: extracting the identity information of the user's immediate relatives from the user personal information X, recoding the name, birthday, identity card information, mailbox name and login account name in that identity information, and retaining the other information; the retained information and the recoding results together form the preprocessing result of the immediate relatives' identity information.
The entries of this preprocessing result are likewise, in order, the recoding results or the retained information for the name, gender, age, birthday, identity card information, mobile phone number, mailbox name and login account name of a website or software.
S13: performing word segmentation on the user's self-experience information in the user personal information X, and taking the word sequence obtained after segmentation as the preprocessing result of the self-experience information.
S14: constructing the preprocessed user personal information data Y from the preprocessing results of steps S11 to S13, and taking the preprocessed user personal information data Y as the sample.
S2: and carrying out information coding on the samples to obtain sample vectors.
And in the step S2, the information encoding is carried out on the sample, and the method comprises the following steps:
and carrying out information coding on the samples to obtain sample vectors, wherein the information coding flow of the samples Y is as follows:
S21: and carrying out binary coding on the gender, age and mobile phone number of the identity information part in the sample.
S22: Represent the recoding results of the name, birthday, identity card information, mailbox name, and login account name in the identity information parts of the sample as one-hot encodings, convert the one-hot encodings into word vectors using a BERT model, and apply self-encoding to the word vectors to obtain a self-encoding result for each recoding result, wherein the self-encoding formula of any recoding result $x$ is:
$$Q_x = W_Q * \mathrm{one\_hot}(x), \quad K_x = W_K * \mathrm{one\_hot}(x), \quad V_x = W_V * \mathrm{one\_hot}(x)$$
Wherein:
$\exp(\cdot)$ represents the exponential function with the natural constant as its base;
$x'$ represents the self-encoding result of the recoding result $x$;
$T$ represents the transpose;
$\mathrm{one\_hot}(x)$ represents the one-hot encoding of the recoding result $x$;
$*$ represents the convolution symbol;
$W_Q, W_K, W_V$ represent the convolution parameter matrices;
$Q_x, K_x, V_x$ represent the convolution vectors of the recoding result $x$;
$dim$ represents the control parameter for the length of the encoded vector;
$W_1$ represents the self-encoding matrix;
$\Omega_x$ represents the set of recoding results of the identity information type to which $x$ belongs, where the identity information types are the name, gender, age, birthday, identity card information, mobile phone number, mailbox name, and website or software login account name.
S23: Extract the word sequence of the self-experience information part of the sample, represent each word as a one-hot encoding, convert the one-hot encodings into word vectors using a BERT model, and apply self-encoding to the word vectors to obtain a self-encoding result for each word. The self-encoding results of all the words together form the self-encoding result of the self-experience information part.
S24: Encode the information data in the sample according to steps S21 to S23 to obtain the encoding results, and form the sample vector $y = (y_1, y_2, y_3)$, where $y_1, y_2, y_3$ are the self-encoding results corresponding to $Y_1, Y_2, Y_3$, respectively.
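The two elementary encodings named in S21 and S22 can be illustrated with a short Python sketch. The bit widths, the one-bit gender mapping, and the toy vocabulary are assumptions made here for illustration; the BERT conversion and the self-encoding step are deliberately left out.

```python
def binary_encode(value: int, width: int) -> list:
    """S21: fixed-width binary code of a non-negative integer, most significant bit first."""
    if value < 0 or value >= 2 ** width:
        raise ValueError("value does not fit in the given width")
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def one_hot(token: str, vocabulary: list) -> list:
    """S22/S23: one-hot vector of a token over a fixed vocabulary."""
    index = vocabulary.index(token)  # raises ValueError for unknown tokens
    return [1 if i == index else 0 for i in range(len(vocabulary))]


if __name__ == "__main__":
    print(binary_encode(30, 7))   # an age of 30 in 7 bits
    print(binary_encode(1, 1))    # gender as a single bit (assumed mapping)
    vocab = ["hanmei", "lilei", "19940501"]
    print(one_hot("lilei", vocab))
```

In this sketch the one-hot vocabulary would be built from the recoding results of one identity information type, so that each recoding result maps to exactly one position.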
S3: Construct a password sequence generation model and solve the constructed model by optimization. The password sequence generation model comprises a generator and a parser: the generator takes a sample vector as input and outputs the depth semantic features of the sample vector, and the parser takes the depth semantic features as input and outputs a password sequence.
In step S3, the optimization solving flow of the constructed password sequence generation model is as follows:
S31: Collect $U$ groups of sample vectors and user passwords to form the training data set $data$ of the model:
$$data = \{y_u, pwd_u \mid u \in [1, U]\}$$
Wherein:
$y_u$ represents the $u$-th collected group of sample vectors, and $pwd_u$ represents the user password corresponding to the $u$-th group of sample vectors $y_u$.
S32: Input the sample vectors into the password sequence generation model to generate the password sequence corresponding to each group of sample vectors.
S33: Construct the training objective function $G(\theta)$.
Wherein:
$\mathrm{code}(\cdot)$ represents the information encoding processing, whose flow is step S2;
$\theta$ represents the model parameter vector to be solved by optimization.
S34: Initialize the model parameter vector $\theta_0$ to be solved, set the current iteration number of the model parameter vector to $t$ with initial value 0 and the maximum number of iterations to $Max$, and denote the $t$-th iteration result of the model parameter vector by $\theta_t$.
S35: Calculate the iteration direction and the iteration step size of the model parameter vector:
$$c_{t-1} = g(\theta_t) - g(\theta_{t-1})$$
$$q_{t-1} = d_{t-1}\,\alpha_{t-1}$$
Wherein:
$\alpha_t$ represents the iteration step size of the $t$-th iteration;
$\|\cdot\|_2$ represents the L2 norm;
$g(\theta_t)$ represents the gradient of $G(\theta_t)$;
$d_t$ represents the iteration direction of the $t$-th iteration;
$c_{t-1}$, $q_{t-1}$, and $Min_1(t)$ are intermediate parameters of the calculation.
S36: Iterate the model parameter vector:
$$\theta_{t+1} = \theta_t + \alpha_t d_t$$
Let $t = t + 1$ and return to step S35 until the maximum number of iterations is reached; then construct the password sequence generation model from the model parameter vector obtained in the final iteration.
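The iteration in S34 to S36 can be sketched in Python on a toy problem. Everything specific below is an assumption for illustration: the objective is a simple quadratic rather than the training objective, the direction is taken as steepest descent, and the step size uses a Barzilai-Borwein rule built from the quantities $c$ and $q$, which is a swapped-in technique and not necessarily the step-size rule of the method. The update is assumed to be $\theta_{t+1} = \theta_t + \alpha_t d_t$.

```python
def gradient(theta, scales):
    """Gradient g(theta) of the toy objective G(theta) = 0.5 * sum(a_i * theta_i**2)."""
    return [a * x for a, x in zip(scales, theta)]


def optimize(theta0, scales, max_iter=100, alpha0=0.1):
    """Iterate theta_{t+1} = theta_t + alpha_t * d_t for at most max_iter steps."""
    theta = list(theta0)
    g = gradient(theta, scales)
    alpha = alpha0
    for _ in range(max_iter):
        d = [-gi for gi in g]                      # assumed direction: steepest descent
        new_theta = [x + alpha * di for x, di in zip(theta, d)]
        new_g = gradient(new_theta, scales)
        c = [b - a for a, b in zip(g, new_g)]      # c_t = g(theta_{t+1}) - g(theta_t)
        q = [alpha * di for di in d]               # q_t = d_t * alpha_t
        theta, g = new_theta, new_g
        qc = sum(qi * ci for qi, ci in zip(q, c))
        if qc <= 1e-12:                            # step is negligible: stop early
            break
        alpha = sum(qi * qi for qi in q) / qc      # assumed Barzilai-Borwein step size
    return theta


if __name__ == "__main__":
    print(optimize([1.0, -2.0], [1.0, 4.0]))
```

On this toy quadratic the iterates approach the minimizer at the origin; for a real model the gradient function would be replaced by the gradient of the training objective with respect to the model parameters.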
S4: Receive a sample vector using the password sequence generation model obtained by the optimization, generate a password sequence based on sample induction, and perform a security test on the user password using the generated password sequence.
In step S4, the password sequence generation flow is as follows:
S41: The generator receives the sample vector $y$ and divides it by information type into the sample vector $y_1$ representing the user's own identity information, the sample vector $y_2$ representing the identity information of the user's direct relatives, and the sample vector $y_3$ representing the user's self-experience information. $y_1$ and $y_2$ each contain, in order, the sample vectors representing the name, gender, age, birthday, identity card information, mobile phone number, mailbox name, and website or software login account name.
S42: Extract depth semantic features from the different types of sample vectors, with separate extraction formulas for the sample vector $y_3$ and for the other sample vectors.
Wherein:
the formula for $y_3$ gives the depth semantic feature of the sample vector $y_3$ and uses the self-encoding result of the $h$-th word in the self-experience information, that is, the sample vector corresponding to the $h$-th word;
the other formula gives the depth semantic feature of the corresponding sample vector;
$W_2, W_3$ represent the semantic feature extraction matrices.
S43: Based on the depth semantic features, the parser performs combined encoding on the information data corresponding to the sample vectors to generate the password sequence.
Wherein:
$\|\cdot\|_1$ represents the L1 norm;
the combined encoding result is computed for a plurality of depth semantic features; if it is above a preset threshold, the information data corresponding to those depth semantic features are arranged and combined to induce a password.
S44: Repeat step S43 to induce $N$ groups of passwords from the sample vector and form the password sequence, which is represented as:
$$(pwd_1, pwd_2, \ldots, pwd_n, \ldots, pwd_N)$$
Wherein:
$pwd_n$ represents the $n$-th generated group of passwords.
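The thresholding and arrange-and-combine behaviour of S43 and S44 can be sketched as follows. This is an illustrative assumption, not the parser itself: the scores are supplied by the caller as stand-ins for the combined encoding results, and `induce_passwords` with its parameters is a hypothetical helper.

```python
from itertools import permutations


def induce_passwords(groups, threshold, limit):
    """Return up to `limit` candidate passwords from scored groups of information data.

    `groups` is a list of (score, fragments) pairs; the score is a stand-in for the
    combined encoding result of S43 and is supplied by the caller in this sketch.
    """
    candidates = []
    for score, fragments in groups:
        if score <= threshold:
            continue                                # S43: keep only groups above the threshold
        for ordering in permutations(fragments):    # arrange and combine the fragments
            password = "".join(ordering)
            if password not in candidates:
                candidates.append(password)
            if len(candidates) == limit:            # S44: stop once N passwords exist
                return candidates
    return candidates


if __name__ == "__main__":
    scored = [(0.9, ["lilei", "1994"]), (0.2, ["x", "y"]), (0.8, ["hanmei", "0501"])]
    print(induce_passwords(scored, threshold=0.5, limit=3))
```

Groups whose score does not exceed the threshold contribute nothing, so the candidate list only contains combinations of information data that the model considers strongly related.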
In step S4, the security test flow for the user password using the generated password sequence is as follows:
Receive a user password $pwd$.
Calculate the intensity weight $w_{pwd}$ of the user password.
Wherein:
$w_{pwd}$ represents the intensity weight of the user password $pwd$;
$\mathrm{count}(pwd)$ represents the number of repeated characters in the user password $pwd$;
$\mathrm{label}(pwd)$ represents the number of character types in the user password $pwd$, where the character types are numbers, letters, and special symbols;
$sum_{pwd}$ represents the total number of characters in the user password $pwd$;
$\Omega(pwd)$ represents the character set of the user password $pwd$, and $s \in \Omega(pwd)$ represents any character in that set;
$p_s$ represents the frequency with which the character $s$ appears in the user personal information $X$.
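The quantities that enter the intensity weight can be computed with a few lines of Python. The sketch below only gathers these ingredients and does not combine them into a weight. Two readings are assumptions made here: the repeated-character count is taken as the number of distinct characters used more than once, and the frequency of a character is taken as its relative frequency in the personal information text.

```python
from collections import Counter


def password_statistics(pwd: str, personal_info: str) -> dict:
    """Compute the quantities that the intensity weight is defined from."""
    occurrences = Counter(pwd)
    # count(pwd): assumed to be the number of distinct characters used more than once.
    repeated = sum(1 for n in occurrences.values() if n > 1)
    # label(pwd): how many of the three character types (digit, letter, special) occur.
    types = {
        "digit" if ch.isdigit() else "letter" if ch.isalpha() else "special"
        for ch in pwd
    }
    # p_s: assumed to be the relative frequency of character s in the personal information.
    info_counts = Counter(personal_info)
    total = len(personal_info) or 1
    frequencies = {s: info_counts[s] / total for s in occurrences}
    return {
        "count": repeated,
        "label": len(types),
        "sum": len(pwd),
        "charset": set(occurrences),
        "p": frequencies,
    }


if __name__ == "__main__":
    print(password_statistics("aab1!", "abba"))
```

A password whose characters rarely occur in the personal information yields small frequency values, which matches the intent that such a password is harder to derive from that information.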
Based on the password sequence and the intensity weight, calculate the security test result $safe_{pwd}$ of the user password $pwd$.
Wherein:
$safe_{pwd}$ represents the security test result of the user password $pwd$. A higher result indicates a safer password, and a lower result indicates a greater likelihood that the password can be cracked using the user's personal information. If the result is higher than a preset threshold, the user password $pwd$ passes the security test;
$\mathrm{code}(\cdot)$ represents the information encoding processing, whose flow is step S2.
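The comparison between the user password and the generated password sequence relies on the similarity of their encoded vectors. A minimal Python sketch of that comparison follows, under clearly stated assumptions: a bag-of-characters count vector stands in for the information encoding of step S2, cosine similarity is used as the similarity measure, and the function only reports the largest similarity without combining it with the intensity weight.

```python
import math
from collections import Counter


def char_vector(text: str) -> Counter:
    """Stand-in for code(.): a bag-of-characters count vector (an assumption)."""
    return Counter(text)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity(pwd: str, password_sequence: list) -> float:
    """Largest similarity between the user password and any generated password."""
    target = char_vector(pwd)
    return max(
        (cosine_similarity(target, char_vector(p)) for p in password_sequence),
        default=0.0,
    )


if __name__ == "__main__":
    print(max_similarity("lilei1994", ["1994lilei", "hanmei0501"]))
```

A value close to 1 means that some induced password closely resembles the real one, which in the terms above corresponds to a lower security test result.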
Example 2:
Fig. 2 is a schematic structural diagram of an electronic device for implementing a sample-induced password test method according to an embodiment of the invention.
The electronic device 1 may comprise a processor 10, a memory 11, a communication interface 13 and a bus, and may further comprise a computer program, such as program 12, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, a card memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may in other embodiments be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of the program 12, but also to temporarily store data that has been output or is to be output.
The processor 10 may in some embodiments consist of integrated circuits, for example a single packaged integrated circuit or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the control unit of the electronic device: it connects the parts of the entire electronic device using various interfaces and lines, runs or executes the programs or modules stored in the memory 11 (such as the program 12 for realizing the sample-induced password test), and invokes data stored in the memory 11 to perform the functions of the electronic device 1 and process data.
The communication interface 13 may comprise a wired interface and/or a wireless interface (e.g., a Wi-Fi interface or a Bluetooth interface). It is typically used to establish a communication connection between the electronic device 1 and other electronic devices and to enable communication between the internal components of the electronic device.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable communication between the memory 11, the at least one processor 10, and the other components.
Fig. 2 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in Fig. 2 does not limit the electronic device 1, which may comprise fewer or more components than shown, combine certain components, or arrange the components differently.
For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component. Preferably, the power source is logically connected to the at least one processor 10 through a power management device, so that charge management, discharge management, power consumption management, and similar functions are implemented through the power management device. The power source may also include one or more direct current or alternating current power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, and so on, which are not described here.
The electronic device 1 may optionally further comprise a user interface, which may be a display or an input unit such as a keyboard; optionally, the user interface may also be a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be called a display screen or display unit, is used to display information processed in the electronic device 1 and to display a visual user interface.
It should be understood that the described embodiments are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The program 12 stored in the memory 11 of the electronic device 1 is a combination of instructions that, when executed by the processor 10, may implement the following:
Collecting user personal information and preprocessing it to obtain preprocessed user personal information data, and taking the preprocessed user personal information data as a sample.
Performing information encoding on the sample to obtain a sample vector.
Constructing a password sequence generation model and solving the constructed model by optimization.
Receiving a sample vector using the password sequence generation model obtained by the optimization, generating a password sequence based on sample induction, and performing a security test on the user password using the generated password sequence.
For the specific way in which the processor 10 implements the above instructions, refer to the descriptions of the related steps in the embodiments corresponding to Fig. 1 and Fig. 2, which are not repeated here.
It should be noted that the serial numbers of the embodiments of the present invention are for description only and do not indicate that one embodiment is better or worse than another. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, apparatus, article, or method that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general hardware platform, or by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing describes only preferred embodiments of the present invention and does not limit its patent scope. Any equivalent structure or equivalent process derived from this disclosure, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims (3)

1. A sample-induced password test method, the method comprising:
S1: collecting user personal information and preprocessing to obtain preprocessed user personal information data, wherein the preprocessed user personal information data is taken as a sample;
S2: information encoding is carried out on the samples to obtain sample vectors;
S3: constructing a password sequence generation model and solving the constructed model by optimization, wherein the password sequence generation model comprises a generator and a parser, the generator takes a sample vector as input and outputs the depth semantic features of the sample vector, and the parser takes the depth semantic features as input and outputs a password sequence;
the optimization solving flow of the password sequence generation model is as follows:
S31: collecting $U$ groups of sample vectors and user passwords to form the training data set $data$ of the model:
$$data = \{y_u, pwd_u \mid u \in [1, U]\};$$
Wherein:
$y_u$ represents the $u$-th collected group of sample vectors, and $pwd_u$ represents the user password corresponding to the $u$-th group of sample vectors $y_u$;
S32: inputting the sample vectors into the password sequence generation model to generate the password sequence corresponding to each group of sample vectors;
S33: constructing the training objective function $G(\theta)$:
Wherein:
$\mathrm{code}(\cdot)$ represents the information encoding processing, whose flow is step S2;
$\theta$ represents the model parameter vector to be solved by optimization;
S34: initializing the model parameter vector $\theta_0$ to be solved, setting the current iteration number of the model parameter vector to $t$ with initial value 0 and the maximum number of iterations to $Max$, and denoting the $t$-th iteration result of the model parameter vector by $\theta_t$;
S35: calculating the iteration direction and the iteration step size of the model parameter vector:
$$c_{t-1} = g(\theta_t) - g(\theta_{t-1});$$
$$q_{t-1} = d_{t-1}\,\alpha_{t-1}$$
Wherein:
$\alpha_t$ represents the iteration step size of the $t$-th iteration;
$\|\cdot\|_2$ represents the L2 norm;
$g(\theta_t)$ represents the gradient of $G(\theta_t)$;
$d_t$ represents the iteration direction of the $t$-th iteration;
$c_{t-1}$, $q_{t-1}$, and $Min_1(t)$ are intermediate parameters of the calculation;
S36: iterating the model parameter vector:
$$\theta_{t+1} = \theta_t + \alpha_t d_t$$
letting $t = t + 1$ and returning to step S35 until the maximum number of iterations is reached, and constructing the password sequence generation model from the model parameter vector obtained in the final iteration;
S4: receiving a sample vector using the password sequence generation model obtained by the optimization, generating a password sequence based on sample induction, and performing a security test on the user password using the generated password sequence;
the password sequence generation flow is as follows:
S41: the generator receives a sample vector $y$ and divides it by information type into a sample vector $y_1$ representing the user's own identity information, a sample vector $y_2$ representing the identity information of the user's direct relatives, and a sample vector $y_3$ representing the user's self-experience information, wherein $y_1$ and $y_2$ each contain, in order, the sample vectors representing the name, gender, age, birthday, identity card information, mobile phone number, mailbox name, and website or software login account name;
S42: extracting depth semantic features from the different types of sample vectors, with separate extraction formulas for the sample vector $y_3$ and for the other sample vectors:
Wherein:
the formula for $y_3$ gives the depth semantic feature of the sample vector $y_3$ and uses the self-encoding result of the $h$-th word in the self-experience information, that is, the sample vector corresponding to the $h$-th word;
the other formula gives the depth semantic feature of the corresponding sample vector;
$W_2, W_3$ represent the semantic feature extraction matrices;
S43: the parser performs, based on the depth semantic features, combined encoding on the information data corresponding to the sample vectors to generate the password sequence:
Wherein:
$\|\cdot\|_1$ represents the L1 norm;
the combined encoding result is computed for a plurality of depth semantic features, and if it is above a preset threshold, the information data corresponding to those depth semantic features are arranged and combined to induce a password;
S44: repeating step S43 to induce $N$ groups of passwords from the sample vector and form the password sequence, which is represented as:
$$(pwd_1, pwd_2, \ldots, pwd_n, \ldots, pwd_N);$$
Wherein:
$pwd_n$ represents the $n$-th generated group of passwords;
performing a security test on the user password using the generated password sequence;
the security test flow of the user password is as follows:
receiving a user password $pwd$;
calculating the intensity weight of the user password:
Wherein:
$w_{pwd}$ represents the intensity weight of the user password $pwd$;
$\mathrm{count}(pwd)$ represents the number of repeated characters in the user password $pwd$;
$\mathrm{label}(pwd)$ represents the number of character types in the user password $pwd$, wherein the character types are numbers, letters, and special symbols;
$sum_{pwd}$ represents the total number of characters in the user password $pwd$;
$\Omega(pwd)$ represents the character set of the user password $pwd$, $s \in \Omega(pwd)$, and $s$ represents any character in the character set $\Omega(pwd)$;
$p_s$ represents the frequency with which the character $s$ appears in the user personal information $X$;
calculating, based on the password sequence and the intensity weight, the security test result of the user password $pwd$:
Wherein:
$safe_{pwd}$ represents the security test result of the user password $pwd$, wherein a higher security test result indicates a safer user password $pwd$, a lower result indicates a greater likelihood that the password can be cracked using the user's personal information, and if the security test result is higher than a preset threshold, the user password $pwd$ passes the security test;
$\mathrm{code}(\cdot)$ represents the information encoding processing, whose flow is step S2.
2. The sample-induced password test method according to claim 1, wherein collecting and preprocessing user personal information in step S1 and taking the preprocessed user personal information data as a sample comprises:
collecting the user personal information, which is represented in the form:
$$X = (X_1, X_2, X_3);$$
Wherein:
$X$ represents the user personal information;
$X_1$ represents the identity information of the user himself or herself, the identity information comprising the name, gender, age, birthday, identity card information, mobile phone number, mailbox name, and website or software login account name;
$X_2$ represents the identity information of the user's direct relatives;
$X_3$ represents the self-experience information of the user;
preprocessing the user personal information $X$ to obtain preprocessed user personal information data, and taking the preprocessed user personal information data as a sample, wherein the preprocessing flow of the user personal information $X$ is as follows:
S11: extracting the identity information $X_1$ of the user from the user personal information $X$, recoding the name, birthday, identity card information, mailbox name, and login account name in the identity information $X_1$, retaining the other information, and forming the preprocessing result $Y_1$ of the identity information $X_1$ from the retained information and the recoding results:
Wherein:
the entries of $Y_1$ are, in order, the recoding results or retained values of the name, gender, age, birthday, identity card information, mobile phone number, mailbox name, and website or software login account name in the identity information $X_1$;
S12: extracting the identity information $X_2$ of the user's direct relatives from the user personal information $X$, recoding the name, birthday, identity card information, mailbox name, and login account name in the identity information $X_2$, retaining the other information, and forming the preprocessing result $Y_2$ of the identity information $X_2$ from the retained information and the recoding results:
Wherein:
the entries of $Y_2$ are, in order, the recoding results or retained values of the name, gender, age, birthday, identity card information, mobile phone number, mailbox name, and website or software login account name in the identity information $X_2$;
S13: performing word segmentation on the self-experience information $X_3$ of the user in the user personal information $X$, and taking the resulting word sequence $Y_3$ as the preprocessing result of $X_3$;
S14: forming the preprocessed user personal information data $Y = (Y_1, Y_2, Y_3)$, and taking the preprocessed user personal information data $Y$ as the sample.
3. The sample-induced password test method according to claim 2, wherein the information encoding of the sample in step S2 comprises:
performing information encoding on the sample to obtain a sample vector, wherein the information encoding flow of the sample $Y$ is as follows:
S21: binary-encoding the gender, age, and mobile phone number in the identity information parts of the sample;
S22: representing the recoding results of the name, birthday, identity card information, mailbox name, and login account name in the identity information parts of the sample as one-hot encodings, converting the one-hot encodings into word vectors using a BERT model, and applying self-encoding to the word vectors to obtain a self-encoding result for each recoding result, wherein the self-encoding formula of any recoding result $x$ is:
$$Q_x = W_Q * \mathrm{one\_hot}(x), \quad K_x = W_K * \mathrm{one\_hot}(x), \quad V_x = W_V * \mathrm{one\_hot}(x);$$
Wherein:
$\exp(\cdot)$ represents the exponential function with the natural constant as its base;
$x'$ represents the self-encoding result of the recoding result $x$;
$T$ represents the transpose;
$\mathrm{one\_hot}(x)$ represents the one-hot encoding of the recoding result $x$;
$*$ represents the convolution symbol;
$W_Q, W_K, W_V$ represent the convolution parameter matrices;
$Q_x, K_x, V_x$ represent the convolution vectors of the recoding result $x$;
$dim$ represents the control parameter for the length of the encoded vector;
$W_1$ represents the self-encoding matrix;
$\Omega_x$ represents the set of recoding results of the identity information type to which the recoding result $x$ belongs, wherein the identity information types are the name, gender, age, birthday, identity card information, mobile phone number, mailbox name, and website or software login account name;
S23: extracting the word sequence of the self-experience information part of the sample, representing each word as a one-hot encoding, converting the one-hot encodings into word vectors using a BERT model, applying self-encoding to the word vectors to obtain a self-encoding result for each word, and forming the self-encoding result of the self-experience information part from the self-encoding results of all the words;
S24: encoding the information data in the sample according to steps S21 to S23 to obtain the encoding results, and forming the sample vector $y = (y_1, y_2, y_3)$, wherein $y_1, y_2, y_3$ are the self-encoding results corresponding to $Y_1, Y_2, Y_3$, respectively.
CN202410205659.4A 2024-02-26 2024-02-26 Sample-induced password test method Active CN117786664B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410205659.4A CN117786664B (en) 2024-02-26 2024-02-26 Sample-induced password test method

Publications (2)

Publication Number Publication Date
CN117786664A CN117786664A (en) 2024-03-29
CN117786664B true CN117786664B (en) 2024-05-24

Family

ID=90394827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410205659.4A Active CN117786664B (en) 2024-02-26 2024-02-26 Sample-induced password test method

Country Status (1)

Country Link
CN (1) CN117786664B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106803035A (en) * 2016-11-30 2017-06-06 中国科学院信息工程研究所 A kind of password conjecture set creation method and password cracking method based on username information
CN108763918A (en) * 2018-04-10 2018-11-06 华东师范大学 A kind of password reinforcement method based on semantic transforms
CN109670303A (en) * 2018-12-26 2019-04-23 网智天元科技集团股份有限公司 The cryptographic attack appraisal procedure encoded certainly based on condition variation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10325091B2 (en) * 2016-08-25 2019-06-18 International Business Machines Corporation Generation of secure passwords in real-time using personal data
US10540490B2 (en) * 2017-10-25 2020-01-21 International Business Machines Corporation Deep learning for targeted password generation with cognitive user information understanding

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
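Deep Learning for Password Guessing and Password Strength Evaluation, A Survey; Tao Zhang et al.; TRUSTCOM 2020; 20210209; full text *
Research on targeted password guessing based on neural networks; Zhou Huan; Liu Qixu; Cui Xiang; Zhang Fangjiao; Journal of Cyber Security; 20180915 (No. 05); full text *
A survey of plaintext password generation models; Zhou Hao; Wang Jingkang; Wang Bo; Luo Yutao; Ma Zewen; Liu Gongshen; Computer Engineering and Applications; 20180215 (No. 04); full text *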

Also Published As

Publication number Publication date
CN117786664A (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN112541745B (en) User behavior data analysis method and device, electronic equipment and readable storage medium
CN112667800A (en) Keyword generation method and device, electronic equipment and computer storage medium
CN111460807A (en) Sequence labeling method and device, computer equipment and storage medium
CN112560453B (en) Voice information verification method and device, electronic equipment and medium
CN111652279B (en) Behavior evaluation method and device based on time sequence data and readable storage medium
CN112380439B (en) Target object recommendation method and device, electronic equipment and computer readable storage medium
CN113688923B (en) Intelligent order anomaly detection method and device, electronic equipment and storage medium
CN111694844B (en) Enterprise operation data analysis method and device based on configuration algorithm and electronic equipment
CN111523094B (en) Deep learning model watermark embedding method and device, electronic equipment and storage medium
CN110110518B (en) Password strength evaluation method, device and computer readable storage medium
CN113821622B (en) Answer retrieval method and device based on artificial intelligence, electronic equipment and medium
CN113704614A (en) Page generation method, device, equipment and medium based on user portrait
CN113886708A (en) Product recommendation method, device, equipment and storage medium based on user information
CN113360654B (en) Text classification method, apparatus, electronic device and readable storage medium
CN117786664B (en) Sample-induced password test method
CN116578696A (en) Text abstract generation method, device, equipment and storage medium
CN116705304A (en) Multi-mode task processing method, device, equipment and medium based on image text
CN113515591A (en) Method and device for identifying harmful information in text, electronic equipment and storage medium
CN117271755B (en) Custom closed-loop rule engine management control method based on artificial intelligence
CN111680513B (en) Feature information identification method and device and computer readable storage medium
CN113822049B (en) Address auditing method, device, equipment and storage medium based on artificial intelligence
CN112328796B (en) Text clustering method, device, equipment and computer readable storage medium
CN114723523B (en) Product recommendation method, device, equipment and medium based on user capability portrait
CN115146627B (en) Entity identification method, entity identification device, electronic equipment and storage medium
CN111414452B (en) Search word matching method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant