CN115664739B - User identity attribute active detection method and system based on flow characteristic matching - Google Patents

Info

Publication number
CN115664739B
Authority
CN
China
Prior art keywords
flow
message
target
length
text message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211266240.7A
Other languages
Chinese (zh)
Other versions
CN115664739A (en)
Inventor
郭山清
吕凤岩
胡程瑜
唐朋
刘成
刘晓峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN202211266240.7A
Publication of CN115664739A
Application granted
Publication of CN115664739B
Legal status: Active
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a method and a system for actively detecting user identity attributes based on flow characteristic matching. The method acquires the IP addresses of a target group and a target user client to be detected; constructs a first test user account and a second test user account and adds both test user accounts to the target group; sends a detection text message to the target group through the first test user account; captures the traffic generated by the target user client and the traffic generated by the second test user account; extracts the encrypted traffic of the instant messaging application from each capture; constructs, according to the mapping relation between text message length and traffic length, a message traffic event queue for the encrypted traffic extracted for the target user client and for the second test user account; calculates the optimal association degree between the two message traffic event queues; and compares the optimal association degree with a set threshold to determine whether the target user client has the attribute of the target group.

Description

User identity attribute active detection method and system based on flow characteristic matching
Technical Field
The invention relates to the technical field of identity authentication, in particular to a method and a system for actively detecting user identity attribute based on flow characteristic matching.
Background
The statements in this section merely relate to the background of the present disclosure and may not necessarily constitute prior art.
Popular instant messaging (IM) applications almost all deploy encryption schemes (end-to-end, or end-to-intermediate-to-end) to secure users' communications. Existing methods for detecting user identity attributes on IM applications are either based on vulnerability mining or are passive methods based on flow characteristic matching. Both are difficult to implement and perform poorly.
(1) Because IM applications all deploy encryption schemes, vulnerability mining mainly targets weaknesses in the encryption protocol and bugs introduced by developers when implementing the protocol. Mainstream IM applications use state-of-the-art encryption, so protocol weaknesses are hard to analyze and mining them yields little; finding implementation bugs requires solid expertise in code analysis and reverse engineering and is affected by many unstable factors. Moreover, once a vulnerability is discovered, the IM provider quickly fixes it.
(2) IM applications have very large user bases, and client environments vary widely. Passive detection based on flow characteristic matching is limited by many factors. Although it can achieve some effect in a simulated environment, real-world results often fall short of expectations, because the actual environment of the target user is usually more complex than the simulated one, so detection takes longer and is harder to complete successfully.
The inventors found that the prior art cannot determine a user's identity attribute without revealing the user's private information.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a method and a system for actively detecting user identity attributes based on flow characteristic matching. An IM application account joins the target group and channel and collects their messages and traffic; the correspondence between the messages and the traffic is analyzed; an active detection text message is constructed and sent; and flow characteristic matching is performed on the encrypted communication traffic of the target user and of the target group to judge whether the target user has the identity attribute of the target group. This addresses the high detection difficulty, long detection time, and unstable accuracy of passive detection methods.
In a first aspect, the present invention provides a method for actively detecting user identity attributes based on flow characteristic matching.
The method is applied to an instant messaging application server and comprises the following steps:
Acquiring the IP addresses of a target group and a target user client to be detected, wherein the attribute of the target group is known; constructing a first test user account and a second test user account, and adding both test user accounts to the target group; sending a detection text message to the target group through the first test user account;
While the detection text message is being sent, capturing the traffic generated by the target user client and the traffic generated by the second test user account, respectively; extracting the encrypted traffic of the instant messaging application from each of the two captures;
According to the mapping relation between text message length and traffic length, constructing a message traffic event queue for the encrypted traffic extracted for the target user client and for the second test user account, respectively; calculating the optimal association degree between the two message traffic event queues;
Comparing the optimal association degree with a set threshold to determine whether the target user client has the attribute of the target group.
In a second aspect, the invention provides a system for actively detecting user identity attributes based on flow characteristic matching.
The system is applied to an instant messaging application server and comprises:
An acquisition module configured to: acquire the IP addresses of a target group and a target user client to be detected, wherein the attribute of the target group is known; construct a first test user account and a second test user account, and add both test user accounts to the target group;
A detection text message sending module configured to: send a detection text message to the target group through the first test user account;
An encrypted traffic extraction module configured to: while the detection text message is being sent, capture the traffic generated by the target user client and the traffic generated by the second test user account, respectively; extract the encrypted traffic of the instant messaging application from each of the two captures;
An event queue construction module configured to: according to the mapping relation between text message length and traffic length, construct a message traffic event queue for the encrypted traffic extracted for the target user client and for the second test user account, respectively; calculate the optimal association degree between the two message traffic event queues;
An attribute detection module configured to: compare the optimal association degree with a set threshold to determine whether the target user client has the attribute of the target group.
Compared with the prior art, the invention has the beneficial effects that:
(1) Compared with methods based on vulnerability mining, the method avoids the high difficulty of finding vulnerabilities and the many factors that affect them, and greatly improves the stability of detection;
(2) Compared with passive detection methods based on flow characteristic matching, the method constructs and actively sends detection text messages, so it achieves a good detection result in a short time and overcomes the problems caused by the complex environment of the target user, long detection time, and unstable accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a system block diagram of the present invention;
FIG. 2 is a diagram of a message and traffic structure of the present invention;
FIG. 3 is an algorithm block diagram of the flow feature matching module of the present invention;
FIG. 4 is a flowchart of the overall process of the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, unless the context clearly indicates otherwise, the singular forms also are intended to include the plural forms, and furthermore, it is to be understood that the terms "comprises" and "comprising" and any variations thereof are intended to cover non-exclusive inclusions, such as, for example, processes, methods, systems, products or devices that comprise a series of steps or units, are not necessarily limited to those steps or units that are expressly listed, but may include other steps or units that are not expressly listed or inherent to such processes, methods, products or devices.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
All data acquisition in the embodiments is a lawful use of data that complies with laws and regulations and has the consent of users.
Explanation of terms: Internet Protocol address, abbreviated as IP address. The IP address is a unified address format provided by the IP protocol. It assigns a logical address to each network and each host on the Internet, thereby masking differences in physical addresses.
Example 1
This embodiment provides a method for actively detecting user identity attributes based on flow characteristic matching.
The method is applied to an instant messaging application server and comprises the following steps:
S101: acquiring the IP addresses of a target group and a target user client to be detected, wherein the attribute of the target group is known; constructing a first test user account and a second test user account, and adding both test user accounts to the target group;
S102: sending a detection text message to the target group through the first test user account;
S103: while the detection text message is being sent, capturing the traffic generated by the target user client and the traffic generated by the second test user account, respectively; extracting the encrypted traffic of the instant messaging application from each of the two captures;
S104: according to the mapping relation between text message length and traffic length, constructing a message traffic event queue for the encrypted traffic extracted for the target user client and for the second test user account, respectively; calculating the optimal association degree between the two message traffic event queues;
S105: comparing the optimal association degree with a set threshold to determine whether the target user client has the attribute of the target group.
Further, after the two test user accounts are added to the target group and before the detection text message is sent to the target group through the first test user account, the method further includes: acquiring, through the first test user account, the historical text messages generated during communication in the target group and the traffic corresponding to them; obtaining the mapping relation between text message length and traffic length; counting the occurrence frequency of text messages of different lengths; and constructing the detection text message according to those occurrence frequencies.
For a target group whose attribute is known, this technical scheme can assist a regulatory authority in judging the identity attribute of a target user client whose IP address is known: whether that client has joined the current target group, and therefore whether it has the attribute of that group. This helps the authority make identity decisions about the target user client.
The technical scheme can likewise assist the regulatory authority in judging, for a target group with a known attribute, the identity attribute of the user clients at all IP addresses: whether the user client at each IP address has joined the current target group and whether it has the attribute of that group. This helps the authority make identity decisions about those user clients.
For example, a target group may be suspected of having attributes such as "gambling" or "telecom fraud". To protect users' privacy rights, the instant messaging application server cannot give the regulatory authority users' basic information or chat content, but it can provide auxiliary analysis of whether certain user client IP addresses have joined the group, so that the authority can remind the target users to keep their funds safe.
Further, the traffic corresponding to the historical text messages generated during communication in the target group is acquired through the first test user account using application programming interface (API) functions and the tcpdump network packet capture tool.
It should be understood that a mapping relation between text message length and traffic length exists because the structure of the transmitted message is fixed and the encryption algorithm is built into the program, so the text message length and the encrypted traffic length correspond in a definite way.
Further, the mapping relation between text message length and traffic length is obtained by pairing, at the same time point, each text message length with the corresponding traffic length.
Further, counting the occurrence frequency of text messages of different lengths and constructing the detection text message according to those frequencies specifically includes:
Finding, from the occurrence frequencies of text messages of different lengths, the text message length with the lowest occurrence frequency, and constructing a detection text message of that length that fits the chat scene of the target group.
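The selection of the least frequent length can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name rarest_length, the sample data, and the tie-breaking rule (prefer the shorter length) are hypothetical and do not come from the patent text.

```python
from collections import Counter


def rarest_length(history_lengths):
    """Return the text message length that occurs least often in the history.

    Ties are broken in favour of the shorter length. This rule is an
    assumption made for determinism; the source does not specify one.
    """
    if not history_lengths:
        raise ValueError("history must contain at least one message length")
    counts = Counter(history_lengths)
    return min(counts, key=lambda length: (counts[length], length))


# Lengths (in characters) of historical group messages, made up for the example.
history = [5, 5, 5, 12, 12, 30]
probe_length = rarest_length(history)
```

A rare length is useful because few ordinary group messages will produce encrypted packets of the same size during the detection window, which reduces accidental matches.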
Further, the detection text message that fits the chat scene of the target group is constructed, according to the text message length with the lowest occurrence frequency, by a trained neural network. The training process of the neural network includes:
Constructing a training set consisting of historical text messages of the target group with known text message topics, together with target text lengths; taking the text message topic and the target text length as the input of the neural network and the historical text message as its output; and training the neural network to obtain the trained neural network, wherein the length of the output of the neural network equals the target text length.
Illustratively, the text message subject includes: "gambling", "telecommunications fraud", etc.
Further, constructing the detection text message that fits the chat scene of the target group according to the text message length with the lowest occurrence frequency specifically means:
Taking the text message topic of the target group and the text message length with the lowest occurrence frequency as the input of the neural network, and outputting a detection text message that fits the chat scene of the target group; the length of this detection text message equals the text message length with the lowest occurrence frequency.
Further, capturing the traffic generated by the target user client and the traffic generated by the second test user account while the detection text message is being sent means: while the detection text message is being sent, and according to its sending time, the tcpdump network packet capture tool is used to capture the traffic generated by the target user client and the traffic generated by the second test user account, respectively.
It should be understood that the present invention assumes that the second test user account has joined only the target group, whereas the target user client may have joined the target group, not joined it, or joined other groups in addition to it.
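As a rough illustration of the capture step, the Python sketch below only builds a tcpdump command line. The interface name, IP address, and output file are placeholder assumptions; the helper tcpdump_command is hypothetical. The tcpdump options used (-i for the interface, -w for the output file, and the host filter) are standard.

```python
import shlex


def tcpdump_command(interface, host_ip, output_path):
    """Build a tcpdump invocation that records all traffic to or from host_ip."""
    return ["tcpdump", "-i", interface, "-w", output_path, "host", host_ip]


# Placeholder values for illustration only.
cmd = tcpdump_command("eth0", "192.0.2.10", "target_client.pcap")
printable = shlex.join(cmd)
```

In practice such a command would be started shortly before the detection text message is sent and stopped after the sending window ends, once for the target user client and once for the second test user account. Running tcpdump normally requires capture privileges.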
Further, the detection text message is actively sent to the target group using, for example, an API function of the IM application or a software wizard script.
Further, in extracting the encrypted traffic of the instant messaging application from the traffic captured for the target user client and for the second test user account, encrypted traffic refers to traffic that is sent to the Internet after all content that must be encrypted has been encrypted by the encryption algorithm according to the protocol specification of the instant messaging application. The content that must be encrypted includes text messages, message attachment headers, and message lengths.
Further, the extraction of the encrypted traffic includes:
Determining the length of the encrypted traffic of the instant messaging application from the length of the detection text message and the mapping relation between text message length and traffic length;
Extracting the encrypted traffic of the instant messaging application according to that encrypted traffic length.
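The two extraction steps above can be sketched in Python as a lookup followed by a length filter. This is a minimal sketch under assumptions: packets are represented as (timestamp, length) tuples, the mapping is a plain dictionary, and the names extract_probe_packets and tolerance as well as all numeric values are hypothetical.

```python
def extract_probe_packets(packets, length_map, probe_text_length, tolerance=0):
    """Keep only packets whose length matches the expected encrypted length.

    packets: iterable of (timestamp_seconds, packet_length_bytes) tuples.
    length_map: dict mapping text message length -> encrypted packet length.
    tolerance: allowed deviation in bytes (an assumption; 0 means exact match).
    """
    expected = length_map[probe_text_length]
    return [(t, n) for t, n in packets if abs(n - expected) <= tolerance]


# Made-up mapping and capture for illustration.
length_map = {30: 148}
captured = [(0.10, 66), (0.42, 148), (0.90, 1500), (1.35, 148)]
probe_packets = extract_probe_packets(captured, length_map, probe_text_length=30)
```

Packets of other sizes, such as acknowledgements or unrelated bulk transfers, are dropped, so only candidates for the detection text message remain.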
Further, in constructing message traffic event queues for the encrypted traffic extracted for the target user client and for the second test user account, each message traffic event is a traffic packet with two attributes, time and length, and each message traffic event queue is a sequence of message traffic events arranged in chronological order.
The message traffic event queue is constructed as follows: the time and length attributes of each network packet in the encrypted traffic are converted one by one into message traffic events, and the events are arranged in chronological order to obtain the message traffic event queue.
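A minimal Python sketch of this construction follows. The class name MessageTrafficEvent, the function build_event_queue, and the sample packets are illustrative assumptions; only the two attributes (time and length) and the chronological ordering come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageTrafficEvent:
    time: float   # packet timestamp in seconds
    length: int   # packet length in bytes


def build_event_queue(packets):
    """Convert (timestamp, length) packets into events sorted by time."""
    events = [MessageTrafficEvent(time=t, length=n) for t, n in packets]
    return sorted(events, key=lambda event: event.time)


# Packets may arrive out of order in a capture; the queue is time ordered.
queue = build_event_queue([(1.35, 148), (0.42, 148), (2.80, 148)])
```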
Further, calculating the optimal association degree of the message traffic event queues of the target user client and the second test user account includes:
Setting the upper and lower limits of the time difference between the two message traffic event queues and the sliding time of each step; sliding the two message traffic event queues against each other step by step; at each step, counting the events in the target user client event queue that can be associated with events in the second test user account event queue, and dividing that count by the total number of events in the second test user account event queue to obtain an association degree; the largest association degree obtained over all steps is the optimal association degree. Two events can be associated when both their time difference and their length difference are within set ranges.
Specifically, the encrypted traffic packets of the target user and the target group are traversed; key features such as time and length are extracted from the traffic packets that match the length mapping of the detection text message, giving the message traffic event queues of the target user and the target group; the two queues are then slid against each other within set limits forward and backward in steps of 0.1 s to remove the interference of network jitter and obtain the optimal association degree of the two queues.
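The sliding comparison can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: events are (time, length) tuples, the offset is added to the target user's timestamps, a reference event counts as associated if any shifted target event falls within the tolerances, and the names and tolerance values (time_tol, len_tol) are assumptions. Only the 0.1 s step and the division by the reference queue size come from the description.

```python
def association_degree(target, reference, offset, time_tol, len_tol):
    """Fraction of reference events matched by a target event shifted by offset."""
    if not reference:
        return 0.0
    matched = 0
    for ref_time, ref_len in reference:
        if any(abs((t + offset) - ref_time) <= time_tol
               and abs(n - ref_len) <= len_tol
               for t, n in target):
            matched += 1
    return matched / len(reference)


def optimal_association(target, reference, t_min, t_max,
                        step=0.1, time_tol=0.05, len_tol=0):
    """Try every offset from t_min to t_max in increments of step; keep the best."""
    steps = int(round((t_max - t_min) / step))
    best = 0.0
    for i in range(steps + 1):
        offset = t_min + i * step
        best = max(best, association_degree(target, reference, offset,
                                            time_tol, len_tol))
    return best


# Made-up queues: the target client lags the reference account by about 0.3 s.
reference = [(10.0, 148), (12.0, 148), (15.0, 148)]
target = [(10.3, 148), (12.3, 148), (13.0, 88), (15.3, 148)]
best = optimal_association(target, reference, t_min=-1.0, t_max=1.0)
unshifted = association_degree(target, reference, 0.0, 0.05, 0)
```

In this example no events align without shifting, while an offset near -0.3 s aligns all three reference events, which is why the search over offsets is needed to absorb network delay and jitter.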
Further, comparing the optimal association degree with a set threshold to determine whether the target user client has the attribute of the target group specifically includes:
If the optimal association degree is smaller than the set threshold, the target user is not in the target group; if it is greater than the threshold, the target user is in the target group.
Alternatively, comparing the optimal association degree with a set threshold to determine whether the target user client has the attribute of the target group specifically includes:
Judging whether the target user has the identity attribute of the target group through the following reality-based hypothesis test:
Let G be the target group. For any IM user U, the goal of the detection is to decide which of the following hypotheses is true:
H0: user U is not associated with the target group G, i.e., user U is not a member of group G, and the association degree is smaller than the decision threshold;
H1: user U is associated with the target group G, i.e., user U is a member of group G, and the association degree is greater than the decision threshold.
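The decision rule can be written as a one-line comparison, sketched here in Python. The function name decide and the threshold value 0.6 are illustrative assumptions. The description does not say how an association degree exactly equal to the threshold is treated; this sketch assigns that case to H0.

```python
def decide(optimal_association, threshold):
    """Return "H1" (member of G) if the degree exceeds the threshold, else "H0"."""
    # Equality is treated as H0 here; the source leaves this case unspecified.
    return "H1" if optimal_association > threshold else "H0"


# Threshold chosen arbitrarily for the example.
verdict_member = decide(0.9, threshold=0.6)
verdict_non_member = decide(0.2, threshold=0.6)
```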
FIG. 1 shows the system block diagram of the present invention.
In the part of the system that models the mapping between messages and their encrypted traffic, an IM application account joins the target group and channel through API functions and calls API functions to obtain message records, while the tcpdump tool captures the encrypted communication traffic of the IM application. The mapping relation between messages and traffic packets is then analyzed, and features such as their length, time, and occurrence frequency are counted. Because mainstream IM applications limit the number of calls to the API function for joining groups, this step takes some time.
The statistical model is built as follows:
Input: message record JSON file and traffic pcap file;
Output: model feature mod file;
① Read the message structures messages[] from the message record JSON file;
② Read the packet structures packets[] from the traffic pcap file, and obtain the message lengths in messages[];
③ Traverse messages[] to extract the length and time information of each text message;
④ Traverse packets[] to extract the length and time information of each encrypted traffic packet;
⑤ Traverse messages[] and packets[], and obtain the mapping relation between messages and packets according to their time order and length features;
⑥ Count the occurrence frequency of each length;
⑦ Create the mod file and write each feature node to it.
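Steps ③ to ⑥ above can be sketched in Python on in-memory data. This is an illustrative sketch under assumptions, not the patented procedure: file parsing is omitted (reading a pcap file would need a third-party library), messages and packets are given as (time, length) tuples, each message is paired with the first unused packet that arrives within max_delay seconds after it, and the names build_length_model and max_delay are hypothetical.

```python
from collections import Counter


def build_length_model(messages, packets, max_delay=2.0):
    """Pair messages with packets by time and summarise the length statistics.

    messages: list of (send_time, text_length).
    packets: list of (arrival_time, packet_length).
    Returns (length_map, frequency): text length -> most common packet
    length, and text length -> number of occurrences.
    """
    observed = {}
    frequency = Counter()
    remaining = sorted(packets)
    for send_time, text_len in sorted(messages):
        frequency[text_len] += 1
        for i, (arrival, pkt_len) in enumerate(remaining):
            if 0 <= arrival - send_time <= max_delay:
                observed.setdefault(text_len, Counter())[pkt_len] += 1
                del remaining[i]  # each packet is paired at most once
                break
    length_map = {text_len: counts.most_common(1)[0][0]
                  for text_len, counts in observed.items()}
    return length_map, dict(frequency)


# Made-up history: two 5-character messages and one 20-character message.
messages = [(1.0, 5), (3.0, 5), (6.0, 20)]
packets = [(1.2, 101), (3.1, 101), (4.5, 66), (6.3, 116)]
length_map, frequency = build_length_model(messages, packets)
```

The resulting dictionaries correspond to the feature nodes that step ⑦ would write to the model file; the on-disk format of that file is not described in the source and is therefore not shown.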
The detection text message construction and active sending part constructs the detection text message and actively sends it to the target group using, for example, an API function of the IM application or a software wizard script.
The traffic capture and filtering part uses the tcpdump tool to capture the traffic of the target user and of the target group, traverses all traffic packets, and filters out the IM encrypted communication traffic according to its length features.
The flow characteristic matching part filters out redundant packets, such as IM application heartbeat packets, according to the length features of the encrypted traffic packets; analyzes the encrypted traffic of the target user and the target group to obtain their respective message traffic event queues; slides the two queues against each other within set limits forward and backward in steps of 0.1 s to remove the interference of network jitter; and obtains the optimal association degree of the two queues.
The hypothesis decision part sets a decision threshold based on the results of multiple experiments and, combining the obtained optimal association degree with the reality-based hypothesis test, judges whether the association degree is greater than the threshold. If it is greater than the threshold, the target user is associated with the target group and has the identity attribute of the target group; if it is smaller, the target user is not associated with the target group and does not have that identity attribute.
FIG. 2 shows the message and traffic structure of the present invention. In the linked list:
node["text_length"] stores the length of the text message;
node["packet_length"] stores the length of the encrypted traffic packet corresponding to the message;
node["time_diff"] stores the time difference between the sending time of the text message and the arrival time at the client of the encrypted traffic packet forwarded by the server; it is used to determine the timing error of the active detection text message;
node["num_ Lengtha"] stores the occurrence frequency of each text message length and its corresponding encrypted traffic packet length.
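One possible in-memory form of such a node is sketched below in Python. The field names text_length, packet_length, and time_diff follow the description; the class name MappingNode and the field name count (standing in for the frequency entry) are assumptions, as are the sample values.

```python
from dataclasses import dataclass, asdict


@dataclass
class MappingNode:
    text_length: int     # length of the text message
    packet_length: int   # length of the corresponding encrypted packet
    time_diff: float     # send-to-arrival time difference in seconds
    count: int = 1       # occurrence frequency (field name is an assumption)


# Made-up nodes for illustration.
nodes = [MappingNode(5, 101, 0.21, count=2), MappingNode(30, 148, 0.27)]
rarest = min(nodes, key=lambda node: node.count)
as_dict = asdict(nodes[0])
```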
FIG. 3 shows the algorithm structure of the flow characteristic matching module of the present invention. T_min and T_max denote the lower and upper limits of the time difference caused by network jitter, N_G is the total number of message traffic events of the target group, and N_max is the maximum number of associated events.
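Under the reading that the optimal association degree is the largest ratio of associated events to target group events over all tested time offsets, the quantities of FIG. 3 can be related as follows. The symbols δ (time offset), Δ (sliding step, 0.1 s in the description), N(δ) (number of associated events at offset δ), and ρ* (optimal association degree) are notation introduced here for illustration and do not appear in the source.

```latex
N_{\max} = \max_{\delta \in \{T_{\min},\; T_{\min}+\Delta,\; \dots,\; T_{\max}\}} N(\delta),
\qquad
\rho^{*} = \frac{N_{\max}}{N_G}
```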
FIG. 4 shows the overall process flowchart of the present invention.
Using an instant messaging application account, messages are collected and traffic is captured, the occurrence frequency of text messages of different lengths is counted, and the mapping relation between text message length and traffic length is established. A detection text message is constructed according to the occurrence frequency of text messages of different lengths and is actively sent. While the detection text message is being sent, the traffic of the target user and of the target group is captured, and the IM application's encrypted traffic is filtered out of it. The encrypted traffic of the IM application is analyzed, the message traffic event queues of the target user and the target group are extracted, and the optimal association degree between the two queues is obtained. Finally, according to the set decision threshold, the obtained association degree is combined with the reality-based hypothesis test to judge whether the target user is associated with the target group. The method addresses the high detection difficulty, long detection time, and unstable accuracy of detection methods based on vulnerability mining and of passive detection methods based on flow characteristic matching, and is better suited to active detection of user identity attributes in real scenarios.
Example 2
The embodiment provides a user identity attribute active detection system based on flow characteristic matching;
the user identity attribute active detection system based on flow characteristic matching is applied to an instant messaging application server, and the system comprises:
an acquisition module configured to: acquiring an IP address of a target group and a target user client to be detected, wherein the attribute of the target group is a known quantity; constructing a first test user account and a second test user account, and adding the two test user accounts into a target group;
A detect text message sending module configured to: sending a detection text message to a target group through a first test user account;
An encrypted traffic extraction module configured to: in the process of detecting text message sending, respectively grabbing the flow generated by the target user client and the flow generated by the second user account; respectively extracting the encrypted flow of the instant messaging application program from the flow captured by the target user client and the second test user account;
An event queue construction module configured to: according to the mapping relation between the text message length and the flow length, respectively constructing message flow event queues for the encrypted flows extracted by the target user client and the second test user account; calculating the optimal association degree of the message flow event queues of the target user client and the second test user account;
An attribute detection module configured to: comparing the optimal association degree with a set threshold value to determine whether the target user client has the attribute of the target group.
Further, the detection text message sending module is further configured to: acquire, through the first test user account, the historical text messages generated in the communication process of the target group and the flow corresponding to those messages; obtain the mapping relation between the text message length and the flow length; count the occurrence frequency of text messages with different lengths; and construct the detection text message according to the occurrence frequency of the text messages with different lengths.
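The statistics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample messages and the byte counts are invented; message length is measured in characters; the mapping is modeled as a lookup table holding the most frequently observed flow length per text length, which the text does not prescribe. Picking the least frequent observed length follows the selection rule given in claim 2.

```python
from collections import Counter


def length_frequencies(messages):
    """Count how often each text length (in characters) occurs."""
    return Counter(len(m) for m in messages)


def rarest_observed_length(freq):
    """Return the observed length with the lowest count; ties go to the shorter."""
    return min(freq, key=lambda n: (freq[n], n))


def fit_length_mapping(samples):
    """Map each text length to its most frequently observed flow length."""
    seen = {}
    for text_len, flow_len in samples:
        seen.setdefault(text_len, Counter())[flow_len] += 1
    return {text_len: c.most_common(1)[0][0] for text_len, c in seen.items()}


# Made-up group history and (text length, flow length) observations.
history = ["hi", "ok", "hello", "yes", "no!", "hey"]
freq = length_frequencies(history)
probe_len = rarest_observed_length(freq)

samples = [(2, 98), (2, 98), (5, 101), (3, 99), (3, 100), (3, 99)]
mapping = fit_length_mapping(samples)
print(probe_len, mapping[probe_len])
```

A rare length is useful because ordinary group chatter is unlikely to produce packets of the same size during the probe, which keeps accidental matches low.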
The above description covers only the preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. The active detection method of the user identity attribute based on the flow characteristic matching is characterized by being applied to an instant messaging application program server, and comprises the following steps:
Acquiring an IP address of a target group and a target user client to be detected, wherein the attribute of the target group is a known quantity; constructing a first test user account and a second test user account, and adding the two test user accounts into a target group; acquiring historical text messages and flow corresponding to the historical text messages generated in the communication process of the target group through a first test user account, acquiring the mapping relation between the text message length and the flow length, counting the occurrence frequency of the text messages with different lengths, and constructing a detection text message according to the occurrence frequency of the text messages with different lengths; sending a detection text message to a target group through a first test user account;
In the process of detecting text message sending, respectively grabbing the flow generated by the target user client and the flow generated by the second test user account; respectively extracting the encrypted flow of the instant messaging application program from the flow captured by the target user client and the second test user account;
According to the mapping relation between the text message length and the flow length, respectively constructing message flow event queues for the encrypted flows extracted by the target user client and the second test user account; calculating the optimal association degree of the message flow event queues of the target user client and the second test user account;
comparing the optimal association degree with a set threshold value to obtain whether the target user client has the attribute of the target group;
The encryption flow extracted from the target user client and the second test user account is respectively constructed into message flow event queues, wherein each message flow event refers to a flow packet with two attributes of time and length, and each message flow event queue refers to a plurality of message flow events which are arranged according to time sequence; the message flow event queue is constructed by the following steps: converting each network data packet in the encrypted traffic into message traffic events one by one according to the time and length attributes of each network data packet in the encrypted traffic, and arranging the message traffic events according to the time sequence relationship to obtain a message traffic event queue;
The calculating the optimal association degree of the message flow event queues of the target user client and the second test user account comprises the following specific process: setting the upper limit and the lower limit of the time difference between the two message flow event queues, then setting the sliding time of each step, and sliding the two message flow event queues relative to each other step by step; detecting the number of events in the target user client event queue that can be associated with events in the second test user account event queue, and dividing that number by the total number of events in the second test user account event queue to obtain the association degree, the association degree thus obtained being the optimal association degree; wherein two events can be associated when both the time difference and the length difference of the two events are within a set range;
Comparing the optimal association degree with a set threshold value to obtain whether the target user client has the attribute of the target group specifically comprises: if the optimal association degree is smaller than the set threshold value, the target user is not in the target group; if the optimal association degree is greater than the set threshold value, the target user is in the target group.
2. The method for actively detecting user identity attribute based on flow feature matching according to claim 1, wherein the statistics of the occurrence frequencies of text messages with different lengths, and the construction of the detected text message according to the occurrence frequencies of the text messages with different lengths, specifically comprises the following steps:
According to the occurrence frequency of the text messages with different lengths, the text message length with the lowest occurrence frequency is found, and according to the text message length with the lowest occurrence frequency, the detected text message which accords with the chat scene of the target group is constructed.
3. The method for actively detecting user identity attribute based on flow characteristic matching according to claim 2, wherein the construction, according to the text message length with the lowest occurrence frequency, of a detection text message conforming to the chat scene of the target group is implemented by a trained neural network; wherein the training process of the trained neural network comprises:
Constructing a training set; the training set consists of historical text messages of the target group whose text message themes and target text lengths are known; taking the text message theme and the target text length as input values of the neural network, and taking the historical text message as the output value of the neural network; training the neural network to obtain a trained neural network, wherein the length of the output value of the neural network is the target text length;
Constructing, according to the text message length with the lowest occurrence frequency, a detection text message conforming to the chat scene of the target group specifically comprises:
Taking the subject of the text message of the target group and the length of the text message with the lowest occurrence frequency as input values of a neural network, and outputting a detected text message conforming to the chat scene of the target group; the length of the detected text message conforming to the target group chat scene is the text message length with the lowest occurrence frequency.
4. The method for actively detecting user identity attribute based on flow feature matching according to claim 1, wherein, in the process of sending the detection text message, respectively grabbing the flow generated by the target user client and the flow generated by the second test user account means: in the process of sending the detection text message, according to the sending time of the detection text message, the flow generated by the target user client and the flow generated by the second test user account are respectively grabbed using the network packet capture tool tcpdump.
5. The method for actively detecting user identity attribute based on flow feature matching according to claim 1, wherein, in respectively extracting the encrypted flow of the instant messaging application program from the flow captured by the target user client and the second test user account, the encrypted flow refers to the flow that is sent to the internet after all content requiring encryption has been encrypted by an encryption algorithm according to the protocol rules of the instant messaging application program; the content requiring encryption includes: the text message, the message attachment header and the message length;
The extraction process of the encrypted traffic comprises the following steps: determining the length of the encrypted flow of the instant messaging application program according to the length of the detected text message and the mapping relation between the length of the text message and the length of the flow; and extracting the encrypted traffic of the instant messaging application program according to the length of the encrypted traffic of the instant messaging application program.
6. A user identity attribute active detection system based on flow characteristic matching, characterized by being applied to an instant messaging application program server, the system comprising:
an acquisition module configured to: acquiring an IP address of a target group and a target user client to be detected, wherein the attribute of the target group is a known quantity; constructing a first test user account and a second test user account, and adding the two test user accounts into a target group; acquiring historical text messages and flow corresponding to the historical text messages generated in the communication process of the target group through a first test user account, acquiring the mapping relation between the text message length and the flow length, counting the occurrence frequency of the text messages with different lengths, and constructing a detection text message according to the occurrence frequency of the text messages with different lengths;
A detection text message sending module configured to: sending a detection text message to the target group through the first test user account;
An encrypted traffic extraction module configured to: in the process of detecting text message sending, respectively grabbing the flow generated by the target user client and the flow generated by the second test user account; respectively extracting the encrypted flow of the instant messaging application program from the flow captured by the target user client and the second test user account;
An event queue construction module configured to: according to the mapping relation between the text message length and the flow length, respectively constructing message flow event queues for the encrypted flows extracted by the target user client and the second test user account; calculating the optimal association degree of the message flow event queues of the target user client and the second test user account;
An attribute detection module configured to: comparing the optimal association degree with a set threshold value to obtain whether the target user client has the attribute of the target group;
The encryption flow extracted from the target user client and the second test user account is respectively constructed into message flow event queues, wherein each message flow event refers to a flow packet with two attributes of time and length, and each message flow event queue refers to a plurality of message flow events which are arranged according to time sequence; the message flow event queue is constructed by the following steps: converting each network data packet in the encrypted traffic into message traffic events one by one according to the time and length attributes of each network data packet in the encrypted traffic, and arranging the message traffic events according to the time sequence relationship to obtain a message traffic event queue;
The calculating the optimal association degree of the message flow event queues of the target user client and the second test user account comprises the following specific process: setting the upper limit and the lower limit of the time difference between the two message flow event queues, then setting the sliding time of each step, and sliding the two message flow event queues relative to each other step by step; detecting the number of events in the target user client event queue that can be associated with events in the second test user account event queue, and dividing that number by the total number of events in the second test user account event queue to obtain the association degree, the association degree thus obtained being the optimal association degree; wherein two events can be associated when both the time difference and the length difference of the two events are within a set range;
Comparing the optimal association degree with a set threshold value to obtain whether the target user client has the attribute of the target group specifically comprises: if the optimal association degree is smaller than the set threshold value, the target user is not in the target group; if the optimal association degree is greater than the set threshold value, the target user is in the target group.
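The length-based extraction step recited in claim 5, combined with the send-time-driven capture of claim 4, can be pictured with the following Python sketch. It is a simplified illustration, not the claimed implementation: `extract_probe_packets`, the `window` and `tolerance` parameters, and every number are hypothetical, and packets are assumed to be available already as (timestamp, length) pairs parsed from a capture.

```python
def extract_probe_packets(packets, send_time, window, expected_len, tolerance=0):
    """Keep packets captured within `window` seconds after `send_time`
    whose length is within `tolerance` bytes of `expected_len`."""
    return [
        (t, n)
        for t, n in packets
        if send_time <= t <= send_time + window
        and abs(n - expected_len) <= tolerance
    ]


# Text length -> flow length lookup (made-up values for illustration).
mapping = {5: 101}
probe_text_len = 5
expected_len = mapping[probe_text_len]

# Made-up capture: only one packet is both close in time and of the expected size.
capture = [(1.0, 60), (5.2, 101), (5.3, 1500), (9.0, 101)]
print(extract_probe_packets(capture, send_time=5.0, window=2.0,
                            expected_len=expected_len))
```

The packet at 9.0 seconds has the right length but falls outside the assumed window, which shows why both the time and the length conditions are applied.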
CN202211266240.7A 2022-10-17 2022-10-17 User identity attribute active detection method and system based on flow characteristic matching Active CN115664739B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211266240.7A CN115664739B (en) 2022-10-17 2022-10-17 User identity attribute active detection method and system based on flow characteristic matching

Publications (2)

Publication Number Publication Date
CN115664739A CN115664739A (en) 2023-01-31
CN115664739B true CN115664739B (en) 2024-05-07

Family

ID=84988076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211266240.7A Active CN115664739B (en) 2022-10-17 2022-10-17 User identity attribute active detection method and system based on flow characteristic matching

Country Status (1)

Country Link
CN (1) CN115664739B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110458182A (en) * 2019-06-24 2019-11-15 中国科学院信息工程研究所 Based on the matched online vest detection method of similar subgraph
CN110998588A (en) * 2017-08-22 2020-04-10 微软技术许可有限责任公司 Reducing text length while preserving meaning
CN113312560A (en) * 2021-06-16 2021-08-27 百度在线网络技术(北京)有限公司 Group detection method and device and electronic equipment
CN113420230A (en) * 2020-12-31 2021-09-21 深圳市镜玩科技有限公司 Matching consultation pushing method based on group chat, related device, equipment and medium
CN113521749A (en) * 2021-07-15 2021-10-22 珠海金山网络游戏科技有限公司 Abnormal account detection model training method and abnormal account detection method
WO2022148050A1 (en) * 2021-01-05 2022-07-14 华为云计算技术有限公司 Traffic management method and apparatus, traffic management strategy configuration method and apparatus, and device and medium
CN114818974A (en) * 2022-05-23 2022-07-29 北京航空航天大学 Inference attack method and system for monitoring user activities in intelligent information system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200236131A1 (en) * 2019-01-18 2020-07-23 Cisco Technology, Inc. Protecting endpoints with patterns from encrypted traffic analytics

Also Published As

Publication number Publication date
CN115664739A (en) 2023-01-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant