WO2020258505A1 - Network access security determination method and apparatus - Google Patents

Info

Publication number
WO2020258505A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature set
combination feature
nonlinear combination
data points
local outlier
Application number
PCT/CN2019/103646
Other languages
French (fr)
Chinese (zh)
Inventor
黎立桂
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2020258505A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Abstract

The present application relates to the field of security detection technology, and provides a network access security determination method and apparatus. Said method comprises: obtaining, according to first device parameters of historical network access of a terminal device, feature information thereof to generate a plurality of first non-linear combination feature sets; acquiring, by means of a script program on the terminal device, second device parameters of the current network access of the terminal device, and extracting corresponding feature information to generate a second non-linear combination feature set; calculating, according to clustering of the first non-linear combination feature sets, a local outlier factor of data points of the second non-linear combination feature set by using an unsupervised clustering outlier detection algorithm, and using the value of a maximum steep point of the local outlier factor as a determination threshold; and determining that the network access is a secure access when the value of the local outlier factor of the data points of the second non-linear combination feature set is greater than the determination threshold. Said method facilitates improvement of the security detection capability of the terminal device for a current network access.

Description

Network access security determination method and apparatus
This application claims priority to Chinese patent application No. 201910578479.X, entitled "Network access security determination method and apparatus", filed with the Chinese Patent Office on June 28, 2019, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the technical field of security detection, and in particular to a network access security determination method and apparatus.
Background
With the widespread application of network technology, network security has received increasing attention. One aspect of network security is that websites are vulnerable to threats. At present, one of the main threats to website security is access by web crawlers, which leaves the website unable to correctly identify normal network access. The current approach to this problem is to collect the click and drag-track operation data generated by the terminal device during a network access and use it to determine whether the access is secure. The inventor has realized that this approach cannot identify secure network access with full accuracy and tends to classify secure network access as insecure, which degrades the user experience.
Summary of the invention
To overcome the above technical problems, in particular the problem in the prior art that, when a terminal device logs on to a network, the user's usage trace data cannot reliably identify secure network access, the following technical solutions are proposed:
In a first aspect, this application provides a network access security determination method, comprising the following steps:
obtaining feature information of first device parameters from the first device parameters of historical network accesses of a terminal device, and generating a plurality of first nonlinear combination feature sets;
acquiring, by means of a script program on the terminal device, second device parameters of the current network access of the terminal device, extracting feature information of the second device parameters, and generating a second nonlinear combination feature set;
using an unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, calculating a local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets, and taking the value of the maximum steep point of the local outlier factor as a determination threshold;
when the value of the local outlier factor of the data points of the second nonlinear combination feature set is greater than the determination threshold, determining that the network access is a secure access;
wherein the first nonlinear combination feature set is nonlinear feature information of the terminal device obtained from historical network accesses, the second nonlinear combination feature set is nonlinear feature information of the terminal device obtained from the current network access, and the feature information includes attribute data and access data of the terminal device.
To solve the above technical problems, this application also provides a network access security determination apparatus, comprising:
a first generation module, configured to obtain feature information of first device parameters from the first device parameters of historical network accesses of a terminal device, and to generate a plurality of first nonlinear combination feature sets;
a second generation module, configured to acquire, by means of a script program on the terminal device, second device parameters of the current network access of the terminal device, to extract feature information of the second device parameters, and to generate a second nonlinear combination feature set;
a calculation module, configured to use an unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, to calculate a local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets, and to take the value of the maximum steep point of the local outlier factor as a determination threshold;
a determination module, configured to determine that the network access is a secure access when the value of the local outlier factor of the data points of the second nonlinear combination feature set is greater than the determination threshold;
wherein the first nonlinear combination feature set is nonlinear feature information of the terminal device obtained from historical network accesses, the second nonlinear combination feature set is nonlinear feature information of the terminal device obtained from the current network access, and the feature information includes attribute data and access data of the terminal device.
To solve the above problems, this application also provides a server, comprising:
one or more processors;
a memory;
one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, and the one or more computer programs are configured to perform the network access security determination method described in the above embodiment, the method comprising:
obtaining feature information of first device parameters from the first device parameters of historical network accesses of a terminal device, and generating a plurality of first nonlinear combination feature sets;
acquiring, by means of a script program on the terminal device, second device parameters of the current network access of the terminal device, extracting feature information of the second device parameters, and generating a second nonlinear combination feature set;
using an unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, calculating a local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets, and taking the value of the maximum steep point of the local outlier factor as a determination threshold;
when the value of the local outlier factor of the data points of the second nonlinear combination feature set is greater than the determination threshold, determining that the network access is a secure access;
wherein the first nonlinear combination feature set is nonlinear feature information of the terminal device obtained from historical network accesses, the second nonlinear combination feature set is nonlinear feature information of the terminal device obtained from the current network access, and the feature information includes attribute data and access data of the terminal device.
To solve the above problems, this application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the network access security determination method described in the above embodiment, the method comprising:
obtaining feature information of first device parameters from the first device parameters of historical network accesses of a terminal device, and generating a plurality of first nonlinear combination feature sets;
acquiring, by means of a script program on the terminal device, second device parameters of the current network access of the terminal device, extracting feature information of the second device parameters, and generating a second nonlinear combination feature set;
using an unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, calculating a local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets, and taking the value of the maximum steep point of the local outlier factor as a determination threshold;
when the value of the local outlier factor of the data points of the second nonlinear combination feature set is greater than the determination threshold, determining that the network access is a secure access;
wherein the first nonlinear combination feature set is nonlinear feature information of the terminal device obtained from historical network accesses, the second nonlinear combination feature set is nonlinear feature information of the terminal device obtained from the current network access, and the feature information includes attribute data and access data of the terminal device.
The network access security determination method and apparatus provided by this application obtain a local outlier factor from the spatial positions of the data points of the plurality of first nonlinear combination feature sets, generated from historically collected terminal device data, and the data points of the second nonlinear combination feature set, generated from the current network access of the terminal device. The local outlier factor is then compared with the value of the maximum steep point of the curve formed by multiple local outlier factors to determine whether the current network access of the terminal device is abnormal.
Further, the technical solution provided by this application uses an unsupervised-clustering outlier detection algorithm to obtain the value on which the determination is based and the corresponding determination result, without requiring the feature information data of network accesses initiated by the terminal device to be labeled, which reduces the workload of later statistics and analysis. The solution also allows the data to be visualized, so the result is intuitive and a determination result with relatively high accuracy is easy to obtain, which ultimately improves the determination performance of the method and apparatus.
Additional aspects and advantages of this application will be given in part in the following description, and will become apparent from the description or be learned through practice of this application.
Description of the drawings
FIG. 1 is an application environment diagram in which an embodiment of this application carries out the network access security determination solution;
FIG. 2 is a flowchart of a network access security determination method according to one embodiment of this application;
FIG. 3 is a flowchart of a network access security determination method according to another embodiment of this application;
FIG. 4 is a schematic diagram of a network access security determination apparatus according to one embodiment of this application;
FIG. 5 is a schematic structural diagram of a server according to one embodiment of this application.
Detailed description
Embodiments of this application are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain this application and are not to be construed as limiting it.
FIG. 1 is an application environment diagram of an embodiment of this application. In this embodiment, the technical solution may be implemented on a server. As shown in FIG. 1, terminal devices 110 and 120 can access server 130 over the Internet; terminal device 110 and/or 120 sends a network request to server 130, and server 130 exchanges data according to the request. During the data exchange, server 130 obtains the access data and attribute data of terminal device 110 and/or 120 from its request information, and makes a security determination for the terminal device based on that data.
To address the problem that current security determination methods do not readily identify secure network access, this application provides a network access security determination method. FIG. 2 is a flowchart of the method according to one embodiment. The method includes the following steps:
S210. Obtain feature information of first device parameters from the first device parameters of historical network accesses of a terminal device, and generate a plurality of first nonlinear combination feature sets.
When the server exchanges data with the terminal device, it obtains the relevant parameters of the terminal device from the network requests the device sends. In this step, the server obtains the first device parameters from the historical network access requests sent by the terminal device, parses them, and generates a plurality of first nonlinear combination feature sets from the parsing result.
Each first nonlinear combination feature set is a feature set generated from one access record of a terminal device that has exchanged data with the server. It is nonlinear feature information of the terminal device obtained from historical network accesses, and the feature information includes attribute data and access data of the terminal device. For example, the attribute data may include the model of the terminal device, the screen resolution x*y of the terminal device, or the available screen resolution X*Y of the browser, and the access data may include the frequency at which the terminal device sends requests to the server.
In this embodiment, the feature information corresponding to the first nonlinear combination feature set is specifically the corresponding feature values. A coordinate system is set up, and the feature set generated from each historical access record of the terminal device is plotted in it as an n-dimensional data point. The feature sets formed from different access records form normal-state clusters and abnormal-state clusters in the coordinate system. On the assumption that normal accesses far outnumber abnormal ones, large clusters are treated as normal-state clusters and small clusters as abnormal-state clusters.
Further, to remove the effect of differing units and scales among variables so that the data are comparable, the feature values in each feature set are standardized before they are plotted. For example, the feature set of an access record may contain one variable on a 100-point scale and another on a 5-point scale; only after all the data are standardized can they be compared on a common scale.
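The standardization described above can be sketched as follows. The text does not name a particular scheme, so z-score scaling per feature is an assumption here, and the function name `standardize` and the sample records are hypothetical.

```python
from statistics import mean, pstdev

def standardize(feature_sets):
    """Z-score each column of a list of equal-length feature vectors."""
    columns = list(zip(*feature_sets))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in feature_sets
    ]

# Hypothetical access records: (value on a 100-point scale, value on a 5-point scale)
records = [[90.0, 4.0], [70.0, 2.0], [80.0, 3.0]]
scaled = standardize(records)
```

After scaling, both columns have zero mean and unit deviation, so a distance computed between two records is no longer dominated by the 100-point variable.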
S220. Acquire, by means of a script program on the terminal device, second device parameters of the current access of the terminal device, extract feature information of the second device parameters, and generate a second nonlinear combination feature set.
So that whether the network access of the terminal device is secure can be determined in real time, the state of each current access of the terminal device is determined as required. In this step, the server provides a script program to the terminal device over the network connection in order to obtain the second device parameters of each current access. The second device parameters are of the same nature as the first device parameters. The second nonlinear combination feature set is nonlinear feature information of the terminal device obtained from the current network access, and the feature information includes attribute data and access data of the terminal device.
The server parses the second device parameters, extracts their feature information, and from it obtains the second nonlinear combination feature set for the terminal device currently sending a network request to the server. The feature information in the second nonlinear combination feature set corresponds at least to the feature information in the first nonlinear combination feature sets, so that the two can be compared later.
S230. Using an unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, calculate a local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets, and take the value of the maximum steep point of the local outlier factor as a determination threshold.
In the network access security determination method provided by this application, the second nonlinear combination feature set serves as the sample to be tested. When the detection model is built with the unsupervised-clustering outlier detection algorithm, the data points of the second nonlinear combination feature set are used as the detection parameters.
The data points of the second nonlinear combination feature set are discrete. Relative to the data points of the feature information of the first device parameters, some of them may lie close to the set formed in space by most of those data points, while others lie far from that set and form outliers. The local outlier factor of the data points of the second nonlinear combination feature set is calculated so that the second nonlinear combination feature set, as the test sample, yields its spatial position relationship to the data points of the first nonlinear combination feature sets; the value derived from this relationship can then be used to determine whether the network access initiated by the terminal device is secure. In this embodiment, the spatial position relationship is characterized by the value of the maximum steep point of the local outlier factor, and that value is used to determine whether the network request currently initiated by the terminal device is in a normal state.
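The text does not define how the maximum steep point is located. One possible reading, offered only as an assumption, is to sort the local outlier factors into a curve and take the value at which the rise between neighbouring points is largest. The function name `max_steep_point` and the sample factors below are hypothetical.

```python
def max_steep_point(factors):
    """Return the factor value at the steepest rise of the sorted factor curve."""
    if len(factors) < 2:
        raise ValueError("need at least two factors to locate a steep point")
    ordered = sorted(factors)
    # Rise between neighbouring points on the sorted curve (unit spacing on the x axis).
    jumps = [b - a for a, b in zip(ordered, ordered[1:])]
    steepest = max(range(len(jumps)), key=jumps.__getitem__)
    # Assumption: the threshold is the value just before the steepest rise.
    return ordered[steepest]

# Hypothetical local outlier factors for a batch of access records
factors = [1.0, 1.1, 0.9, 1.2, 3.5, 4.0]
threshold = max_steep_point(factors)
```

Whether the threshold should be the value just before or just after the steepest rise is not stated in the source; this sketch picks the former.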
S240. When the value of the local outlier factor of the data points of the second nonlinear combination feature set is greater than the determination threshold, determine that the network access is a secure access.
In this step, the value of the local outlier factor of the data points of the second nonlinear combination feature set, obtained in step S230, is compared with the value of the maximum steep point, and the comparison result determines whether the network request currently initiated by the terminal device is in a normal state.
If the value of the local outlier factor of the data points of the second nonlinear combination feature set is greater than the determination threshold, the network access currently initiated by the terminal device is determined to be a secure access; otherwise, it is determined to be an insecure access.
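The comparison in step S240 can be written as a one-line predicate. The sketch below follows the direction stated in the text (a factor greater than the threshold means secure); the function name `is_secure_access` and the sample values are hypothetical.

```python
def is_secure_access(local_outlier_factor, threshold):
    """Apply the S240 rule as stated in the text: greater than the threshold means secure."""
    return local_outlier_factor > threshold

# Hypothetical factors checked against a hypothetical threshold of 1.2
verdicts = [is_secure_access(factor, 1.2) for factor in (0.8, 1.2, 2.5)]
```

A factor exactly equal to the threshold falls into the "otherwise" branch and is treated as insecure, since the text requires strictly greater.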
The network access security determination method provided by this application takes the data points of the second nonlinear combination feature set, obtained from the network access currently initiated by the terminal device, and their spatial positions relative to the clusters of data points of the first nonlinear combination feature sets, obtained from historical network accesses, and calculates the local outlier factor of those data points. Each individual local outlier factor is compared with the determination threshold, which is the value of the maximum steep point formed by all of the local outlier factors, to determine whether the network access initiated by the terminal device is a secure access. This application forms the first and second nonlinear combination feature sets from the data generated by the network accesses of the terminal device, and uses the calculated local outlier factor as the basis for determining whether a network access initiated by the terminal device is abnormal. This avoids the problem in the prior art, where security detection relies only on usage records such as click and drag tracks during user verification, of genuine users being easily misclassified as insecure. The method reflects the state of the network access request currently sent by the terminal device to the server more accurately, and obtains the security detection result through a simpler and more intuitive data comparison, which helps improve the efficiency of network access security detection.
Referring to FIG. 3, which is a flowchart of a method for determining the security of network access according to another embodiment: building on the solution described above, step S230 may include:
S231. Using an outlier detection algorithm based on unsupervised clustering, divide the first nonlinear combination feature set into large clusters and small clusters;
S232. For the data points of the second nonlinear combination feature set, and using the large clusters and small clusters of the first nonlinear combination feature set, calculate the first local outlier factor of the large cluster corresponding to the data points of the second nonlinear combination feature set, or the second local outlier factor of the small cluster corresponding to those data points;
wherein the large clusters and small clusters are divided according to the number of data points they contain and a set ratio value.
Through the unsupervised clustering performed by the outlier detection algorithm, the first nonlinear combination feature set is divided into large clusters and small clusters. Specifically, the feature set is partitioned into several clusters according to the different categories it contains, and each cluster has its own center point. After clustering, the clusters are sorted in descending order by the number of data points they contain and are split into large and small clusters according to a ratio value set for the number of data points. In this embodiment the set ratio value is 90%: the clusters formed by 90% of all data points in the set are designated large clusters, and the remaining clusters are designated small clusters according to the spatial distribution of the data points.
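As a minimal sketch of the large/small split described above, the following Python fragment assumes the clustering has already assigned a label to each historical data point. The function name `split_large_small`, the `labels` input, and the rule that clusters are added to the large group until they cover the set ratio are illustrative assumptions; the text does not name the clustering algorithm or say how the boundary case is handled.

```python
from collections import Counter


def split_large_small(labels, ratio=0.9):
    """Split cluster labels into large and small clusters.

    Clusters are visited in descending order of size. A cluster joins the
    large group while the clusters already in that group cover less than
    `ratio` of all data points (assumed boundary rule); the rest are small.
    """
    counts = Counter(labels)
    total = len(labels)
    large, small = [], []
    covered = 0
    for label, size in counts.most_common():  # largest cluster first
        if covered < ratio * total:
            large.append(label)
            covered += size
        else:
            small.append(label)
    return large, small


# Hypothetical labels for 100 historical data points in four clusters.
labels = [0] * 60 + [1] * 32 + [2] * 5 + [3] * 3
large, small = split_large_small(labels, ratio=0.9)
print(large, small)
```

With these hypothetical counts, clusters 0 and 1 together hold 92 of the 100 points and form the large group, while clusters 2 and 3 are treated as small.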
Using the above clustering of the first nonlinear combination feature set, the local outlier factor is calculated for the data points of the second nonlinear combination feature set obtained in step S230 as the samples to be tested. From the spatial relationship between each data point of the second nonlinear combination feature set and the large and small clusters obtained by dividing the first nonlinear combination feature set, the first local outlier factor of the corresponding large cluster or the second local outlier factor of the corresponding small cluster is obtained. In this embodiment, step S232 may specifically include the following steps:
A1. For each data point of the second nonlinear combination feature set, obtain the first distance between the data point and the large cluster and the second distance between the data point and the small cluster;
A2. If the first distance is less than the second distance, obtain the first local outlier factor of the large cluster for the data point of the second nonlinear combination feature set, where this first local outlier factor is the product of the size value of the large cluster and the similarity between the data point and the large cluster;
A3. If the first distance is greater than the second distance, obtain the second local outlier factor of the small cluster for the data point of the second nonlinear combination feature set, where this second local outlier factor is the product of the size value of the small cluster and the similarity between the data point and the nearest large cluster.
Before calculating the first local outlier factor of the large cluster or the second local outlier factor of the small cluster for a data point, first determine which cluster the data point is closer to, and then calculate the local outlier factor from the parameters of that cluster.
Specifically, for each data point of the second nonlinear combination feature set, the first distance to the large cluster and the second distance to the small cluster are obtained; each distance is measured from the data point to the center of the respective cluster.
When the first distance is less than the second distance, that is, when the data point of the second nonlinear combination feature set is closer to the large cluster, the first local outlier factor of the large cluster is obtained for that data point. The first local outlier factor is the product of the size value of the large cluster and the similarity between the data point and the large cluster.
When the first distance is greater than the second distance, that is, when the data point of the second nonlinear combination feature set is closer to the small cluster, the second local outlier factor of the small cluster is obtained for that data point. As stated in step A3, the second local outlier factor is the product of the size value of the small cluster and the similarity between the data point and the nearest large cluster.
In the above calculation of the first and second local outlier factors, the size of a cluster, that is, the size value of the large cluster or of the small cluster, can be measured by the number of data points it holds from the plurality of first nonlinear combination feature sets. For example, the number of data points in the cluster can be taken directly as its size value, or the size value can be the proportion of the cluster's data points among all data points of the first nonlinear combination feature sets.
Likewise, the similarity to the large cluster can be measured by the distance between the data point of the second nonlinear combination feature set and the center of the large cluster. For example, that distance can be used directly as the similarity to the large cluster.
Calculating the first local outlier factor of the large cluster and the second local outlier factor of the small cluster for the data points of the second nonlinear combination feature set provides the basis for determining whether the network access of the terminal device corresponding to a data point is a safe access.
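The calculation in steps A1 to A3 can be sketched as follows, under the example choices given in the text: Euclidean distance to cluster centers, the number of data points as the size value, and the distance itself as the similarity. The function names and the `(center, size)` tuple layout are hypothetical, and routing equal distances to the small-cluster branch is an assumption, because the text does not cover that case.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest(point, clusters):
    """Return (distance, size) for the cluster whose center is nearest."""
    return min((euclidean(point, center), size) for center, size in clusters)


def outlier_factor(point, large, small):
    """Compute the factor of steps A1 to A3 for one data point.

    `large` and `small` are lists of (center, size) tuples. Returns a
    (kind, factor) pair where kind is "large" or "small".
    """
    d_large, size_large = nearest(point, large)   # first distance
    d_small, size_small = nearest(point, small)   # second distance
    if d_large < d_small:
        # A2: large-cluster size times distance to that large cluster.
        return "large", size_large * d_large
    # A3 (ties land here by assumption): small-cluster size times the
    # distance to the nearest large cluster.
    return "small", size_small * d_large


# Hypothetical clusters: one large (90 points) and one small (10 points).
large = [((0.0, 0.0), 90)]
small = [((10.0, 0.0), 10)]
print(outlier_factor((3.0, 4.0), large, small))
print(outlier_factor((9.0, 0.0), large, small))
```

Using the raw count as the size value keeps the sketch close to the first example in the text; the proportion-based size value mentioned there would only rescale the factors.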
The step in S230 of using the value of the maximum steep point of the local outlier factors as the determination threshold may further be:
Among the local outlier factors of all the data points, select the value of the local outlier factor with the largest slope as the value of the maximum steep point, and use that value as the determination threshold.
The first local outlier factors (large cluster) and the second local outlier factors (small cluster) obtained above for all data points of the second nonlinear combination feature set each form a corresponding curve, and each curve has its own maximum steep point. The maximum steep point is the local outlier factor at which the slope of the curve of the first or second local outlier factors is greatest. The value of that local outlier factor is taken as the value of the maximum steep point and serves as the determination threshold for deciding whether the network access of the terminal device corresponding to a data point of the second nonlinear combination feature set is a safe access.
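One possible reading of the maximum steep point is sketched below. It assumes the curve is formed by sorting the factors in ascending order with unit spacing, that the slope at a point is the difference from its predecessor, and that the threshold is the value on the upper side of the steepest rise. None of these details are fixed by the text, and the function name `steepest_point_threshold` is hypothetical.

```python
def steepest_point_threshold(factors):
    """Return the factor value at the steepest rise of the sorted curve.

    Assumed convention: sort ascending, take slope[i] as the difference
    ordered[i] - ordered[i - 1], and return ordered[i] at the largest slope.
    """
    ordered = sorted(factors)
    if len(ordered) < 2:
        raise ValueError("at least two factors are needed to measure a slope")
    steepest = max(range(1, len(ordered)),
                   key=lambda i: ordered[i] - ordered[i - 1])
    return ordered[steepest]


# Hypothetical factors for data points assigned to the same kind of cluster.
factors = [1.0, 1.2, 1.1, 5.0, 5.3]
print(steepest_point_threshold(factors))
```

In practice one such threshold would be computed for the first local outlier factors and another for the second, since the text describes a separate curve for each.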
On this basis, step S240 includes:
when the first distance is less than the second distance and the value of the first local outlier factor of the large cluster for the data point of the second nonlinear combination feature set is greater than the determination threshold corresponding to the first local outlier factor; or
when the first distance is greater than the second distance and the value of the second local outlier factor of the small cluster for the data point of the second nonlinear combination feature set is greater than the determination threshold corresponding to the second local outlier factor, determining that the network access is a safe access.
Specifically, for a data point of the second nonlinear combination feature set that is close to the large cluster, so that its first distance is less than its second distance, if the value of the first local outlier factor of the large cluster obtained for that data point is greater than the determination threshold derived from the curve of the first local outlier factors, the network access of the corresponding terminal device is determined to be a safe access.
For a data point of the second nonlinear combination feature set that is close to the small cluster, so that its first distance is greater than its second distance, if the value of the second local outlier factor of the small cluster obtained for that data point is greater than the determination threshold derived from the curve of the second local outlier factors, the network access of the corresponding terminal device is determined to be a safe access.
If the network request currently initiated by the terminal device is determined to be a safe access request, the server responds to it directly. Otherwise, the server rejects the request outright or requires the terminal device to repeat access verification.
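The determination and the server's response can be sketched as a small function. The comparison direction follows the text as written (a factor greater than its threshold is treated as safe access); the function name `decide`, the `thresholds` dictionary, and the returned strings are hypothetical.

```python
def decide(kind, factor, thresholds):
    """Map one data point's factor to a server action.

    `kind` is "large" or "small", naming the cluster type the data point
    is closer to; `thresholds` holds one determination threshold per kind.
    """
    if factor > thresholds[kind]:
        return "respond"              # treated as safe access
    return "reject_or_reverify"       # reject, or require verification again


# Hypothetical thresholds taken from the two factor curves.
thresholds = {"large": 400.0, "small": 80.0}
print(decide("large", 450.0, thresholds))
print(decide("small", 50.0, thresholds))
```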
The first nonlinear combination feature set and the second nonlinear combination feature set mentioned above each include:
effective derived feature information for identifying outliers, obtained by performing a data dispersion measurement calculation on the data points of the first nonlinear combination feature set or the second nonlinear combination feature set.
Specifically, the first nonlinear combination feature set or the second nonlinear combination feature set may include raw categorical feature information such as browser language, pixel ratio, color depth, whether an audio stack fingerprint is provided, parameter information of the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, whether the browser vendor is "other", whether the operating system vendor is "other", and whether the browser type is "robot".
The data dispersion measurement calculation yields effective derived features for identifying outliers, including whether AdBlock is installed, whether the user has tampered with the language, whether the user has tampered with the screen resolution, whether the user has tampered with the operating system, the browser vendor, the operating system vendor, the access device type, and the operating system family.
The data dispersion measurement calculation includes computing, for the corresponding feature data, the range, the quartiles, the interquartile range, and the five-number summary, which in order consists of the minimum, the lower quartile, the median, the upper quartile, and the maximum.
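The dispersion measures named above can be computed with the Python standard library, as in the following sketch. The quartile convention (`method="inclusive"`) and the sample values, which stand in for a numeric feature such as the logical processor count, are assumptions; the text does not say which quartile definition is used. The five-number summary is returned in the conventional order of minimum, lower quartile, median, upper quartile, maximum.

```python
import statistics


def dispersion_summary(values):
    """Return range, quartiles, interquartile range and five-number summary."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "range": ordered[-1] - ordered[0],
        "quartiles": (q1, median, q3),
        "iqr": q3 - q1,
        "five_number": (ordered[0], q1, median, q3, ordered[-1]),
    }


# Hypothetical values of one numeric feature across nine accesses.
summary = dispersion_summary([9, 1, 5, 3, 7, 2, 8, 4, 6])
print(summary)
```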
Based on the same inventive concept as the above method for determining the security of network access, an embodiment of this application further provides an apparatus for determining the security of network access which, as shown in FIG. 4, includes:
a first generation module 410, configured to obtain feature information of first device parameters from the first device parameters of historical network accesses of terminal devices, and to generate a plurality of first nonlinear combination feature sets;
a second generation module 420, configured to obtain, through a script program on the terminal device, second device parameters of the terminal device's current network access, to extract feature information of the second device parameters, and to generate a second nonlinear combination feature set;
a calculation module 430, configured to use an outlier detection algorithm based on unsupervised clustering, taking the data points of the second nonlinear combination feature set as detection parameters, to calculate the local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature set, and to use the value of the maximum steep point of the local outlier factors as the determination threshold;
a determination module 440, configured to determine that the network access is a safe access when the value of the local outlier factor of a data point of the second nonlinear combination feature set is greater than the determination threshold.
The first nonlinear combination feature set is the nonlinear feature information of terminal devices obtained from historical network accesses; the second nonlinear combination feature set is the nonlinear feature information of the terminal device obtained from the current network access. The feature information includes attribute data and access data of the terminal device.
Please refer to FIG. 5, which is a schematic diagram of the internal structure of a server in one embodiment. As shown in FIG. 5, the server includes a processor 510, a storage medium 520, a memory 530, and a network interface 540 connected through a system bus. The storage medium 520 of the server stores an operating system, a database, and computer-readable instructions, and the database may store control information sequences. When executed by the processor 510, the computer-readable instructions cause the processor 510 to implement a method for determining the security of network access, and the processor 510 can implement the functions of the first generation module 410, the second generation module 420, the calculation module 430, and the determination module 440 of the apparatus for determining the security of network access in the embodiment shown in FIG. 4. The processor 510 of the server provides computing and control capabilities and supports the operation of the entire server. The memory 530 of the server may store computer-readable instructions which, when executed by the processor 510, cause the processor 510 to execute a method for determining the security of network access. The network interface 540 of the server is used to connect to and communicate with terminals. Those skilled in the art will understand that the structure shown in FIG. 5 is only a block diagram of the part of the structure related to the solution of this application and does not limit the server to which the solution is applied; a specific server may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, this application further proposes a non-volatile computer-readable storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: obtaining feature information of first device parameters from the first device parameters of historical network accesses of terminal devices, and generating a plurality of first nonlinear combination feature sets; obtaining, through a script program on the terminal device, second device parameters of the terminal device's current network access, extracting feature information of the second device parameters, and generating a second nonlinear combination feature set; using an outlier detection algorithm based on unsupervised clustering, taking the data points of the second nonlinear combination feature set as detection parameters, calculating the local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature set, and using the value of the maximum steep point of the local outlier factors as the determination threshold; and, when the value of the local outlier factor of a data point of the second nonlinear combination feature set is greater than the determination threshold, determining that the network access is a safe access. The first nonlinear combination feature set is the nonlinear feature information of terminal devices obtained from historical network accesses; the second nonlinear combination feature set is the nonlinear feature information of the terminal device obtained from the current network access. The feature information includes attribute data and access data of the terminal device.
From the above embodiments, the main beneficial effects of this application are as follows:
In the method and apparatus for determining the security of network access provided by this application, the local outlier factor is obtained from the spatial position of the data points of the second nonlinear combination feature set, generated from the terminal device's current network access, relative to the data points of the plurality of first nonlinear combination feature sets generated from historically collected terminal device data. The local outlier factor is then compared with the value of the maximum steep point of the curve formed by the plurality of local outlier factors, yielding a determination of whether the terminal device's current network access is a safe access.
The technical solution provided by this application uses an outlier detection algorithm based on unsupervised clustering to obtain the value that serves as the basis for determination and the corresponding determination result. The feature data of network accesses initiated by terminal devices does not need to be labeled, which saves later statistical and analytical work. The solution also allows the data to be visualized, so the result is intuitive and a determination with relatively high accuracy is easy to obtain, which ultimately improves the determination performance of the method and apparatus.
In summary, the method and apparatus of this application use an outlier detection algorithm based on unsupervised clustering to analyze directly the feature data generated by a terminal device's network access and to determine whether the access is a safe access. This solves the problem in the prior art that usage trace data collected when a user logs in to the network through a terminal device easily leads to real users being identified as safe users, and it improves the ability to determine whether access from a terminal device is safe.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed it may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM), among others.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, any combination that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but it should not be understood as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of this application, and these all fall within its scope of protection. The scope of protection of this patent is therefore defined by the appended claims.

Claims (20)

  1. A method for determining the security of network access, comprising the following steps:
    obtaining feature information of first device parameters from the first device parameters of historical network accesses of terminal devices, and generating a plurality of first nonlinear combination feature sets;
    obtaining, through a script program on the terminal device, second device parameters of the terminal device's current network access, extracting feature information of the second device parameters, and generating a second nonlinear combination feature set;
    using an outlier detection algorithm based on unsupervised clustering, taking the data points of the second nonlinear combination feature set as detection parameters, calculating the local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature set, and using the value of the maximum steep point of the local outlier factors as the determination threshold;
    when the value of the local outlier factor of a data point of the second nonlinear combination feature set is greater than the determination threshold, determining that the network access is a safe access;
    wherein the first nonlinear combination feature set is the nonlinear feature information of terminal devices obtained from historical network accesses; the second nonlinear combination feature set is the nonlinear feature information of the terminal device obtained from the current network access; and the feature information includes attribute data and access data of the terminal device.
  2. The method according to claim 1, wherein the step of using an outlier detection algorithm based on unsupervised clustering, taking the data points of the second nonlinear combination feature set as detection parameters, and calculating the local outlier factor of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature set comprises:
    using an outlier detection algorithm based on unsupervised clustering to divide the first nonlinear combination feature set into large clusters and small clusters;
    for the data points of the second nonlinear combination feature set, and using the large clusters and small clusters of the first nonlinear combination feature set, calculating the first local outlier factor of the large cluster corresponding to the data points of the second nonlinear combination feature set, or the second local outlier factor of the small cluster corresponding to those data points;
    wherein the large clusters and small clusters are divided according to the number of data points they contain and a set ratio value.
  3. The method according to claim 2, wherein the step of calculating, for the data points of the second nonlinear combination feature set and using the large clusters and small clusters of the first nonlinear combination feature set, the first local outlier factor of the large cluster corresponding to the data points of the second nonlinear combination feature set or the second local outlier factor of the small cluster corresponding to those data points comprises:
    for each data point of the second nonlinear combination feature set, obtaining the first distance between the data point and the large cluster and the second distance between the data point and the small cluster;
    if the first distance is less than the second distance, obtaining the first local outlier factor of the large cluster for the data point of the second nonlinear combination feature set, wherein this first local outlier factor is the product of the size value of the large cluster and the similarity between the data point and the large cluster;
    if the first distance is greater than the second distance, obtaining the second local outlier factor of the small cluster for the data point of the second nonlinear combination feature set, wherein this second local outlier factor is the product of the size value of the small cluster and the similarity between the data point and the nearest large cluster.
  4. The method according to claim 3, wherein the size value of the large cluster or the size value of the small cluster is measured by the number of data points it holds from the plurality of first nonlinear combination feature sets;
    the similarity to the large cluster is measured by the distance between the data point of the second nonlinear combination feature set and the center of the large cluster.
  5. The method according to claim 4, wherein the step of using the value of the maximum steep point of the local outlier factors as the determination threshold comprises:
    among the local outlier factors of all data points of the second nonlinear combination feature set, selecting the value of the local outlier factor with the largest slope as the value of the maximum steep point, and using the value of the maximum steep point as the determination threshold.
  6. The method according to claim 5, wherein the step of determining that the network access is a safe access when the value of the data point of the second nonlinear combination feature set is greater than the determination threshold comprises:
    when the first distance is greater than the second distance and the value of the first local outlier factor of the data point of the second nonlinear combination feature set with respect to the large cluster is greater than the determination threshold corresponding to the first local outlier factor; or,
    when the first distance is greater than the second distance and the value of the second local outlier factor of the data point of the second nonlinear combination feature set with respect to the small cluster is greater than the determination threshold corresponding to the second local outlier factor, determining that the network access is a safe access.
  7. The method according to claim 1, wherein the first nonlinear combination feature set or the second nonlinear combination feature set respectively comprises:
    effective derived feature information for identifying outliers, obtained by computing measures of data dispersion on the data points of the first nonlinear combination feature set or the second nonlinear combination feature set.
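The claim does not name the dispersion measures it has in mind. As a rough illustration only, the Python sketch below derives four common ones (range, population variance, population standard deviation, and interquartile range) from a list of numeric feature values; the function name, the dictionary keys, and the choice of measures are assumptions.

```python
import statistics


def dispersion_features(values):
    """Derive simple dispersion measures from numeric feature values.

    Hypothetical helper; needs at least two values because
    statistics.quantiles cannot work with fewer.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "range": max(values) - min(values),
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
        "iqr": q3 - q1,
    }


features = dispersion_features([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Measures like these grow when a device's parameters vary more than usual, which is why they can help separate outliers from ordinary accesses.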
  8. A network access security determination apparatus, comprising:
    a first generation module, configured to obtain feature information of a first device parameter according to the first device parameter of historical network access of a terminal device, and to generate a plurality of first nonlinear combination feature sets;
    a second generation module, configured to obtain, through a script program on the terminal device, a second device parameter of the current network access of the terminal device, extract feature information of the second device parameter, and generate a second nonlinear combination feature set;
    a calculation module, configured to use an unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, to calculate local outlier factors of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets, and to use the value of the maximum steep point of the local outlier factors as a determination threshold;
    a determination module, configured to determine that the network access is a safe access when the value of the local outlier factor of a data point of the second nonlinear combination feature set is greater than the determination threshold;
    wherein the first nonlinear combination feature set is nonlinear feature information of the terminal device obtained from historical network access; the second nonlinear combination feature set is nonlinear feature information of the terminal device obtained from the current network access; and the feature information includes attribute data and access data of the terminal device.
  9. A server, comprising:
    one or more processors;
    a memory;
    one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, the one or more computer programs being configured to execute the above network access security determination method, wherein the network access security determination method comprises:
    obtaining feature information of a first device parameter according to the first device parameter of historical network access of a terminal device, and generating a plurality of first nonlinear combination feature sets;
    obtaining, through a script program on the terminal device, a second device parameter of the current network access of the terminal device, extracting feature information of the second device parameter, and generating a second nonlinear combination feature set;
    using an unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, calculating local outlier factors of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets, and using the value of the maximum steep point of the local outlier factors as a determination threshold;
    determining that the network access is a safe access when the value of the local outlier factor of a data point of the second nonlinear combination feature set is greater than the determination threshold;
    wherein the first nonlinear combination feature set is nonlinear feature information of the terminal device obtained from historical network access; the second nonlinear combination feature set is nonlinear feature information of the terminal device obtained from the current network access; and the feature information includes attribute data and access data of the terminal device.
  10. The server according to claim 9, wherein the step of using the unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, to calculate the local outlier factors of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets comprises:
    using the unsupervised-clustering outlier detection algorithm to divide the first nonlinear combination feature sets into large clusters and small clusters;
    according to the data point of the second nonlinear combination feature set, and using the large clusters and small clusters of the first nonlinear combination feature sets, calculating either the first local outlier factor of the data point with respect to the large cluster or the second local outlier factor of the data point with respect to the small cluster;
    wherein the large clusters and the small clusters are divided according to the number of data points they contain and a set ratio value.
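The division into large and small clusters by point count and a set ratio could look like the Python sketch below. The claim gives no formula, so the rule used here is an assumption: clusters are ranked by size, and clusters count as large until together they cover the given share of all data points. The function name, the `ratio` parameter, and its default value are hypothetical.

```python
def split_clusters(cluster_sizes, ratio=0.8):
    """Split cluster ids into (large, small) lists.

    cluster_sizes maps a cluster id to its number of data points.
    Clusters are taken in descending size order and marked large until
    the large clusters cover `ratio` of all points (assumed rule).
    """
    total = sum(cluster_sizes.values())
    ordered = sorted(cluster_sizes, key=cluster_sizes.get, reverse=True)
    large, small, covered = [], [], 0
    for cluster_id in ordered:
        if covered < ratio * total:
            large.append(cluster_id)
            covered += cluster_sizes[cluster_id]
        else:
            small.append(cluster_id)
    return large, small


large_ids, small_ids = split_clusters({"a": 60, "b": 25, "c": 10, "d": 5})
```

Here clusters "a" and "b" together hold 85 of the 100 points, which exceeds the 80% share, so "c" and "d" are treated as small clusters.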
  11. The server according to claim 10, wherein the step of calculating, according to the data point of the second nonlinear combination feature set and using the large clusters and small clusters of the first nonlinear combination feature sets, either the first local outlier factor of the data point with respect to the large cluster or the second local outlier factor of the data point with respect to the small cluster comprises:
    obtaining, for the data point of the second nonlinear combination feature set, a first distance between the data point and the large cluster and a second distance between the data point and the small cluster;
    if the first distance is smaller than the second distance, obtaining a first local outlier factor of the data point of the second nonlinear combination feature set with respect to the large cluster, wherein the first local outlier factor is the product of the size value of the large cluster and the similarity between the data point and the large cluster;
    if the first distance is greater than the second distance, obtaining a second local outlier factor of the data point of the second nonlinear combination feature set with respect to the small cluster, wherein the second local outlier factor is the product of the size value of the small cluster and the similarity between the data point and the closest large cluster.
  12. The server according to claim 11, wherein the size value of the large cluster or the size value of the small cluster is measured by the corresponding number of data points of the plurality of first nonlinear combination feature sets;
    the similarity to the large cluster is measured by the distance between the data point of the second nonlinear combination feature set and the center of the large cluster.
  13. The server according to claim 12, wherein the step of using the value of the maximum steep point of the local outlier factor as the determination threshold comprises:
    selecting, from the local outlier factors of all data points of the second nonlinear combination feature set, the value of the local outlier factor with the largest slope as the value of the maximum steep point, and using the value of the maximum steep point as the determination threshold.
  14. The server according to claim 13, wherein the step of determining that the network access is a safe access when the value of the data point of the second nonlinear combination feature set is greater than the determination threshold comprises:
    when the first distance is greater than the second distance and the value of the first local outlier factor of the data point of the second nonlinear combination feature set with respect to the large cluster is greater than the determination threshold corresponding to the first local outlier factor; or,
    when the first distance is greater than the second distance and the value of the second local outlier factor of the data point of the second nonlinear combination feature set with respect to the small cluster is greater than the determination threshold corresponding to the second local outlier factor, determining that the network access is a safe access.
  15. A non-volatile computer-readable storage medium, storing a computer program which, when executed by a processor, implements the above network access security determination method, wherein the network access security determination method comprises:
    obtaining feature information of a first device parameter according to the first device parameter of historical network access of a terminal device, and generating a plurality of first nonlinear combination feature sets;
    obtaining, through a script program on the terminal device, a second device parameter of the current network access of the terminal device, extracting feature information of the second device parameter, and generating a second nonlinear combination feature set;
    using an unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, calculating local outlier factors of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets, and using the value of the maximum steep point of the local outlier factors as a determination threshold;
    determining that the network access is a safe access when the value of the local outlier factor of a data point of the second nonlinear combination feature set is greater than the determination threshold;
    wherein the first nonlinear combination feature set is nonlinear feature information of the terminal device obtained from historical network access; the second nonlinear combination feature set is nonlinear feature information of the terminal device obtained from the current network access; and the feature information includes attribute data and access data of the terminal device.
  16. The non-volatile computer-readable storage medium according to claim 15, wherein the step of using the unsupervised-clustering outlier detection algorithm, with the data points of the second nonlinear combination feature set as detection parameters, to calculate the local outlier factors of the data points of the second nonlinear combination feature set according to the clustering of the first nonlinear combination feature sets comprises:
    using the unsupervised-clustering outlier detection algorithm to divide the first nonlinear combination feature sets into large clusters and small clusters;
    according to the data point of the second nonlinear combination feature set, and using the large clusters and small clusters of the first nonlinear combination feature sets, calculating either the first local outlier factor of the data point with respect to the large cluster or the second local outlier factor of the data point with respect to the small cluster;
    wherein the large clusters and the small clusters are divided according to the number of data points they contain and a set ratio value.
  17. The non-volatile computer-readable storage medium according to claim 16, wherein the step of calculating, according to the data point of the second nonlinear combination feature set and using the large clusters and small clusters of the first nonlinear combination feature sets, either the first local outlier factor of the data point with respect to the large cluster or the second local outlier factor of the data point with respect to the small cluster comprises:
    obtaining, for the data point of the second nonlinear combination feature set, a first distance between the data point and the large cluster and a second distance between the data point and the small cluster;
    if the first distance is smaller than the second distance, obtaining a first local outlier factor of the data point of the second nonlinear combination feature set with respect to the large cluster, wherein the first local outlier factor is the product of the size value of the large cluster and the similarity between the data point and the large cluster;
    if the first distance is greater than the second distance, obtaining a second local outlier factor of the data point of the second nonlinear combination feature set with respect to the small cluster, wherein the second local outlier factor is the product of the size value of the small cluster and the similarity between the data point and the closest large cluster.
  18. The non-volatile computer-readable storage medium according to claim 17, wherein the size value of the large cluster or the size value of the small cluster is measured by the corresponding number of data points of the plurality of first nonlinear combination feature sets;
    the similarity to the large cluster is measured by the distance between the data point of the second nonlinear combination feature set and the center of the large cluster.
  19. The non-volatile computer-readable storage medium according to claim 18, wherein the step of using the value of the maximum steep point of the local outlier factor as the determination threshold comprises:
    selecting, from the local outlier factors of all data points of the second nonlinear combination feature set, the value of the local outlier factor with the largest slope as the value of the maximum steep point, and using the value of the maximum steep point as the determination threshold.
  20. The non-volatile computer-readable storage medium according to claim 19,
    wherein the step of determining that the network access is a safe access when the value of the data point of the second nonlinear combination feature set is greater than the determination threshold comprises:
    when the first distance is greater than the second distance and the value of the first local outlier factor of the data point of the second nonlinear combination feature set with respect to the large cluster is greater than the determination threshold corresponding to the first local outlier factor; or,
    when the first distance is greater than the second distance and the value of the second local outlier factor of the data point of the second nonlinear combination feature set with respect to the small cluster is greater than the determination threshold corresponding to the second local outlier factor, determining that the network access is a safe access.
PCT/CN2019/103646 2019-06-28 2019-08-30 Network access security determination method and apparatus WO2020258505A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910578479.X 2019-06-28
CN201910578479.XA CN110417744B (en) 2019-06-28 2019-06-28 Security determination method and device for network access

Publications (1)

Publication Number Publication Date
WO2020258505A1 (en) 2020-12-30

Family

ID=68358705

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103646 WO2020258505A1 (en) 2019-06-28 2019-08-30 Network access security determination method and apparatus

Country Status (2)

Country Link
CN (1) CN110417744B (en)
WO (1) WO2020258505A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117272198A (en) * 2023-09-08 2023-12-22 广东美亚商旅科技有限公司 Abnormal user generated content identification method based on business travel business data

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104618175A (en) * 2014-12-19 2015-05-13 上海电机学院 Network abnormity detection method
CN106294529A (en) * 2015-06-29 2017-01-04 阿里巴巴集团控股有限公司 A kind of identification user's abnormal operation method and apparatus
US20170061322A1 (en) * 2015-08-31 2017-03-02 International Business Machines Corporation Automatic generation of training data for anomaly detection using other user's data samples
US20170124478A1 (en) * 2015-10-30 2017-05-04 Citrix Systems, Inc. Anomaly detection with k-means clustering and artificial outlier injection
CN106982196A (en) * 2016-01-19 2017-07-25 阿里巴巴集团控股有限公司 A kind of abnormal access detection method and equipment
CN107579956A (en) * 2017-08-07 2018-01-12 北京奇安信科技有限公司 The detection method and device of a kind of user behavior
CN108809745A (en) * 2017-05-02 2018-11-13 中国移动通信集团重庆有限公司 A kind of user's anomaly detection method, apparatus and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10992675B2 (en) * 2014-04-14 2021-04-27 Oracle International Corporation Anomaly detection using tripoint arbitration
CN106101102B (en) * 2016-06-15 2019-07-26 华东师范大学 A kind of exception flow of network detection method based on PAM clustering algorithm
CN109067725B (en) * 2018-07-24 2021-05-14 成都亚信网络安全产业技术研究院有限公司 Network flow abnormity detection method and device
CN109714311B (en) * 2018-11-15 2021-12-31 北京天地和兴科技有限公司 Abnormal behavior detection method based on clustering algorithm
CN109753991A (en) * 2018-12-06 2019-05-14 中科恒运股份有限公司 Abnormal deviation data examination method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
GERHARD MÜNZ ET AL.: "Traffic Anomaly Detection Using KMeans Clustering", GI/ITG WORKSHOP MMBNET, 31 December 2007 (2007-12-31), XP055361488, DOI: 20200226144930A *
LUO, MIN ET AL.: "An Unsupervised Clustering-Based Intrusion Detection Method", ACTA ELECTRONICA SINICA, vol. 31, no. 11, 30 November 2003 (2003-11-30), DOI: 20200226144440A *
TIAN HUANG ET AL.: "An LOF-based Adaptive Anomaly Detection Scheme for Cloud Computing", 2013 IEEE 37TH ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE WORKSHOPS, 31 December 2013 (2013-12-31), XP055770024, DOI: 20200226144729A *
YAN, XIAOGUANG ET AL.: "Application to Cluster Algorithm in Anomaly Detection of Network Intrusion", COMPUTER SYSTEMS & APPLICATIONS, no. 10, 31 December 2005 (2005-12-31), DOI: 20200226150743A *

Also Published As

Publication number Publication date
CN110417744A (en) 2019-11-05
CN110417744B (en) 2021-12-24

Similar Documents

Publication Publication Date Title
CN113574838B (en) System and method for filtering internet traffic through client fingerprint
US20180075240A1 (en) Method and device for detecting a suspicious process by analyzing data flow characteristics of a computing device
TW201730766A (en) Method and apparatus for abnormal access detection
TWI703468B (en) Suspicious event analysis device and related computer program product for generating suspicious event sequence diagram
US20180052755A1 (en) System status visualization method and system status visualization device
US20220189008A1 (en) Method for detecting data defects and computing device utilizing method
US11315010B2 (en) Neural networks for detecting fraud based on user behavior biometrics
US11715046B2 (en) Enhancing data-analytic visualizations with machine learning
WO2020258505A1 (en) Network access security determination method and apparatus
WO2023108833A1 (en) Terminal anomalous behavior detection method and apparatus, device, and storage medium
Jacob et al. Detecting Cyber Security Attacks against a Microservices Application using Distributed Tracing.
WO2020258509A1 (en) Method and device for isolating abnormal access of terminal device
WO2020000631A1 (en) Virtual currency value estimation method and apparatus, electronic device and storage medium
Kent et al. Differentiating user authentication graphs
WO2024031881A1 (en) Operation behavior recognition method and apparatus
KR20210110765A (en) Method for providing ai-based big data de-identification solution
US9929921B2 (en) Techniques for workload toxic mapping
CN110311909B (en) Method and device for judging abnormity of network access of terminal equipment
Sapegin et al. Evaluation of in‐memory storage engine for machine learning analysis of security events
US11010342B2 (en) Network activity identification and characterization based on characteristic active directory (AD) event segments
CN113065126A (en) Personal information compliance method and device based on distributed data sandbox
CN114595765A (en) Data processing method and device, electronic equipment and storage medium
CN114285596A (en) Transformer substation terminal account abnormity detection method based on machine learning
US20210385235A1 (en) Security analysis assistance apparatus, security analysis assistance method, and computer-readable recording medium
CN112000958A (en) Method and device for detecting logic bugs during application program login and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19935331

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19935331

Country of ref document: EP

Kind code of ref document: A1