CN114399190B

CN114399190B - Risk behavior identification method and system for big data information security

Info

Publication number: CN114399190B
Application number: CN202210026614.1A
Authority: CN
Inventors: 张春艳
Original assignee: Shenzhen Dingbang Information Technology Co ltd
Current assignee: Shenzhen Dingbang Information Technology Co ltd
Priority date: 2022-01-11
Filing date: 2022-01-11
Publication date: 2022-10-04
Anticipated expiration: 2042-01-11
Also published as: CN114399190A; CN115456390A

Abstract

The invention provides a risk behavior identification method and system for big data information security. The method optimizes user operation behavior expectation descriptions by means of a security threat tag pairing index, so that the descriptions corresponding to risk behavior big data with the same data information security threat tag show little feature discrimination, while the descriptions corresponding to risk behavior big data with different data information security threat tags show as much feature discrimination as possible. This helps make the user operation behavior expectation descriptions robust to interference, helps capture their overall positioning description (for example, their distribution), and helps ensure the accuracy and reliability of data information security threat tag positioning.

Description

A method and system for identifying risk behaviors for big data information security

Technical Field

The invention relates to the technical field of big data, and in particular to a risk behavior identification method and system for big data information security.

Background

With the advent of the big data era, enterprise data has surged, and data is now scattered across many locations, including the cloud, mobile devices, relational databases, big data platforms, PCs, and collectors, which poses greater challenges for data security. The diversity of big data services, the dispersion of data, the wide variety of systems, and the complexity of application environments mean that data may be at risk at most stages. Targeted technical measures for big data information security are therefore needed, so that reasonable, comprehensive management and control can achieve security compliance and security protection. To ensure the quality of big data information security protection, an upstream step usually has to identify and locate the different risks or threats. However, related technologies struggle to ensure the accuracy and reliability of that positioning.

Summary of the Invention

The invention provides a risk behavior identification method and system for big data information security. To achieve the above technical purpose, the present application adopts the following technical solutions.

In a first aspect, a risk behavior identification method for big data information security includes: determining user operation behavior expectation descriptions of several pieces of risk behavior big data and a security threat tag pairing index of at least one risk behavior big data pair, where the several pieces of risk behavior big data include risk behavior big data used to assist security threat tag positioning and risk behavior big data awaiting security threat tag positioning, every two pieces of risk behavior big data form one risk behavior big data pair, and the security threat tag pairing index is a quantitative evaluation of whether a risk behavior big data pair points to the same data information security threat tag; optimizing the user operation behavior expectation descriptions of the several pieces of risk behavior big data by means of the security threat tag pairing index; and obtaining, from the optimized user operation behavior expectation descriptions, a security threat tag positioning result for the risk behavior big data awaiting security threat tag positioning, where the security threat tag positioning result reflects the data information security threat tag corresponding to that risk behavior big data.

Implementing the above, the user operation behavior expectation descriptions of several pieces of risk behavior big data and the security threat tag pairing index of at least one risk behavior big data pair are determined. The several pieces of risk behavior big data include risk behavior big data used to assist security threat tag positioning and risk behavior big data awaiting security threat tag positioning, every two pieces form one risk behavior big data pair, and the security threat tag pairing index is a quantitative evaluation of whether a pair points to the same data information security threat tag. The user operation behavior expectation descriptions are then optimized by means of the security threat tag pairing index, and the optimized descriptions yield the security threat tag positioning result for the risk behavior big data awaiting security threat tag positioning, which reflects the data information security threat tag corresponding to that risk behavior big data.

In this way, optimizing the user operation behavior expectation descriptions by means of the security threat tag pairing index brings the descriptions corresponding to risk behavior big data with the same data information security threat tag into a state of low feature discrimination, and keeps the descriptions corresponding to risk behavior big data with different data information security threat tags in a state of high feature discrimination as far as possible. This helps make the user operation behavior expectation descriptions robust to interference and helps capture their overall positioning description (for example, their distribution), which in turn helps ensure the accuracy and reliability of data information security threat tag positioning.

In an exemplary embodiment, determining the security threat tag positioning result of the risk behavior big data awaiting security threat tag positioning from the optimized user operation behavior expectation descriptions includes: performing an AI-based classification and identification operation on the optimized user operation behavior expectation descriptions to obtain a classification and identification result, where the classification and identification result includes a first tag positioning confidence that the risk behavior big data awaiting security threat tag positioning points to at least one prior security threat tag, and a prior security threat tag is a data information security threat tag corresponding to the risk behavior big data used to assist security threat tag positioning; and obtaining the security threat tag positioning result based on the first tag positioning confidence.

In this way, an AI-based classification and identification operation is performed on the optimized user operation behavior expectation descriptions to obtain a classification and identification result that includes the first tag positioning confidence that the risk behavior big data awaiting security threat tag positioning points to at least one prior security threat tag, and the security threat tag positioning result is obtained based on that confidence. Because identification is carried out on descriptions that have already been optimized by means of the security threat tag pairing index, the first tag positioning confidence is obtained on a sounder basis, which improves the accuracy of identification.

In an exemplary embodiment, the classification and identification result further includes a second tag positioning confidence that the risk behavior big data used to assist security threat tag positioning points to at least one prior security threat tag. Before the security threat tag positioning result is obtained based on the first tag positioning confidence, the method further includes: when the cumulative count of AI-based classification and identification operations performed meets a specified requirement, optimizing the security threat tag pairing index by means of the classification and identification result, and performing again the step of optimizing the user operation behavior expectation descriptions of the several pieces of risk behavior big data by means of the security threat tag pairing index; and when the cumulative count does not meet the specified requirement, obtaining the security threat tag positioning result based on the first tag positioning confidence.

With this design, the classification and identification result is configured to further include the second tag positioning confidence that the risk behavior big data used to assist security threat tag positioning points to at least one prior security threat tag. Before the security threat tag positioning result is obtained based on the first tag positioning confidence, and as long as the cumulative count of AI-based classification and identification operations meets the specified requirement, the security threat tag pairing index is optimized by means of the classification and identification result, and the step of optimizing the user operation behavior expectation descriptions by means of the security threat tag pairing index is performed again. The pairing index is thus optimized using both the first tag positioning confidence of the risk behavior big data awaiting security threat tag positioning and the second tag positioning confidence of the risk behavior big data used to assist security threat tag positioning, which makes the security threat tag differentiation degree more robust to interference. The optimized security threat tag differentiation degree is in turn used to keep optimizing the user operation behavior expectation descriptions, which makes the descriptions more robust as well, so that the security threat tag differentiation degree and the user operation behavior expectation descriptions complement each other. Once the cumulative count no longer meets the specified requirement, the security threat tag positioning result is obtained based on the first tag positioning confidence, which helps improve the accuracy and reliability of data information security threat tag positioning.
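The alternation described above (optimize the descriptions, classify, then either refine the pairing index or stop) can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the patented implementation: the callables `optimize`, `classify`, and `update_pairing` and the parameter `max_rounds` are hypothetical placeholders, and the specified requirement is taken, as in one embodiment of this document, to mean that the number of classification rounds performed is still below a set value.

```python
def identify_tags(descriptions, pairing, optimize, classify, update_pairing,
                  max_rounds=3):
    """Alternate description optimization and pairing-index updates, then pick tags.

    All callables are hypothetical stand-ins supplied by the caller:
      optimize(descriptions, pairing) -> optimized descriptions
      classify(descriptions) -> (first_conf, second_conf), where first_conf maps
          each item awaiting positioning to a {tag: confidence} dict
      update_pairing(pairing, first_conf, second_conf) -> refined pairing index
    """
    rounds = 0
    while True:
        descriptions = optimize(descriptions, pairing)
        first_conf, second_conf = classify(descriptions)
        rounds += 1
        if rounds < max_rounds:
            # Requirement met: refine the pairing index and optimize again.
            pairing = update_pairing(pairing, first_conf, second_conf)
            continue
        # Requirement no longer met: read the result off the first confidences.
        return {item: max(conf, key=conf.get)
                for item, conf in first_conf.items()}
```

Choosing the highest-confidence tag in the final step is one plausible reading of "obtaining the positioning result based on the first tag positioning confidence"; the source does not fix the decision rule.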

In an exemplary embodiment, the security threat tag pairing index includes a target tag positioning confidence that each risk behavior big data pair points to the same data information security threat tag. Optimizing the security threat tag pairing index by means of the classification and identification result includes: taking each piece of risk behavior big data among the several pieces in turn as the current risk behavior big data, and taking the risk behavior big data pairs that contain the current risk behavior big data as current risk behavior big data pairs; determining a global calculation result of the target tag positioning confidences of all current risk behavior big data pairs of the current risk behavior big data, as a global quantitative indicator of the current risk behavior big data; determining in turn, from the first tag positioning confidence and the second tag positioning confidence, a prior tag positioning confidence that each current risk behavior big data pair points to the same data information security threat tag; and changing the target tag positioning confidence of each current risk behavior big data pair by means of the global quantitative indicator and the prior tag positioning confidence.

With this design, the security threat tag pairing index is configured to include the target tag positioning confidence that each risk behavior big data pair points to the same data information security threat tag. Each piece of risk behavior big data is taken in turn as the current risk behavior big data, and the pairs that contain it are taken as the current risk behavior big data pairs. The target tag positioning confidences of all current risk behavior big data pairs are combined into the global quantitative indicator of the current risk behavior big data, and the first and second tag positioning confidences are used to determine, pair by pair, the prior tag positioning confidence that each pair points to the same data information security threat tag. The target tag positioning confidence of each current risk behavior big data pair is then changed by means of the global quantitative indicator and the prior tag positioning confidence. Optimizing the security threat tag pairing index through the prior tag positioning confidence of each current pair makes it easier to treat the data information security threat tags of the risk behavior big data globally, and improves the accuracy of the security threat tag pairing index.
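The pairing index update can be illustrated with a small sketch. The source names the ingredients (a global quantitative indicator per item and a prior tag positioning confidence per pair) but does not give formulas, so everything concrete here is an assumption: the global indicator is taken to be the sum of the current pair confidences of an item, the prior confidence of a pair is taken to be the probability that both items carry the same tag under their individual tag confidences, and the update multiplies the two and normalizes. The function names are hypothetical.

```python
def same_tag_probability(conf_i, conf_j):
    """Assumed prior confidence: chance that two items share a tag."""
    return sum(p * conf_j.get(tag, 0.0) for tag, p in conf_i.items())


def update_pairing(pairing, confidences):
    """Return a refined pairing matrix (illustrative rule, not from the source).

    pairing: square matrix, pairing[i][j] is the current target confidence
        that items i and j point to the same tag.
    confidences: one {tag: confidence} dict per item, combining the first
        confidences (items awaiting positioning) and the second confidences
        (items used to assist positioning).
    """
    n = len(pairing)
    updated = [row[:] for row in pairing]
    for i in range(n):
        # Assumed global indicator: sum over all pairs containing item i.
        global_indicator = sum(pairing[i][j] for j in range(n) if j != i)
        for j in range(n):
            if j == i:
                continue
            prior = same_tag_probability(confidences[i], confidences[j])
            if global_indicator > 0:
                updated[i][j] = prior * pairing[i][j] / global_indicator
            else:
                updated[i][j] = prior
    return updated
```

Because each row is normalized by its own global indicator, the result is generally not symmetric; a symmetric variant would need an additional averaging step.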

In an exemplary embodiment, performing the AI-based classification and identification operation on the optimized user operation behavior expectation descriptions to obtain the classification and identification result includes: identifying, from the optimized user operation behavior expectation descriptions, the identified security threat tags corresponding to the risk behavior big data awaiting security threat tag positioning and to the risk behavior big data used to assist security threat tag positioning, where each identified security threat tag points to at least one prior security threat tag; for each risk behavior big data pair, determining a security threat tag difference analysis result and an expectation description commonality index of the pair, and obtaining a first binding score of the pair between the security threat tag difference analysis result and the expectation description commonality index, where the security threat tag difference analysis result reflects whether the identified security threat tags of the pair are consistent, and the expectation description commonality index reflects the degree of differentiation between the user operation behavior expectation descriptions of the pair; obtaining, based on the identified security threat tag and the prior security threat tag of the risk behavior big data used to assist security threat tag positioning, a second binding score of that risk behavior big data between the identified security threat tag and the prior security threat tag; and obtaining the classification and identification result from the first binding score and the second binding score.

In this way, the optimized user operation behavior expectation descriptions are used to identify the identified security threat tags corresponding to the risk behavior big data awaiting security threat tag positioning and to the risk behavior big data used to assist security threat tag positioning, where each identified security threat tag points to at least one prior security threat tag. For each risk behavior big data pair, the security threat tag difference analysis result and the expectation description commonality index are determined, and the first binding score between them is obtained. The difference analysis result reflects whether the identified security threat tags of the pair are consistent, and the commonality index reflects the degree of differentiation between the user operation behavior expectation descriptions of the pair. Based on the identified security threat tag and the prior security threat tag of the risk behavior big data used to assist security threat tag positioning, the second binding score between the two tags is obtained, and the classification and identification result is then obtained from the first and second binding scores. The first binding score thus reflects the accuracy of the data information security threat tag analysis at the level of any risk behavior big data pair, on the basis of the binding between the tag difference analysis result and the commonality index. The second binding score reflects that accuracy at the level of an individual piece of risk behavior big data, on the basis of the binding between the identified security threat tag and the prior security threat tag. Determining the classification and identification result from both the pair level and the individual level helps improve its accuracy.

In an exemplary embodiment, when the security threat tag difference analysis result is that the identified security threat tags are consistent, the expectation description commonality index has a first set relationship with the first binding score; when the security threat tag difference analysis result is that the identified security threat tags are inconsistent, the expectation description commonality index has a second set relationship with the first binding score; and the second binding score when the identified security threat tag is consistent with the prior security threat tag is higher than the second binding score when the identified security threat tag is inconsistent with the prior security threat tag.

With this design, when the security threat tag difference analysis result is that the identified security threat tags are consistent, the expectation description commonality index is configured to have the first set relationship with the first binding score, and when the result is that the identified security threat tags are inconsistent, it is configured to have the second set relationship. Accordingly, when the identified tags are consistent, a higher expectation description commonality index gives a higher first binding score with the tag comparison result, meaning that the commonality index agrees more closely with the security threat tag difference analysis result. When the identified tags are inconsistent, a higher expectation description commonality index gives a lower first binding score, meaning that the commonality index disagrees with the security threat tag difference analysis result. This makes it easier, in the subsequent process of determining the classification and identification result, to obtain a quantitative evaluation of whether two pieces of risk behavior big data share the same data information security threat tag, which helps improve the accuracy of the classification and identification result. In addition, because the second binding score when the identified security threat tag is consistent with the prior security threat tag is higher than when it is inconsistent, the subsequent process can capture how accurate the user operation behavior expectation description of an individual piece of risk behavior big data is, which further helps improve the accuracy of the classification and identification result.
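The two binding scores can be illustrated with a short sketch. The source only states the direction of each relationship (rising with commonality when the identified tags agree, falling when they disagree, and a higher second score when the identified tag matches the prior tag). The concrete forms below are assumptions chosen for illustration: commonality is derived from Euclidean distance, the first and second set relationships are taken to be `c` and `1 - c`, and the `match` and `mismatch` values are hypothetical constants.

```python
import math


def commonality_index(desc_a, desc_b):
    """Assumed commonality in (0, 1]: 1 for identical descriptions."""
    return 1.0 / (1.0 + math.dist(desc_a, desc_b))


def first_binding_score(tag_a, tag_b, commonality):
    """Pair-level score: agrees with commonality when the identified tags match."""
    if tag_a == tag_b:
        return commonality          # assumed first set relationship
    return 1.0 - commonality        # assumed second set relationship


def second_binding_score(identified_tag, prior_tag, match=0.9, mismatch=0.1):
    """Item-level score for data used to assist positioning (known prior tag)."""
    return match if identified_tag == prior_tag else mismatch


# Example: two similar descriptions that were assigned different tags
# receive a low pair-level score, flagging a likely inconsistency.
c = commonality_index([0.2, 0.4], [0.2, 0.5])
pair_score = first_binding_score("phishing", "exfiltration", c)
```

How the two scores are combined into the classification and identification result is left to a later embodiment of the source and is not sketched here.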

In an exemplary embodiment, identifying the identified security threat tags corresponding to the risk behavior big data from the optimized user operation behavior expectation descriptions includes: identifying, based on a naive Bayes classification model and from the optimized user operation behavior expectation descriptions, the identified security threat tags corresponding to the risk behavior big data.

With this design, using a naive Bayes classification model on the optimized user operation behavior expectation descriptions to identify the identified security threat tags corresponding to the risk behavior big data awaiting security threat tag positioning and to the risk behavior big data used to assist security threat tag positioning helps improve both the accuracy and the efficiency of identification.
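As a rough illustration of naive Bayes tagging, the sketch below fits a Gaussian naive Bayes model on descriptions with known tags and returns per-tag confidences for a new description. It assumes that each user operation behavior expectation description is a numeric feature vector and that features are conditionally independent Gaussians per tag; the source specifies neither, and the function names and the `var_floor` parameter are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(descriptions, tags, var_floor=1e-6):
    """Estimate per-tag prior, feature means, and feature variances."""
    grouped = defaultdict(list)
    for vector, tag in zip(descriptions, tags):
        grouped[tag].append(vector)
    total = len(descriptions)
    model = {}
    for tag, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [
            sum((x - m) ** 2 for x in col) / len(col) + var_floor
            for col, m in zip(columns, means)
        ]
        model[tag] = (len(rows) / total, means, variances)
    return model


def tag_confidences(model, vector):
    """Return normalized posterior confidences {tag: probability}."""
    log_scores = {}
    for tag, (prior, means, variances) in model.items():
        log_p = math.log(prior)
        for x, m, v in zip(vector, means, variances):
            log_p += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        log_scores[tag] = log_p
    peak = max(log_scores.values())  # subtract the peak for numerical stability
    weights = {tag: math.exp(s - peak) for tag, s in log_scores.items()}
    norm = sum(weights.values())
    return {tag: w / norm for tag, w in weights.items()}
```

The identified security threat tag of an item would then be the tag with the highest confidence, for example `max(conf, key=conf.get)`.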

In an exemplary embodiment, obtaining the classification and identification result from the first binding score and the second binding score includes: obtaining the classification and identification result from the first binding score and the second binding score based on a directed transfer algorithm.

With this design, obtaining the classification and identification result from the first binding score and the second binding score based on a directed transfer algorithm effectively improves the accuracy of the classification and identification result.

In an exemplary embodiment, the specified requirement includes: the cumulative count of AI-based classification and identification operations performed is less than a set determination value.

With this design, the specified requirement is configured as the cumulative count of AI-based classification and identification operations being less than the set determination value. Repeating the processing until the set determination value is reached allows the security threat tag relationships among the risk behavior big data to be captured comprehensively during data information security threat tag identification, which helps ensure the accuracy and reliability of data information security threat tag positioning.

In an exemplary embodiment, the step of optimizing the user operation behavior expectation descriptions of the several pieces of risk behavior big data by means of the security threat tag pairing index is implemented by a visual AI machine learning model.

With this design, implementing the step of optimizing the user operation behavior expectation descriptions by means of the security threat tag pairing index with a visual AI machine learning model helps improve the timeliness of the optimization.

In an exemplary embodiment, optimizing the user operation behavior expectation descriptions of the several pieces of risk behavior big data by means of the security threat tag pairing index includes: obtaining neighbor user operation behavior expectation descriptions and non-neighbor user operation behavior expectation descriptions from the security threat tag pairing index and the user operation behavior expectation descriptions; and performing expectation description optimization using the neighbor and non-neighbor user operation behavior expectation descriptions to obtain the optimized user operation behavior expectation descriptions.

With this design, neighbor user operation behavior expectation descriptions and non-neighbor user operation behavior expectation descriptions are obtained from the security threat tag pairing index and the user operation behavior expectation descriptions, and the expectation descriptions are then optimized from both the neighbor and the non-neighbor perspective. This improves the precision of the optimization of the user operation behavior expectation descriptions.

In an illustrative embodiment, the risk behavior identification method for big data information security further includes: when a risk behavior big data 2-tuple points to the same data information security threat tag, determining the initial security threat tag pairing index of the 2-tuple as a first quantitative constraint; when the 2-tuple points to different data information security threat tags, determining the initial security threat tag pairing index of the 2-tuple as a second quantitative constraint; and when at least one member of the 2-tuple is risk behavior big data awaiting security threat tag positioning, determining the initial security threat tag pairing index of the 2-tuple as a set quantitative result lying between the second quantitative constraint and the first quantitative constraint.

With this design, the initial security threat tag pairing index of a risk behavior big data 2-tuple is set to the first quantitative constraint when both members point to the same data information security threat tag, to the second quantitative constraint when they point to different tags, and to a set quantitative result between the two constraints when at least one member is risk behavior big data awaiting security threat tag positioning. The first quantitative constraint, the second quantitative constraint and the set quantitative result thus reflect the quantitative evaluation of whether the members of a 2-tuple share the same data information security threat tag, which facilitates subsequent operations and helps ensure the flexibility and precision of the security threat tag pairing index.

A second aspect provides a risk behavior identification system including a memory and a processor. The memory is coupled to the processor and is used to store computer program code, the computer program code including computer instructions. When the processor executes the computer instructions, the risk behavior identification system is caused to perform the method of the first aspect.

Description of the drawings

FIG. 1 is a schematic flowchart of a risk behavior identification method for big data information security according to an embodiment of the present invention.

FIG. 2 is a block diagram of the modules of a risk behavior identification apparatus for big data information security according to an embodiment of the present invention.

Detailed description

Hereinafter, the terms "first", "second" and "third" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or as implicitly specifying the number of the indicated technical features. Thus, a feature qualified by "first", "second" or "third" may explicitly or implicitly include one or more of that feature.

FIG. 1 shows a schematic flowchart of the risk behavior identification method for big data information security provided by an embodiment of the present invention. The method can be implemented by a risk behavior identification system, which may include a memory and a processor. The memory is coupled to the processor and is used to store computer program code, the computer program code including computer instructions. When the processor executes the computer instructions, the risk behavior identification system carries out the technical solution described in the following steps.

Step 11: Determine the user operation behavior expectation descriptions of several risk behavior big data and the security threat tag pairing index of at least one risk behavior big data 2-tuple.

In this embodiment of the present invention, the several risk behavior big data include risk behavior big data awaiting security threat tag positioning and risk behavior big data used to assist security threat tag positioning. In a specific implementation, the former is risk behavior big data whose data information security threat tag has not been determined, and the latter is risk behavior big data whose data information security threat tag has already been determined. For example, the risk behavior big data used to assist positioning may include risk behavior big data tagged "information tampering event" and risk behavior big data tagged "information leakage event". The risk behavior big data awaiting positioning covers a potential analysis item, but it has not been determined whether that item points to an "information tampering event" or an "information leakage event". On this basis, the steps of this embodiment can be used to identify which of the two it points to. Other application scenarios can be implemented along similar lines and are not described further here.

For example, to improve the quality of the mined user operation behavior expectation descriptions, a risk behavior big data identification thread can be tuned in advance. The thread includes a data mining unit that mines the user operation behavior expectation descriptions of both the risk behavior big data awaiting security threat tag positioning and the risk behavior big data used to assist security threat tag positioning.

For example, after the risk behavior big data awaiting security threat tag positioning and the risk behavior big data used to assist security threat tag positioning are processed by the data mining unit, user operation behavior expectation descriptions of a set dimensionality (for example, x dimensions) are obtained. In a specific implementation, a user operation behavior expectation description can be represented as a feature map.

In this embodiment of the present invention, every two risk behavior big data among the several risk behavior big data form one risk behavior big data 2-tuple (which can be understood as a risk behavior big data pair). For example, if the several risk behavior big data contain auxiliary_big_data1 and auxiliary_big_data2, both used to assist security threat tag positioning, and target_big_data1, which awaits security threat tag positioning, then the risk behavior big data 2-tuples may include: auxiliary_big_data1 and target_big_data1; auxiliary_big_data2 and target_big_data1; and auxiliary_big_data1 and target_big_data1. Other application scenarios can be implemented along similar lines and are not described further here.
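As a non-limiting illustration of how every two risk behavior big data can be grouped into a 2-tuple, the following minimal sketch enumerates all unordered pairs of the three data sets named above. The function name `build_pairs` is a hypothetical helper introduced only for this illustration; enumerating all pairs also yields the pair formed by the two auxiliary data sets.

```python
from itertools import combinations


def build_pairs(names):
    """Return every unordered 2-tuple that can be formed from the given names."""
    return list(combinations(names, 2))


# Data-set names taken from the example above.
datasets = ["auxiliary_big_data1", "auxiliary_big_data2", "target_big_data1"]
pairs = build_pairs(datasets)

for first, second in pairs:
    print(first, second)
```

For n risk behavior big data this produces n(n-1)/2 2-tuples, so the three data sets above give three pairs.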

For example, the security threat tag pairing index, which quantitatively evaluates whether a risk behavior big data 2-tuple points to the same data information security threat tag, may include the target tag positioning confidence that the 2-tuple points to the same data information security threat tag. For example, when the target tag positioning confidence is 0.45, it can be determined that the quantitative evaluation of the 2-tuple pointing to the same tag is high; when the target tag positioning confidence is 0.05, it can be determined that this quantitative evaluation is low; and when the target tag positioning confidence is 0.25, it can be determined that the quantitative evaluation of the 2-tuple pointing to the same tag equals that of it pointing to different tags.

For example, when the steps of this embodiment are first performed, the security threat tag pairing index of each risk behavior big data 2-tuple can be initialized. In a specific implementation, when the 2-tuple points to the same data information security threat tag, its initial security threat tag pairing index can be determined as the first quantitative constraint; for example, when the pairing index is expressed by the above target tag positioning confidence, the first quantitative constraint can be set to 1. When the 2-tuple points to different data information security threat tags, its initial pairing index is determined as the second quantitative constraint; for example, when the pairing index is expressed by the target tag positioning confidence, the second quantitative constraint can be set to 0. In addition, because the risk behavior big data awaiting security threat tag positioning is the data still to be identified, the pairing index of a 2-tuple in which at least one member awaits positioning is difficult to determine precisely. To improve the anti-interference performance of the initialized pairing index, the index can in that case be determined as a set quantitative result between the second quantitative constraint and the first quantitative constraint; for example, when the pairing index is expressed by the target tag positioning confidence, the set quantitative result can be set to 0.25, or to 0.2, 0.3 or 0.35 depending on the actual situation, which is not limited here.
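As a non-limiting illustration of the initialization described above, the following minimal sketch assigns 1 to pairs sharing a tag, 0 to pairs with different tags, and 0.25 to pairs in which at least one member awaits positioning. The function `init_pairing_index`, the use of `None` to mark untagged data, and the data set auxiliary_big_data3 are assumptions introduced only for this illustration.

```python
from itertools import combinations

FIRST_CONSTRAINT = 1.0   # both members point to the same tag
SECOND_CONSTRAINT = 0.0  # members point to different tags
SET_RESULT = 0.25        # at least one member awaits tag positioning


def init_pairing_index(tags, set_result=SET_RESULT):
    """Initialize the pairing index of every 2-tuple.

    tags maps a data-set name to its tag, or to None when the data set
    still awaits security threat tag positioning (assumed convention).
    """
    index = {}
    for a, b in combinations(sorted(tags), 2):
        if tags[a] is None or tags[b] is None:
            index[(a, b)] = set_result
        elif tags[a] == tags[b]:
            index[(a, b)] = FIRST_CONSTRAINT
        else:
            index[(a, b)] = SECOND_CONSTRAINT
    return index


tags = {
    "auxiliary_big_data1": "information leakage event",
    "auxiliary_big_data2": "information tampering event",
    "auxiliary_big_data3": "information leakage event",  # hypothetical extra data set
    "target_big_data1": None,
}
pairing_index = init_pairing_index(tags)
```

Passing a different `set_result` (for example 0.2, 0.3 or 0.35) covers the alternative configurations mentioned above.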

For example, in combination with the above, the risk behavior big data used to assist security threat tag positioning may cover U kinds of data information security threat tags in total, with X groups of such risk behavior big data for each kind of tag, where U is an integer not less than 1 and X is an integer not less than 1. The embodiments of the risk behavior identification method for big data information security of the present invention can be used in application environments in which tagged risk behavior big data used to assist security threat tag positioning is relatively important, for example differential positioning of payment risk behavior big data, differential positioning of online office risk behavior big data, and so on.

Step 12: Optimize the user operation behavior expectation descriptions of the several risk behavior big data through the security threat tag pairing index.

For example, to improve the efficiency of optimizing the user operation behavior expectation descriptions, a risk behavior big data identification thread can be tuned in advance, and the thread further includes a visual AI machine learning model (LSTM). For the actual tuning process, refer to the relevant steps in the embodiments of the tuning method for the risk behavior big data identification thread disclosed in the present invention; it is not described further here.

For example, to improve the precision of the user operation behavior expectation descriptions, neighbor user operation behavior expectation descriptions and non-neighbor user operation behavior expectation descriptions can be obtained from the security threat tag pairing index and the user operation behavior expectation descriptions. A neighbor user operation behavior expectation description is obtained by using the pairing index to classify user operation behavior expectation descriptions as neighbors, and a non-neighbor user operation behavior expectation description is obtained by using the pairing index to classify them as non-neighbors. Once both have been obtained, expectation description optimization can be performed using the neighbor and non-neighbor descriptions to obtain the optimized user operation behavior expectation descriptions. In a specific implementation, the neighbor and non-neighbor user operation behavior expectation descriptions can be combined, and the combined description can then be adjusted by a related algorithm (a nonlinear transformation) to obtain the optimized user operation behavior expectation description.
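As a non-limiting illustration of this step, the following minimal sketch forms, for each description, a pairing-index-weighted neighbor description and a complementary non-neighbor description, concatenates the two, and applies a nonlinear transformation. The weighting scheme, the choice of `tanh`, the `weights` matrix and the function name `optimize_descriptions` are all assumptions made for illustration; the embodiment only specifies combination followed by a nonlinear transformation.

```python
import math


def optimize_descriptions(features, index, weights):
    """Sketch of one optimization pass.

    features: list of n description vectors of length d
    index:    n x n matrix of pairing indices in [0, 1]
    weights:  matrix with rows of length 2*d (assumed transform parameters)
    """
    n, d = len(features), len(features[0])
    optimized = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Normalizers; fall back to 1.0 when a side has no weight at all.
        nb_total = sum(index[i][j] for j in others) or 1.0
        non_total = sum(1.0 - index[i][j] for j in others) or 1.0
        neighbor = [
            sum(index[i][j] * features[j][k] for j in others) / nb_total
            for k in range(d)
        ]
        non_neighbor = [
            sum((1.0 - index[i][j]) * features[j][k] for j in others) / non_total
            for k in range(d)
        ]
        combined = neighbor + non_neighbor  # concatenation, length 2*d
        optimized.append(
            [math.tanh(sum(w * x for w, x in zip(row, combined))) for row in weights]
        )
    return optimized


features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
index = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
# Assumed weights: neighbor part minus non-neighbor part, per dimension.
weights = [[1, 0, -1, 0], [0, 1, 0, -1]]
optimized = optimize_descriptions(features, index, weights)
```

With these illustrative weights, descriptions that share a high pairing index are mapped to the same output, while the description paired with them at index 0 is mapped elsewhere, which mirrors the stated goal of small discrimination within a tag and large discrimination across tags.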

Step 13: Obtain, from the optimized user operation behavior expectation descriptions, the security threat tag positioning result of the risk behavior big data awaiting security threat tag positioning.

In this embodiment of the present application, the security threat tag positioning result is intended to reflect the data information security threat tag corresponding to the risk behavior big data awaiting security threat tag positioning.

For example, after the optimized user operation behavior expectation descriptions are obtained, an AI-based classification and identification operation can be performed on them to obtain a classification and identification result. The result includes the first tag positioning confidence (which can be understood as a probability) that the risk behavior big data awaiting security threat tag positioning points to each of at least one a priori security threat tag, so that the security threat tag positioning result can be obtained from the first tag positioning confidence. In a specific implementation, an a priori security threat tag (a reference category) can be understood as the data information security threat tag corresponding to the risk behavior big data used to assist security threat tag positioning.

For example, suppose the several risk behavior big data contain auxiliary_big_data1 and auxiliary_big_data2, both used to assist security threat tag positioning, and target_big_data1, which awaits security threat tag positioning. If the data information security threat tag of auxiliary_big_data1 is "information leakage event" and that of auxiliary_big_data2 is "information tampering event", then the at least one a priori security threat tag includes "information leakage event" and "information tampering event".

Alternatively, suppose the several risk behavior big data contain auxiliary_big_data11, auxiliary_big_data12, auxiliary_big_data13 and auxiliary_big_data14, all used to assist security threat tag positioning, and target_big_data1, which awaits security threat tag positioning. If the data information security threat tags of auxiliary_big_data11, auxiliary_big_data12, auxiliary_big_data13 and auxiliary_big_data14 are "illegal crawling of private information", "digital asset theft", "DDOS attack" and "network lag attack" respectively, then the at least one a priori security threat tag includes "illegal crawling of private information", "digital asset theft", "DDOS attack" and "network lag attack". Other application scenarios can be implemented along similar lines and are not described further here.

For example, to improve identification efficiency, a risk behavior big data identification thread can be tuned in advance, and the thread includes a naive Bayes classification model. For the actual tuning process, refer to the relevant description in the embodiments of the tuning method for the risk behavior big data identification thread of the present invention; it is not described further here. On this basis, the naive Bayes classification model can be applied to the optimized user operation behavior expectation descriptions to identify the first tag positioning confidence that the risk behavior big data awaiting security threat tag positioning points to each of at least one a priori security threat tag.

For example, the above classification and identification result containing the first tag positioning confidences can be used directly as the security threat tag positioning result of the risk behavior big data awaiting security threat tag positioning. For example, in practice, the first tag positioning confidences that the risk behavior big data awaiting positioning points to "real-time information leakage event", "delayed information tampering event", "real-time information leakage event" and "delayed information leakage event" respectively can be used as its security threat tag positioning result. Other application scenarios can be implemented along similar lines and are not described further here.

For example, the data information security threat tag of the risk behavior big data awaiting security threat tag positioning can also be determined from the first tag positioning confidences that it points to each of at least one a priori security threat tag, and the tag so determined can be used as the security threat tag positioning result. In a specific implementation, the a priori security threat tag corresponding to the highest first tag positioning confidence can be taken as the data information security threat tag of the risk behavior big data awaiting positioning. For example, if in practice the first tag positioning confidences that the risk behavior big data awaiting positioning points to "real-time information leakage event", "delayed information tampering event", "real-time information leakage event" and "delayed information leakage event" are identified as 0.05, 0.35, 0.05 and 0.05 respectively, then "delayed information tampering event" can be taken as its data information security threat tag. Other application scenarios can be implemented along similar lines and are not described further here.
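As a non-limiting illustration of selecting the tag with the highest first tag positioning confidence, the following minimal sketch applies a maximum search to the values given in the example above. The function name `position_tag` and the list-of-pairs representation are assumptions introduced only for this illustration.

```python
def position_tag(classification_result):
    """Return the a priori tag with the highest first tag positioning confidence."""
    tag, _confidence = max(classification_result, key=lambda item: item[1])
    return tag


# (tag, first tag positioning confidence) pairs from the example above.
classification_result = [
    ("real-time information leakage event", 0.05),
    ("delayed information tampering event", 0.35),
    ("real-time information leakage event", 0.05),
    ("delayed information leakage event", 0.05),
]
positioned_tag = position_tag(classification_result)
print(positioned_tag)
```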

For example, performing the AI-based classification and identification operation on the optimized user operation behavior expectation descriptions yields a classification and identification result that contains both the first tag positioning confidence that the risk behavior big data awaiting security threat tag positioning points to each of at least one a priori security threat tag and the second tag positioning confidence that the risk behavior big data used to assist security threat tag positioning points to each of at least one a priori security threat tag. When the cumulative count of AI-based classification and identification operations performed meets the specified requirement, the security threat tag pairing indices of the several risk behavior big data can be optimized using the classification and identification result, and step 12 and the subsequent operations are performed again, that is, the user operation behavior expectation descriptions are optimized through the pairing index and the AI-based classification and identification operation is performed on the optimized descriptions, until the cumulative count no longer meets the specified requirement.

With this design, as long as the cumulative count of AI-based classification and identification operations meets the specified requirement, the first tag positioning confidences and the second tag positioning confidences are used to optimize the security threat tag pairing indices of the risk behavior big data 2-tuples, which improves the anti-interference performance of the security threat tag differentiation degree. The optimized security threat tag differentiation degree is in turn used to optimize the user operation behavior expectation descriptions, which improves their anti-interference performance as well. The security threat tag differentiation degree and the user operation behavior expectation descriptions thus complement each other, which helps further improve the accuracy and credibility of data information security threat tag positioning.

For example, the specified requirement may be that the cumulative count of AI-based classification and identification operations performed is smaller than a set judgment value. The set judgment value is at least 1.
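As a non-limiting illustration of the iteration described above, the following minimal sketch repeats steps 12 and 13 while the cumulative count of classification operations stays below the set judgment value. The callables `optimize`, `classify` and `update_index` are hypothetical placeholders for the operations described in the text, and re-optimizing from the original descriptions on each pass is an assumption, since the text does not state which descriptions are reused.

```python
def identify(descriptions, index, optimize, classify, update_index,
             set_judgment_value=3):
    """Iterate optimization and classification until the count limit is reached.

    Returns the last classification result and the number of
    classification operations performed.
    """
    count = 0
    while True:
        optimized = optimize(descriptions, index)   # step 12
        result = classify(optimized)                # AI-based classification
        count += 1
        if count < set_judgment_value:
            # Specified requirement met: refine the pairing index and repeat.
            index = update_index(index, result)
        else:
            # Requirement no longer met: use this result for tag positioning.
            return result, count


# Toy stand-ins (assumptions) that make the control flow observable.
toy_result, toy_count = identify(
    descriptions=10,
    index=0,
    optimize=lambda d, i: d + i,
    classify=lambda o: o,
    update_index=lambda i, r: i + 1,
    set_judgment_value=3,
)
```

With a set judgment value of 1, the loop performs a single classification and returns without updating the pairing index.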

For example, when the cumulative count of AI-based classification and identification operations performed does not meet the specified requirement, the security threat tag positioning result of the risk behavior big data awaiting security threat tag positioning can be obtained from the first tag positioning confidence, as described above; this is not repeated here. In addition, for the specific process of optimizing the security threat tag pairing index using the classification and identification result, refer to the content described in the following embodiments; it is not described further here.

With this design, the user operation behavior expectation descriptions of several items of risk behavior big data and the security threat tag pairing index of at least one risk behavior big data 2-tuple are determined. The several items of risk behavior big data include risk behavior big data used to assist security threat tag location and risk behavior big data whose security threat tag is to be located; every two items form one risk behavior big data 2-tuple; and the security threat tag pairing index is a quantitative evaluation of whether a 2-tuple points to the same data information security threat tag. The user operation behavior expectation descriptions are optimized through the pairing index, and the security threat tag location result of the risk behavior big data to be located is obtained from the optimized descriptions. Optimizing the descriptions through the pairing index keeps the descriptions of risk behavior big data with the same data information security threat tag in a state of low feature discrimination, and keeps the descriptions of risk behavior big data with different tags, as far as possible, in a state of high feature discrimination. This helps ensure the anti-interference of the user operation behavior expectation description and helps obtain an overall positioning description of it, which in turn helps ensure the accuracy and credibility of data information security threat tag location.
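As a purely illustrative, non-limiting sketch of the effect described above, the following Python code blends each expectation description with the others in proportion to the pairing index, so that items likely to share a tag move closer together. The function name `refine_descriptions`, the representation of descriptions as numeric vectors, and the weighted-average rule are assumptions made for illustration; this section does not prescribe a concrete update rule.

```python
from typing import List


def refine_descriptions(descriptions: List[List[float]],
                        pairing: List[List[float]]) -> List[List[float]]:
    """Blend each description with the others, weighted by the pairing index.

    descriptions[i] is a hypothetical numeric vector for item i.
    pairing[i][j] is the (assumed) pairing index in [0, 1] for the 2-tuple (i, j).
    """
    n = len(descriptions)
    dim = len(descriptions[0])
    refined = []
    for i in range(n):
        # An item is always fully paired with itself (weight 1.0).
        weights = [1.0 if j == i else pairing[i][j] for j in range(n)]
        total = sum(weights)
        refined.append([
            sum(weights[j] * descriptions[j][d] for j in range(n)) / total
            for d in range(dim)
        ])
    return refined
```

With a pairing index of 1 between two items and 0 towards a third, the first two descriptions collapse to their mean while the third is left unchanged, which is the low-discrimination/high-discrimination behavior the paragraph describes.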

In another possible embodiment, the risk behavior identification method for big data information security of the present invention may further include the following steps:

Step 21: Determine the user operation behavior expectation descriptions of several items of risk behavior big data, and the security threat tag pairing index of at least one risk behavior big data 2-tuple.

In this embodiment of the present invention, the several items of risk behavior big data include risk behavior big data used to assist security threat tag location and risk behavior big data whose security threat tag is to be located. Every two items form one risk behavior big data 2-tuple, and the security threat tag pairing index is a quantitative evaluation of whether a 2-tuple points to the same data information security threat tag. See the description above for details, which are not repeated here.

Step 22: Optimize the user operation behavior expectation descriptions of the several items of risk behavior big data through the security threat tag pairing index. See the description above for details, which are not repeated here.

Step 23: Perform the AI-based classification and identification operation on the optimized user operation behavior expectation descriptions to obtain a classification and identification result.

In this embodiment of the present invention, the classification and identification result includes a first tag location confidence, with which the risk behavior big data to be located points to each of at least one a priori security threat tag, and a second tag location confidence, with which the risk behavior big data used to assist location points to each of the at least one a priori security threat tag. An a priori security threat tag is the data information security threat tag corresponding to the risk behavior big data used to assist security threat tag location. See the above embodiments for details, which are not repeated here.

In a specific implementation, the optimized user operation behavior expectation descriptions can be used to identify the identified security threat tags of both the risk behavior big data to be located and the risk behavior big data used to assist location, where each identified security threat tag is one of the at least one a priori security threat tag. Taking security threat tag location for payment data as an example, if the a priori security threat tags are "digital asset theft", "DDoS attack", and "network freeze attack", the identified security threat tag is one of these three. Other application scenarios can be handled along similar lines and are not described further here.

After the identified security threat tags are obtained, the security threat tag difference analysis result and the expectation description commonality index can be determined for each risk behavior big data 2-tuple, and a first binding score between the two can be obtained for that 2-tuple. The security threat tag difference analysis result indicates whether the identified security threat tags of the two items in the 2-tuple are the same, and the expectation description commonality index reflects the degree of difference between the user operation behavior expectation descriptions of the two items. In addition, based on the identified security threat tag and the a priori security threat tag of each item of risk behavior big data used to assist location, a second binding score between the identified security threat tag and the a priori security threat tag is obtained for that item. The classification and identification result can then be obtained from the first binding score and the second binding score.

In this way, the first binding score, determined for each 2-tuple between the security threat tag difference analysis result and the expectation description commonality index, reflects the accuracy of the data information security threat tag analysis at the level of any pair of risk behavior big data. The second binding score, determined for each assisting item between its identified security threat tag and its a priori security threat tag, reflects the accuracy of the analysis at the level of an individual item of risk behavior big data. Obtaining the classification and identification result from both the pair level and the individual level helps improve the accuracy of the classification and identification result.

For example, to improve identification accuracy, a naive Bayes classification model may be used to identify, from the optimized user operation behavior expectation descriptions, the identified security threat tag corresponding to each item of risk behavior big data.

For example, when the security threat tag difference analysis result is that the identified security threat tags are the same, the expectation description commonality index and the first binding score have a first set relationship: the larger the commonality index, the larger the first binding score, meaning that the difference analysis result and the commonality index agree more closely; conversely, the smaller the commonality index, the smaller the first binding score, meaning that the two do not agree. When the difference analysis result is that the identified security threat tags differ, the commonality index and the first binding score have a second set relationship: the larger the commonality index, the smaller the first binding score, meaning that the two do not agree; conversely, the smaller the commonality index, the larger the first binding score, meaning that the two agree more closely. This makes it easier, in the subsequent process of determining the classification and identification result, to obtain a quantitative evaluation of whether the data information security threat tags of a risk behavior big data 2-tuple are the same, which helps improve the accuracy of the classification and identification result.

For example, the second binding score of an item of risk behavior big data used to assist security threat tag location is larger when its identified security threat tag is the same as its a priori security threat tag than when the two differ. This design makes it easier, in the subsequent process of determining the classification and identification result, to obtain the accuracy of the user operation behavior expectation description of an individual item of risk behavior big data, which helps improve the accuracy of the classification and identification result.
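The monotonic relationships stated for the two binding scores can be sketched, by way of non-limiting example, as follows. The function names, the assumption that the commonality index is normalized to [0, 1], the linear form of the first score, and the default values 1.0 and 0.0 for the second score are all illustrative choices; the text above only fixes the direction of each relationship.

```python
def first_binding_score(same_identified_tag: bool, commonality: float) -> float:
    """First binding score for one 2-tuple.

    Assumes `commonality` is normalized to [0, 1]. The score rises with the
    commonality index when the identified tags are the same (first set
    relationship) and falls with it when they differ (second set relationship).
    """
    if not 0.0 <= commonality <= 1.0:
        raise ValueError("commonality is assumed to lie in [0, 1]")
    return commonality if same_identified_tag else 1.0 - commonality


def second_binding_score(identified_tag: str, prior_tag: str,
                         match_score: float = 1.0,
                         mismatch_score: float = 0.0) -> float:
    """Second binding score for one assisting item.

    Returns a larger value when the identified tag equals the a priori tag.
    The two score values are hypothetical defaults.
    """
    return match_score if identified_tag == prior_tag else mismatch_score
```

Any other pair of functions with the same monotonic behavior (for example, a log-likelihood form) would satisfy the description equally well.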

Step 24: Determine whether the accumulated value of the AI-based classification and identification operation meets the specified requirement. If yes, perform step 25; if not, perform step 27.

In a specific implementation, the specified requirement may include: the accumulated value of the AI-based classification and identification operation is less than a set judgment value. The set judgment value is at least 1.

Step 25: Optimize the security threat tag pairing index through the classification and identification result.

In this embodiment of the present invention, the security threat tag pairing index may, for example, include a target tag location confidence with which each risk behavior big data 2-tuple points to the same data information security threat tag.

On this basis, each of the several items of risk behavior big data can be taken in turn as the current risk behavior big data, and each 2-tuple containing the current item as a current risk behavior big data 2-tuple. In round 1 of the AI-based classification and identification operation, an a priori tag location confidence, with which each current 2-tuple points to the same data information security threat tag, can be determined in turn from the first tag location confidence and the second tag location confidence. In addition, a global calculation result over the target tag location confidences of all current 2-tuples of the current item can be determined and used as the global quantitative indicator of the current item. After the a priori tag location confidence and the global quantitative indicator are obtained, the target tag location confidence of each current 2-tuple can be updated using the global quantitative indicator and the a priori tag location confidence. In a specific implementation, the target tag location confidence of a 2-tuple can be used as a statistical value; global processing (for example, weight-based averaging) is applied with this value to the a priori tag location confidence of the 2-tuple obtained in the previous round of the AI-based classification and identification operation; and the target tag location confidence is then optimized using the global processing result and the a priori tag location confidence, giving the optimized target tag location confidence for round 1.
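One possible reading of this update, offered only as a hedged sketch, is shown below. It assumes that each item's tag location confidences form a probability distribution over the a priori tags, that the a priori confidence of a 2-tuple sharing a tag is the inner product of the two distributions, that the global quantitative indicator is the sum of an item's current pairing values, and that the update is a weighted average with a hypothetical `weight` parameter. None of these formulas is stated in the text; only the quantities involved are.

```python
from typing import List


def same_tag_confidence(conf_a: List[float], conf_b: List[float]) -> float:
    """A priori confidence that two items share a tag (assumed inner product)."""
    return sum(a * b for a, b in zip(conf_a, conf_b))


def update_pairing_index(index: List[List[float]],
                         confidences: List[List[float]],
                         weight: float = 0.5) -> List[List[float]]:
    """Return an updated copy of the pairing index; `index` is not modified.

    confidences[i] holds the first or second tag location confidences of item i.
    """
    n = len(index)
    updated = [row[:] for row in index]
    for i in range(n):  # item i plays the role of the current item
        others = [j for j in range(n) if j != i]
        # Global quantitative indicator: sum over all 2-tuples containing i.
        global_indicator = sum(index[i][j] for j in others)
        for j in others:
            prior = same_tag_confidence(confidences[i], confidences[j])
            share = index[i][j] / global_indicator if global_indicator > 0 else 0.0
            # Weight-based averaging of the new prior and the normalized old value.
            updated[i][j] = weight * prior + (1.0 - weight) * share
    return updated
```

Under these assumptions, a 2-tuple whose items are confidently assigned to the same a priori tag ends up with a higher pairing value than a 2-tuple whose items are assigned to different tags.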

Step 26: Perform step 22 again.

After the optimized security threat tag pairing index is obtained, step 22 and the subsequent operations can be performed again, so that the user operation behavior expectation descriptions of the several items of risk behavior big data are optimized through the optimized pairing index.

In this way, the user operation behavior expectation description and the security threat tag pairing index complement each other and jointly improve each other's anti-interference, so that after multiple rounds of repeated processing a more comprehensive and accurate overall positioning description can be obtained, which helps improve the accuracy and credibility of data information security threat tag location.

Step 27: Obtain the security threat tag location result based on the first tag location confidence.

For example, when the security threat tag location result includes the data information security threat tag of the risk behavior big data to be located, the a priori security threat tag corresponding to the largest first tag location confidence can be taken as the data information security threat tag of that risk behavior big data.
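The control flow of steps 22 to 27 can be sketched, under stated assumptions, as the loop below. The three callables `refine`, `classify` and `update_index` are hypothetical placeholders for steps 22, 23 and 25, the parameter `judgment_value` stands for the set judgment value, and a single item to be located is assumed so that the first tag location confidence is one list over the a priori tags.

```python
from typing import Callable, List, Tuple


def locate_security_threat_tag(descriptions, pairing, prior_tags: List[str],
                               refine: Callable, classify: Callable,
                               update_index: Callable,
                               judgment_value: int = 3) -> Tuple[str, int]:
    """Iterate steps 22-26, then apply step 27.

    Returns the located tag and the number of classification rounds performed.
    """
    if judgment_value < 1:
        raise ValueError("the set judgment value is at least 1")
    rounds = 0  # accumulated value of the classification operation
    while True:
        descriptions = refine(descriptions, pairing)         # step 22
        first_conf, second_conf = classify(descriptions)     # step 23
        rounds += 1
        if rounds < judgment_value:                          # step 24
            # Step 25; looping back to the top is step 26.
            pairing = update_index(pairing, first_conf, second_conf)
            continue
        # Step 27: a priori tag with the largest first tag location confidence.
        best = max(range(len(prior_tags)), key=lambda k: first_conf[k])
        return prior_tags[best], rounds
```

With `judgment_value=1` the loop classifies once and goes straight to step 27, which matches the lower bound stated for the set judgment value.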

Different from the technical solution described above, the classification and identification result here is configured to also include the second tag location confidence, with which the risk behavior big data used to assist security threat tag location points to each of the at least one a priori security threat tag. Before the security threat tag location result is obtained based on the first tag location confidence, the security threat tag pairing index is optimized through the classification and identification result while the accumulated value of the AI-based classification and identification operation meets the specified requirement, and the step of optimizing the user operation behavior expectation descriptions through the pairing index is performed again. Once the accumulated value no longer meets the specified requirement, the security threat tag location result is obtained based on the first tag location confidence.

In this way, while the accumulated value of the AI-based classification and identification operation meets the specified requirement, the security threat tag pairing index is optimized using the first tag location confidence, with which the risk behavior big data to be located points to each of the at least one a priori security threat tag, and the second tag location confidence, with which the risk behavior big data used to assist location points to each of the at least one a priori security threat tag. This improves the anti-interference of the security threat tag differentiation degree, and the optimized differentiation degree is in turn used to keep optimizing the user operation behavior expectation description, which improves its anti-interference as well, so that the two complement each other. Once the accumulated value no longer meets the specified requirement, the security threat tag location result is obtained based on the first tag location confidence, which helps improve the accuracy and credibility of data information security threat tag location.

In another possible embodiment, risk behavior big data identification may, for example, be performed by a risk behavior big data identification thread that includes at least one (for example, V) identification units combined in sequence, each identification unit including a first identification subunit (for example, a CNN) and a second identification subunit (for example, a ResNet). In that case, this embodiment of the present invention may include the following steps.

Step 31: Determine the user operation behavior expectation descriptions of several items of risk behavior big data, and the security threat tag pairing index of at least one risk behavior big data 2-tuple.

In this embodiment of the present invention, the several items of risk behavior big data include risk behavior big data used to assist security threat tag location and risk behavior big data whose security threat tag is to be located. Every two items form one risk behavior big data 2-tuple, and the security threat tag pairing index is a quantitative evaluation of whether a 2-tuple points to the same data information security threat tag. See the above embodiments for details, which are not repeated here.

Step 32: Based on the first identification subunit of the l-th identification unit (l being the index of the current identification unit among the V units), optimize the user operation behavior expectation descriptions of the several items of risk behavior big data through the security threat tag pairing index.

Step 33: Based on the second identification subunit of the l-th identification unit (l being the index of the current identification unit among the V units), perform the AI-based classification and identification operation on the optimized user operation behavior expectation descriptions to obtain a classification and identification result.

In this embodiment of the present invention, the classification and identification result includes a first tag location confidence, with which the risk behavior big data to be located points to each of at least one a priori security threat tag, and a second tag location confidence, with which the risk behavior big data used to assist location points to each of the at least one a priori security threat tag.

Step 34: Determine whether the identification unit performing the AI-based classification and identification operation is the last identification unit of the risk behavior big data identification thread. If not, go to step 35; if so, go to step 37.

In a specific implementation, when the risk behavior big data identification thread includes V identification units, it can be determined whether l, the index of the current identification unit, is less than V. If so, some identification units have not yet performed the above steps of optimizing the user operation behavior expectation descriptions and determining the classification and identification result, and step 35 is performed so that the subsequent identification units continue to optimize the descriptions and determine the classification and identification result. If not, all identification units of the thread have performed these steps, and step 37 is performed to obtain the security threat tag location result based on the first tag location confidence in the classification and identification result.

Step 35: Optimize the security threat tag pairing index through the classification and identification result, and increment the index l of the current identification unit by one.

Step 36: Perform step 32 and the subsequent operations again.

Step 37: Obtain the security threat tag location result based on the first tag location confidence. See the above embodiments for details, which are not repeated here.
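Steps 32 to 37 can be sketched, as a non-limiting illustration, by passing the data through the identification units in order. Each unit is modeled here as a pair of hypothetical callables (first subunit, second subunit) rather than as actual CNN or ResNet models, `update_index` is a placeholder for step 35, and a single item to be located is assumed.

```python
from typing import Callable, List, Sequence, Tuple


def run_identification_units(descriptions, pairing, prior_tags: List[str],
                             units: Sequence[Tuple[Callable, Callable]],
                             update_index: Callable) -> str:
    """Run V sequentially combined identification units and return the located tag."""
    total = len(units)  # V
    if total < 1:
        raise ValueError("at least one identification unit is required")
    first_conf: List[float] = []
    for l, (first_subunit, second_subunit) in enumerate(units, start=1):
        descriptions = first_subunit(descriptions, pairing)      # step 32
        first_conf, second_conf = second_subunit(descriptions)   # step 33
        if l < total:                                            # step 34
            # Step 35; moving on to the next unit is step 36.
            pairing = update_index(pairing, first_conf, second_conf)
    # Step 37: a priori tag with the largest first tag location confidence
    # produced by the last identification unit.
    best = max(range(len(prior_tags)), key=lambda k: first_conf[k])
    return prior_tags[best]
```

In this sketch the pairing index is updated V - 1 times, once after every unit except the last, and only the last unit's first tag location confidence decides the located tag.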

Different from the technical solution described above, when the identification unit performing the AI-based classification and identification operation is not the last identification unit, the security threat tag pairing index is optimized through the classification and identification result, and the step of optimizing the user operation behavior expectation descriptions of the several items of risk behavior big data through the pairing index is performed again by the next identification unit. This improves the anti-interference of the security threat tag differentiation degree, and the optimized differentiation degree is in turn used to keep optimizing the user operation behavior expectation description, which improves its anti-interference as well. The security threat tag differentiation degree and the user operation behavior expectation description thus complement each other, which helps further improve the accuracy and credibility of data information security threat tag location.

In some optional embodiments, after the security threat tag location result of the risk behavior big data to be located is obtained, the method may further include: determining, from the data information security threat tag corresponding to the risk behavior big data to be located, a potential risk description corresponding to that risk behavior big data; and generating a corresponding big data protection strategy according to the potential risk description.

For example, the potential risk description may be a possible risk situation derived from the data information security threat tag corresponding to the risk behavior big data to be located. On this basis, a corresponding big data protection strategy can be formulated in advance from the potential risk description, achieving targeted and forward-looking risk protection.

In some optional embodiments, determining, from the data information security threat tag corresponding to the risk behavior big data to be located, the potential risk description corresponding to that risk behavior big data may include: deriving, from the data information security threat tag, a pending risk description set to be screened; performing individual intrusion analysis and group intrusion analysis in turn on the multiple risk description vectors in the pending risk description set to obtain an individual intrusion analysis information set and a group intrusion analysis information set; performing first error correction processing on the individual intrusion analysis information set according to a first specified error processing instruction to obtain a first risk description subset that includes individual intrusion behavior; performing second error correction processing on the group intrusion analysis information set according to a second specified error processing instruction to obtain a second risk description subset that includes group intrusion behavior; performing a weighting operation based on the first risk description subset and the second risk description subset to obtain a target risk description set in the pending risk description set that matches a specified behavior, the specified behavior including at least one of individual intrusion behavior and group intrusion behavior; and screening the potential risk description from the pending risk description set by means of the target risk description set.

For example, the correlation between the target risk description set and each risk description vector in the pending risk description set can be calculated and summed, and the risk description vector with the highest sum selected as the potential risk description. This takes different types of intrusion behavior into account and thus helps ensure the accuracy and reliability of the potential risk description.
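The selection just described can be sketched, by way of example only, as follows. Cosine similarity is used here as a stand-in for the unspecified correlation measure, and the function names and the representation of risk descriptions as plain numeric vectors are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_potential_risk(pending: List[List[float]],
                          target: List[List[float]]) -> Tuple[int, float]:
    """Return (index, summed score) of the pending vector that best matches the target set."""
    # For each pending vector, sum its similarity to every target vector.
    scores = [sum(cosine(vec, t) for t in target) for vec in pending]
    best = max(range(len(pending)), key=lambda i: scores[i])
    return best, scores[best]
```

For instance, with pending vectors `[1, 0]`, `[0, 1]`, `[1, 1]` and a target set `[[1, 0], [1, 0.1]]`, the first pending vector has the highest summed similarity and is selected.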

Based on the same inventive concept, FIG. 2 shows a block diagram of the modules of a risk behavior identification apparatus for big data information security provided by an embodiment of the present invention. The apparatus may include the following modules, which implement the relevant method steps shown in FIG. 1.

The index determination module 21 is configured to determine the user operation behavior expectation descriptions of a plurality of risk behavior big data and the security threat tag pairing index of at least one risk behavior big data binary group.

The data optimization module 22 is configured to optimize the user operation behavior expectation descriptions of the plurality of risk behavior big data through the security threat tag pairing index.

The tag positioning module 23 is configured to obtain, through the optimized user operation behavior expectation descriptions, the security threat tag positioning condition of the risk behavior big data to be subjected to security threat tag positioning.
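The way the three modules hand data to one another might be wired as in the sketch below. Everything here is hypothetical: the function `run_identification`, the type aliases, and the three `demo_*` stand-ins are illustrative names, and the stand-ins contain placeholder logic rather than the behavior of modules 21 to 23.

```python
from typing import Callable, Dict, List, Tuple

Description = List[float]                 # one user operation behavior expectation description
PairIndex = Dict[Tuple[int, int], float]  # pairing index keyed by binary group (i, j), i < j


def run_identification(
    data: List[float],
    determine: Callable[[List[float]], Tuple[List[Description], PairIndex]],
    optimize: Callable[[List[Description], PairIndex], List[Description]],
    locate: Callable[[List[Description]], Dict[int, str]],
) -> Dict[int, str]:
    """Chain the three stages in the order given for modules 21, 22 and 23."""
    descriptions, pairing_index = determine(data)      # role of module 21
    optimized = optimize(descriptions, pairing_index)  # role of module 22
    return locate(optimized)                           # role of module 23


# Placeholder stages, only to show the data flow.
def demo_determine(data):
    descriptions = [[x] for x in data]
    index = {(i, j): 0.5 for i in range(len(data)) for j in range(i + 1, len(data))}
    return descriptions, index


def demo_optimize(descriptions, pairing_index):
    return descriptions  # identity placeholder


def demo_locate(descriptions):
    return {i: "tag_a" if d[0] < 0.5 else "tag_b" for i, d in enumerate(descriptions)}


result = run_identification([0.1, 0.9], demo_determine, demo_optimize, demo_locate)
print(result)
```

Passing the stages in as callables keeps the sketch neutral about how each module is realized.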

Embodiments of the present invention can achieve the following technical effects. By optimizing the user operation behavior expectation descriptions through the security threat tag pairing index, the user operation behavior expectation descriptions corresponding to risk behavior big data with the same data information security threat tag are brought into a state of low feature discrimination, while the user operation behavior expectation descriptions corresponding to risk behavior big data with different data information security threat tags are kept, as far as possible, in a state of high feature discrimination. This helps ensure the anti-interference performance of the user operation behavior expectation descriptions and helps obtain an overall positioning description (such as the distribution) of those descriptions, which in turn helps ensure the accuracy and reliability of data information security threat tag positioning.
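The effect described above, same-tag descriptions moving closer while different-tag descriptions stay apart, can be illustrated with a deliberately simple sketch. It is not the optimization model of the invention: a pairing-index-weighted averaging step is swapped in purely for illustration. The values 1.0 (same tag), 0.0 (different tags) and 0.5 (at least one item still untagged) are assumptions, as are the function names and the one-dimensional sample descriptions.

```python
def initial_pairing_index(labels, unknown_value=0.5):
    """Assumed initial index: 1.0 for the same tag, 0.0 for different tags,
    and an in-between value when either item has no tag yet (None)."""
    index = {}
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if labels[i] is None or labels[j] is None:
                index[(i, j)] = unknown_value
            elif labels[i] == labels[j]:
                index[(i, j)] = 1.0
            else:
                index[(i, j)] = 0.0
    return index


def optimize_descriptions(descriptions, index, step=0.5):
    """Move each description toward the pairing-index-weighted mean of the others."""
    n = len(descriptions)
    updated = []
    for i in range(n):
        weights = [0.0 if j == i else index[(min(i, j), max(i, j))] for j in range(n)]
        total = sum(weights)
        if total == 0.0:
            updated.append(list(descriptions[i]))  # no related items: leave unchanged
            continue
        dim = len(descriptions[i])
        mean = [
            sum(w * descriptions[j][d] for j, w in enumerate(weights)) / total
            for d in range(dim)
        ]
        updated.append([(1 - step) * x + step * m for x, m in zip(descriptions[i], mean)])
    return updated


# Four tagged items (two per tag) and one item still to be tagged.
labels = ["a", "a", "b", "b", None]
descriptions = [[0.0], [1.0], [5.0], [6.0], [1.5]]

pairing = initial_pairing_index(labels)
optimized = optimize_descriptions(descriptions, pairing)
print(optimized)
```

After one step, the two "a" descriptions and the two "b" descriptions each lie closer together than before, while the gap between the two tag groups remains much larger than the gap within either group.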

The above are merely specific embodiments of the present application. Any variation or substitution that a person skilled in the art could readily conceive of on the basis of the specific embodiments provided herein shall fall within the protection scope of the present application.

Claims

1. A risk behavior identification method for big data information security, applied to a risk behavior identification system, the method comprising at least the following steps:

determining user operation behavior expectation descriptions of a plurality of risk behavior big data and a security threat tag pairing index of at least one risk behavior big data binary group; wherein the plurality of risk behavior big data include risk behavior big data used for assisting in security threat tag positioning and risk behavior big data to be subjected to security threat tag positioning, every two risk behavior big data in the plurality of risk behavior big data form one risk behavior big data binary group, and the security threat tag pairing index is a quantitative evaluation of whether the risk behavior big data binary group points to the same data information security threat tag;

optimizing the user operation behavior expectation descriptions of the plurality of risk behavior big data through the security threat tag pairing index; and obtaining, through the optimized user operation behavior expectation descriptions, the security threat tag positioning condition of the risk behavior big data to be subjected to security threat tag positioning; wherein the security threat tag positioning condition reflects the data information security threat tag corresponding to the risk behavior big data to be subjected to security threat tag positioning.

2. The method according to claim 1, wherein obtaining, through the optimized user operation behavior expectation description, the security threat tag positioning condition of the risk behavior big data to be subjected to security threat tag positioning comprises:

performing an AI-based classification recognition operation through the optimized user operation behavior expectation description to obtain a classification recognition result, wherein the classification recognition result includes a first tag positioning confidence that the risk behavior big data to be subjected to security threat tag positioning points to at least one prior security threat tag, the prior security threat tag being a data information security threat tag corresponding to the risk behavior big data used for assisting in security threat tag positioning;

and obtaining the security threat tag positioning condition in combination with the first tag positioning confidence.

3. The method of claim 2, wherein the classification recognition result further comprises a second tag positioning confidence that the risk behavior big data used for assisting in security threat tag positioning points to the at least one prior security threat tag;

before obtaining the security threat tag positioning condition in combination with the first tag positioning confidence, the method further comprises: on the basis that the accumulated value of performing the AI-based classification recognition operation meets a specified requirement, optimizing the security threat tag pairing index through the classification recognition result, and again optimizing the user operation behavior expectation descriptions of the plurality of risk behavior big data through the security threat tag pairing index, wherein the specified requirement comprises: the accumulated value of performing the AI-based classification recognition operation being less than a set determination value; and wherein optimizing the user operation behavior expectation descriptions of the plurality of risk behavior big data through the security threat tag pairing index is implemented by a visual AI machine learning model;

obtaining the security threat tag positioning condition in combination with the first tag positioning confidence comprises: on the basis that the accumulated value of performing the AI-based classification recognition operation does not meet the specified requirement, obtaining the security threat tag positioning condition in combination with the first tag positioning confidence.

4. The method of claim 3, wherein the security threat tag pairing index comprises: a target tag positioning confidence that each risk behavior big data binary group points to the same data information security threat tag; and optimizing the security threat tag pairing index through the classification recognition result comprises:

sequentially taking each risk behavior big data in the plurality of risk behavior big data as current risk behavior big data, and taking each risk behavior big data binary group containing the current risk behavior big data as a current risk behavior big data binary group;

determining a global calculation result of the target tag positioning confidences of all current risk behavior big data binary groups of the current risk behavior big data as a global quantitative index of the current risk behavior big data;

sequentially determining, according to the first tag positioning confidence and the second tag positioning confidence, a prior tag positioning confidence that each current risk behavior big data binary group points to the same data information security threat tag;

and changing the target tag positioning confidence of each current risk behavior big data binary group through the global quantitative index and the prior tag positioning confidence.

5. The method according to claim 4, wherein performing the AI-based classification recognition operation through the optimized user operation behavior expectation description to obtain the classification recognition result comprises:

identifying, through the optimized user operation behavior expectation description, an identified security threat tag corresponding to the risk behavior big data, wherein the identified security threat tag points to at least one prior security threat tag;

for each risk behavior big data binary group, determining a security threat tag difference analysis condition and an expectation description commonality index of the risk behavior big data binary group, and obtaining a first binding score between the expectation description commonality index of the risk behavior big data binary group and the corresponding security threat tag difference analysis condition; wherein the security threat tag difference analysis condition reflects whether the identified security threat tags corresponding to the risk behavior big data binary group are consistent, and the expectation description commonality index reflects the degree of difference between the user operation behavior expectation descriptions of the risk behavior big data binary group;

obtaining, in combination with the identified security threat tag and the prior security threat tag that correspond to the risk behavior big data used for assisting in security threat tag positioning, a second binding score between that identified security threat tag and that prior security threat tag;

and obtaining the classification recognition result through the first binding score and the second binding score, which comprises: obtaining the classification recognition result through the first binding score and the second binding score based on a directed transfer algorithm.

6. The method of claim 5, wherein, on the basis that the security threat tag difference analysis condition indicates that the identified security threat tags are consistent, the expectation description commonality index has a first predetermined relationship with the first binding score; on the basis that the security threat tag difference analysis condition indicates that the identified security threat tags are inconsistent, the expectation description commonality index has a second predetermined relationship with the first binding score; and the second binding score is higher when the identified security threat tag is consistent with the prior security threat tag than when the identified security threat tag is inconsistent with the prior security threat tag.

7. The method of claim 5, wherein identifying, through the optimized user operation behavior expectation description, the identified security threat tag corresponding to the risk behavior big data comprises: identifying, based on a naive Bayesian classification model, the identified security threat tag corresponding to the risk behavior big data through the optimized user operation behavior expectation description.

8. The method according to claim 1, wherein optimizing the user operation behavior expectation descriptions of the plurality of risk behavior big data through the security threat tag pairing index comprises:

obtaining a neighbor user operation behavior expectation description and a non-neighbor user operation behavior expectation description through the security threat tag pairing index and the user operation behavior expectation description;

and performing expectation description optimization through the neighbor user operation behavior expectation description and the non-neighbor user operation behavior expectation description to obtain the optimized user operation behavior expectation description.

9. The method of claim 1, further comprising:

determining an original security threat tag pairing index of the risk behavior big data binary group as a first quantitative constraint on the basis that the risk behavior big data binary group points to the same data information security threat tag;

determining the original security threat tag pairing index of the risk behavior big data binary group as a second quantitative constraint on the basis that the risk behavior big data binary group points to different data information security threat tags;

and determining the original security threat tag pairing index of the risk behavior big data binary group as a set quantitative value between the second quantitative constraint and the first quantitative constraint on the basis that at least one risk behavior big data in the risk behavior big data binary group is risk behavior big data to be subjected to security threat tag positioning.

10. A risk behavior identification system, comprising: a memory and a processor, the memory being coupled to the processor; wherein the memory is configured to store computer program code comprising computer instructions, and the computer instructions, when executed by the processor, cause the risk behavior identification system to perform the method of any one of claims 1-9.