CN116522403B

CN116522403B - Interactive information desensitization method and server for focusing big data privacy security

Info

Publication number: CN116522403B
Application number: CN202310812630.8A
Authority: CN
Inventors: 田凯; 张惠元; 宋园园
Original assignee: Big Bear Big Data Technology Changshu Co ltd
Current assignee: Big Bear Big Data Technology Changshu Co ltd
Priority date: 2023-07-04
Filing date: 2023-07-04
Publication date: 2023-08-29
Anticipated expiration: 2043-07-04
Also published as: CN116522403A

Abstract

According to the interaction information desensitization method and server for focusing big data privacy security, in each privacy semantic interaction operation the feature interaction variable is determined flexibly and in real time from the loaded plurality of active text frequent description items, and the interaction operation is performed on those items using that variable. This improves the privacy semantic interaction quality of the target big data desensitization decision network and allows it to cope with the various digital service user activity texts to be subjected to interaction information desensitization processing. The precision and pertinence of the network's information desensitization decision analysis are thereby improved, so that the information desensitization decision tag can accurately and effectively guide the privacy protection processing of the digital service user activity text. The technical problem that privacy protection processing of digital service user activity text is difficult to realize accurately in the conventional technology can therefore be solved.

Description

Interactive information desensitization method and server for focusing big data privacy security

Technical Field

The invention relates to the technical field of big data, in particular to an interaction information desensitizing method and a server for focusing big data privacy security.

Background

Data/information desensitization refers to transforming certain sensitive data through desensitization rules so that sensitive private data is reliably protected. This allows the desensitized real data set to be used safely in development, testing, other non-production environments, and outsourcing environments. In the big data age, the privacy security of data cannot be ignored, and the large amount of private data involved in the interaction process of various digital services (such as electronic commerce, government and enterprise services, and intelligent offices) needs to be protected. For this reason, privacy protection and information desensitization processing for digital business records are of paramount importance.

Disclosure of Invention

The invention at least provides an interaction information desensitizing method and a server for focusing big data privacy security.

The invention provides an interaction information desensitizing method for focusing big data privacy security, which is applied to an artificial intelligent server and comprises the following steps:

invoking a target big data desensitization decision network that has completed debugging, decomposing a digital service user activity text to be subjected to interaction information desensitization processing into a plurality of user activity text sets, and mining the active text frequent description items of each of the plurality of user activity text sets;

invoking the target big data desensitization decision network, and performing privacy semantic interaction operations by using a repeated feedback rule according to the plurality of active text frequent description items to obtain a plurality of corresponding target privacy semantic interaction frequent items; wherein each privacy semantic interaction operation comprises: determining a feature interaction variable according to the plurality of active text frequent description items loaded this time, outputting a plurality of privacy semantic interaction frequent items to be processed according to the feature interaction variable and the plurality of active text frequent description items, and obtaining a plurality of derived active text frequent description items according to the plurality of privacy semantic interaction frequent items to be processed;

and calling the target big data desensitization decision network, and carrying out information desensitization decision analysis on the digital service user activity text according to the plurality of target privacy semantic interaction frequent items to obtain an information desensitization decision tag of the digital service user activity text.

In some examples, the determining the feature interaction variable according to the plurality of active text frequent description items includes:

disassembling the plurality of active text frequent description items into at least one active text frequent description item set, wherein different active text frequent description item sets contain the same number of active text frequent description items;

and respectively implementing the following processing for the at least one active text frequent description item set:

generating an interactive frequent item list according to an active text frequent description item set;

and taking the interaction frequent item list corresponding to each of the at least one active text frequent description item set as the characteristic interaction variable.

In some examples, each active text frequent description item corresponds to a linear array comprising a plurality of characterization dimensions, and the generating the interactive frequent item list according to one active text frequent description item set includes:

performing feature reduction operation on each active text frequent description item in one active text frequent description item set respectively to obtain a plurality of corresponding first reduced linear arrays;

and combining the first reduced linear arrays into a first spliced linear array, and changing the first spliced linear array into the interactive frequent item list.

for each active text frequent description item in the one active text frequent description item set, the following processing is implemented respectively:

Disassembling a plurality of characterization dimensions of one active text frequent description item into a plurality of description intervals, and obtaining interval frequent items corresponding to the description intervals; wherein each description interval corresponds to at least one characterization dimension, and the same number of characterization dimensions exist in different description intervals;

for a plurality of interval frequent items corresponding to each of the frequent description items of the active text, generating a corresponding local interaction frequent item list according to each interval frequent item corresponding to the same description interval, and obtaining a local interaction frequent item list corresponding to each of a plurality of description intervals;

and taking the obtained multiple local interaction frequent item lists as the interaction frequent item list.
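The interval-based disassembly described above can be sketched in Python as follows. This is only an illustration under assumptions: each active text frequent description item is modeled as a plain list of numbers, and the helper names `split_into_intervals` and `group_by_interval` are hypothetical and do not appear in the source.

```python
def split_into_intervals(array, num_intervals):
    """Split one linear array into equal-width description intervals."""
    if num_intervals < 1 or len(array) % num_intervals != 0:
        raise ValueError("dimensions must divide evenly into intervals")
    width = len(array) // num_intervals
    return [array[i * width:(i + 1) * width] for i in range(num_intervals)]


def group_by_interval(items, num_intervals):
    """Collect, for each description interval, the interval frequent items of every item."""
    per_item = [split_into_intervals(item, num_intervals) for item in items]
    return [list(group) for group in zip(*per_item)]


# Two hypothetical description items with four characterization dimensions each
items = [[1, 2, 3, 4], [5, 6, 7, 8]]
grouped = group_by_interval(items, num_intervals=2)
```

Each entry of `grouped` holds the interval frequent items that share one description interval, which is the grouping from which a local interaction frequent item list would then be generated.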

In some examples, the generating a corresponding local interaction frequent item list according to each interval frequent item corresponding to the same description interval, to obtain local interaction frequent item lists corresponding to a plurality of description intervals, includes:

for a plurality of description intervals, the following processes are respectively implemented:

performing characteristic reduction operation on each interval frequent item corresponding to one description interval to obtain a plurality of corresponding second reduced linear arrays;

and combining the plurality of second reduced linear arrays into a second spliced linear array, and changing the second spliced linear array into the local interaction frequent item list.

In some examples, the outputting the plurality of privacy semantic interaction frequent items to be processed according to the feature interaction variable and the plurality of activity text frequent description items includes:

combining the feature interaction variable and the plurality of active text frequent description items to obtain a plurality of alternative privacy semantic interaction frequent items;

according to the plurality of alternative privacy semantic interaction frequent items, the following processing is implemented by using a repeated feedback rule:

generating a derived feature interaction variable according to the plurality of alternative privacy semantic interaction frequent items loaded at the time, and obtaining a plurality of derived alternative privacy semantic interaction frequent items through the derived feature interaction variable and the plurality of alternative privacy semantic interaction frequent items;

and taking the plurality of alternative privacy semantic interaction frequent items output in the last round as the plurality of privacy semantic interaction frequent items to be processed.

In some examples, the combining the feature interaction variable and the plurality of active text frequent description items to obtain a plurality of alternative privacy semantic interaction frequent items includes:

obtaining a plurality of corresponding privacy semantic relation frequent items through the at least one active text frequent description item set and the respective interaction frequent item list;

and outputting the plurality of alternative privacy semantic interaction frequent items through a semantic integration processing layer according to the plurality of privacy semantic relation frequent items.

In some examples, the interaction frequent item list includes a plurality of local interaction frequent item lists; each active text frequent description item in the active text frequent description item set includes interval frequent items respectively corresponding to a plurality of description intervals, and each description interval corresponds to one local interaction frequent item list;

the obtaining a plurality of privacy semantic relation frequent items through the at least one active text frequent description item set and the respective interaction frequent item list includes: respectively implementing the following processing for the at least one active text frequent description item set: among the plurality of interval frequent items corresponding to each active text frequent description item in one active text frequent description item set, multiplying each interval frequent item corresponding to the same description interval by the corresponding local interaction frequent item list, to obtain a plurality of description interval privacy semantic interaction frequent items corresponding to each of the plurality of description intervals; among those description interval privacy semantic interaction frequent items, aggregating the ones corresponding to the same active text frequent description item, to obtain a plurality of active text aggregation frequent item sets corresponding to the one active text frequent description item set; and obtaining the plurality of privacy semantic relation frequent items through the plurality of active text aggregation frequent item sets respectively corresponding to the at least one active text frequent description item set.

In some examples, the breaking down the number of active text frequent description items into at least one active text frequent description item set includes:

the plurality of movable text frequent description items are disassembled into a plurality of transverse movable text frequent description item sets according to rows, and are disassembled into a plurality of longitudinal movable text frequent description item sets according to columns, so that at least one movable text frequent description item set is obtained;

the plurality of active text frequent description items are adjusted to be in a grid list form, and the first constraint value is consistent with the second constraint value;

the obtaining the plurality of privacy semantic relation frequent items through the plurality of active text aggregation frequent item sets corresponding to the at least one active text frequent description item set respectively includes:

acquiring a transverse active text aggregation frequent item set through a plurality of active text aggregation frequent item sets corresponding to the plurality of transverse active text frequent description item sets respectively, and acquiring a longitudinal active text aggregation frequent item set through a plurality of active text aggregation frequent item sets corresponding to the plurality of longitudinal active text frequent description item sets respectively;

and obtaining the privacy semantic relation frequent items according to the horizontal active text aggregation frequent item set and the vertical active text aggregation frequent item set.
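The row-wise and column-wise disassembly of the grid list can be illustrated with a short Python sketch. The grid contents and the function name `split_rows_and_columns` are hypothetical placeholders; the source does not specify how the grid is stored.

```python
def split_rows_and_columns(grid):
    """Return the transverse (row) sets and longitudinal (column) sets of a grid."""
    row_sets = [list(row) for row in grid]
    column_sets = [list(column) for column in zip(*grid)]
    return row_sets, column_sets


# A hypothetical 2 x 2 grid of active text frequent description items
grid = [["f00", "f01"],
        ["f10", "f11"]]
row_sets, column_sets = split_rows_and_columns(grid)
```

With a square grid, every transverse set and every longitudinal set contains the same number of items, which matches the requirement that different sets hold the same number of active text frequent description items.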

In some examples, the obtaining the derived plurality of frequent description items of the activity text according to the plurality of frequent items of the privacy semantic interaction to be processed includes:

and carrying out feature reversible processing on the plurality of privacy semantic interaction frequent items to be processed to obtain the plurality of derived activity text frequent description items.

In some examples, the invoking the target big data desensitization decision network performs information desensitization decision analysis on the digital service user activity text according to the plurality of target privacy semantic interaction frequent items to obtain an information desensitization decision tag of the digital service user activity text, including:

invoking the target big data desensitization decision network, and performing feature optimization on the plurality of target privacy semantic interaction frequent items to obtain privacy semantic optimization frequent items;

and calling the target big data desensitization decision network, and carrying out information desensitization decision analysis on the digital service user activity text according to the privacy semantic optimization frequent item to obtain an information desensitization decision tag of the digital service user activity text.

The invention also provides an artificial intelligence server, which comprises a processor and a memory; the processor is in communication with the memory, and the processor is configured to read and execute a computer program from the memory to implement the method described above.

The present invention also provides a computer readable storage medium having stored thereon a computer program which, when run, implements the method described above.

The technical scheme provided by the embodiment of the invention can comprise the following beneficial effects: when a target big data desensitization decision network is called to carry out information desensitization decision analysis on the digital service user activity text to be subjected to interactive information desensitization processing, after the frequent description items of the activity texts of a plurality of user activity text sets of the digital service user activity text to be subjected to interactive information desensitization processing are mined, repeated feedback rules are utilized to carry out multiple privacy semantic interaction operations on the plurality of frequent description items of the activity texts so as to obtain a plurality of corresponding target privacy semantic interaction frequent items, and finally, information desensitization decision analysis is carried out on the digital service user activity text to be subjected to interactive information desensitization processing according to the plurality of target privacy semantic interaction frequent items so as to obtain an information desensitization decision tag of the digital service user activity text to be subjected to interactive information desensitization processing. 
In each privacy semantic interaction operation, the feature interaction variable is determined flexibly and in real time from the loaded plurality of active text frequent description items, and the interaction operation is performed on those items using that variable. This improves the privacy semantic interaction quality of the target big data desensitization decision network and allows it to cope with the various digital service user activity texts to be subjected to interaction information desensitization processing. The precision and pertinence of the network's information desensitization decision analysis are thereby improved, so that the information desensitization decision tag can accurately and effectively guide the privacy protection processing of the digital service user activity text. The technical problem that privacy protection processing of digital service user activity text is difficult to realize accurately in the conventional technology can therefore be solved.

For a description of the effects of the artificial intelligence server, computer readable storage medium described above, see the description of the method described above.

In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. The drawings are incorporated in and form a part of the description; they show embodiments according to the present invention and, together with the description, serve to illustrate its technical solutions. It is to be understood that the following drawings illustrate only certain embodiments of the invention and are therefore not to be considered limiting of its scope; a person of ordinary skill in the art may derive other related drawings from them without inventive effort.

FIG. 1 is a block diagram of an artificial intelligence server, according to an embodiment of the invention.

Fig. 2 is a flow chart of an interactive information desensitizing method for focusing big data privacy security according to an embodiment of the invention.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention.

Fig. 1 is a schematic diagram of an artificial intelligence server 10 according to an embodiment of the present invention, which includes a processor 102, a memory 104, and a bus 106. The memory 104 stores execution instructions and includes an internal memory and an external memory. The internal memory temporarily stores operation data of the processor 102 and data exchanged with the external memory, such as a hard disk, and the processor 102 exchanges data with the external memory through the internal memory. When the artificial intelligence server 10 operates, the processor 102 and the memory 104 communicate through the bus 106, so that the processor 102 executes the interaction information desensitization method for focusing big data privacy security in the embodiment of the present invention.

Referring to Fig. 2, which is a flowchart of the interaction information desensitization method for focusing big data privacy security according to an embodiment of the present invention, the method is applied to an artificial intelligence server and may include steps 201-203.

Step 201, invoking a target big data desensitization decision network for completing debugging, decomposing the digital service user activity text to be subjected to the interaction information desensitization processing into a plurality of user activity text sets, and mining the frequent description items of the activity text of each of the plurality of user activity text sets.

In the embodiment of the invention, the target big data desensitization decision network can be built on a residual model; for example, the residual model is an RNN or another type of machine learning model. The target big data desensitization decision network can be obtained by debugging on a digital service user activity learning text set, in which each digital service user activity learning text (digital service user activity text sample) is annotated with a true desensitization decision label. The network cost index used in the debugging process measures the desensitization decision offset; for example, cross-entropy loss can be adopted.
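As a small worked illustration of the cross-entropy cost mentioned above, the following Python sketch computes the loss for a single annotated sample. The confidence values and the function name `cross_entropy` are made up for illustration; the source does not give the network's actual outputs or training code.

```python
import math


def cross_entropy(confidences, true_index):
    """Cross-entropy for one sample: -log of the confidence given to the annotated label."""
    eps = 1e-12  # guards against log(0)
    return -math.log(max(confidences[true_index], eps))


# Hypothetical confidences over three desensitization decision labels
predicted = [0.7, 0.2, 0.1]
loss_if_label_0 = cross_entropy(predicted, 0)  # annotated label matches the top prediction
loss_if_label_2 = cross_entropy(predicted, 2)  # annotated label received low confidence
```

The loss is small when the network assigns high confidence to the annotated label and grows as that confidence falls, which is why it can serve as a measure of the desensitization decision offset.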

In the embodiment of the invention, the target big data desensitization decision network can decompose the digital service user activity text to be subjected to the interaction information desensitization processing into a plurality of user activity text sets. For example, the user activity text sets can be divided based on text granularity: the higher the text granularity of the digital service user activity text, the more user activity text sets are obtained by disassembly, and the lower the text granularity, the fewer user activity text sets are obtained. Further, the target big data desensitization decision network may mine the active text frequent description items of each user activity text set. The active text frequent description items of a user activity text set characterize the text features of that set and can further characterize a series of business activities of the user (such as page browsing activities, information searching activities, and conversation chat activities).
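The relation between granularity and the number of user activity text sets can be sketched as follows. The source does not state how the split is performed, so this example simply assumes the activity text is a list of records that is cut into a requested number of contiguous, near-equal sets; `split_activity_text` and the sample records are hypothetical.

```python
def split_activity_text(records, num_sets):
    """Split a list of activity records into num_sets contiguous, near-equal sets."""
    num_sets = max(1, min(num_sets, len(records)))
    base, extra = divmod(len(records), num_sets)
    sets, start = [], 0
    for i in range(num_sets):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first sets
        sets.append(records[start:start + size])
        start += size
    return sets


# Hypothetical activity records of one user
records = ["browse page A", "search item B", "chat with agent",
           "browse page C", "search item D"]
coarse = split_activity_text(records, num_sets=2)  # lower granularity, fewer sets
fine = split_activity_text(records, num_sets=4)    # higher granularity, more sets
```

Every record lands in exactly one set, so the sets together still cover the whole activity text.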

Step 202, a target big data desensitization decision network is called, privacy semantic interaction operation is carried out according to a plurality of active text frequent description items by using a repeated feedback rule, and a plurality of corresponding target privacy semantic interaction frequent items are obtained.

Each privacy semantic interaction operation comprises: determining a feature interaction variable according to the loaded plurality of active text frequent description items, outputting a plurality of privacy semantic interaction frequent items to be processed according to the feature interaction variable and the plurality of active text frequent description items, and obtaining a plurality of derived active text frequent description items according to the plurality of privacy semantic interaction frequent items to be processed.

For step 202, the target big data desensitization decision network may perform multiple privacy semantic interaction operations on the loaded plurality of active text frequent description items based on a repeated feedback rule (such as iteration). In each privacy semantic interaction operation, a feature interaction variable (a parameter that guides feature mixing) is determined according to the loaded plurality of active text frequent description items, and an interaction operation (such as feature crossing and feature mixing) is performed on those items according to the feature interaction variable to output a plurality of privacy semantic interaction frequent items to be processed. A plurality of derived active text frequent description items (which can be understood as newly generated active text frequent description items) are then obtained from the plurality of privacy semantic interaction frequent items to be processed and used as the input of the next privacy semantic interaction operation. These steps are repeated until the plurality of target privacy semantic interaction frequent items output by the last privacy semantic interaction operation are obtained. A target privacy semantic interaction frequent item can be understood as a text semantic feature obtained after feature interaction processing, and can reflect a series of activity privacy features of the user in the digital service interaction process.
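The repeated feedback rule of step 202 can be sketched as a plain loop. This is a minimal illustration, not the network itself: the three callables are hypothetical stand-ins for network components, and returning the interaction output of the last round is one reading of "output by the last privacy semantic interaction operation".

```python
def run_interactions(items, rounds, determine_variable, interact, derive):
    """Repeat the privacy semantic interaction operation and return the last round's output."""
    if rounds < 1:
        raise ValueError("at least one interaction operation is required")
    pending = None
    for _ in range(rounds):
        variable = determine_variable(items)  # recomputed from the current input every round
        pending = interact(variable, items)   # interaction frequent items to be processed
        items = derive(pending)               # derived items become the next round's input
    return pending


# Toy stand-ins (assumptions chosen only so the loop can run)
def mean_variable(xs):
    return sum(xs) / len(xs)


def scale(variable, xs):
    return [variable * x for x in xs]


def shift(xs):
    return [x + 1 for x in xs]


result = run_interactions([1.0, 2.0], 2, mean_variable, scale, shift)
```

Because `determine_variable` is called inside the loop, the variable used in the second round depends on the items derived in the first round rather than being fixed in advance.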

In each privacy semantic interaction operation, the loaded plurality of active text frequent description items can be processed to obtain the feature interaction variable. The interaction operation performed on the plurality of active text frequent description items according to the feature interaction variable can include scene feature interaction and feature description interaction. Scene feature interaction refers to the interaction of scene details among the plurality of active text frequent description items. Feature description interaction may be performed as follows: the plurality of active text frequent description items are input into a semantic integrated processing layer (such as a fully connected layer); because each active text frequent description item can include a plurality of description dimensions and each description dimension corresponds to one feature channel, the semantic integrated processing layer can perform interaction operations on the details of different feature channels.

For example, when scene feature interaction is performed on the plurality of active text frequent description items according to the feature interaction variable, the target big data desensitization decision network may multiply the feature interaction variable by the plurality of active text frequent description items to obtain a plurality of privacy semantic relation frequent items. If the feature interaction variable is denoted as Q and the plurality of active text frequent description items being processed are denoted as F1, then scene feature interaction according to Q and F1 yields F2, where F2 = Q × F1.
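The multiplication F2 = Q × F1 can be made concrete with a small numeric example. The shapes are an assumption made for illustration: F1 is treated as a matrix with one row per active text frequent description item, and Q as a square mixing matrix over those items; the values are made up.

```python
def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x d), both given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical feature interaction variable: row i says how items are mixed into output i
Q = [[0.5, 0.5],
     [0.0, 1.0]]
# Two hypothetical description items with three characterization dimensions each
F1 = [[1.0, 2.0, 3.0],
      [3.0, 4.0, 5.0]]
F2 = matmul(Q, F1)
```

Here the first row of F2 is the average of the two items and the second row is the second item unchanged, showing how Q controls which items interact with which.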

Furthermore, the plurality of privacy semantic relation frequent items can be input into the semantic integrated processing layer for feature description interaction to obtain a plurality of privacy semantic interaction frequent items to be processed. Here, an addition operation can first be performed on the plurality of privacy semantic relation frequent items and the plurality of initial active text frequent description items; that is, each privacy semantic relation frequent item is added to its corresponding active text frequent description item, and the sum is then input into the semantic integrated processing layer.
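The addition followed by the semantic integrated processing layer can be sketched as below, assuming the layer is a fully connected layer applied to each item independently. The weights, bias, and function names are hypothetical; a trained network would learn the weights rather than fix them by hand.

```python
def add_elementwise(relation_items, description_items):
    """Add each privacy semantic relation item to its corresponding description item."""
    return [[r + f for r, f in zip(r_row, f_row)]
            for r_row, f_row in zip(relation_items, description_items)]


def fully_connected(items, weights, bias):
    """Apply one linear map over the feature channels of every item.

    weights[j] holds the input weights of output channel j.
    """
    return [[sum(x * w for x, w in zip(item, w_row)) + b
             for w_row, b in zip(weights, bias)]
            for item in items]


F1 = [[1.0, 2.0], [3.0, 4.0]]   # initial description items (made-up values)
F2 = [[0.5, 0.5], [1.0, 1.0]]   # relation items after scene feature interaction
summed = add_elementwise(F2, F1)
weights = [[1.0, 0.0],          # output channel 0 copies input channel 0
           [1.0, 1.0]]          # output channel 1 mixes both input channels
pending = fully_connected(summed, weights, bias=[0.0, 0.0])
```

The second output channel combines both input channels, which is the kind of cross-channel interaction attributed to the semantic integrated processing layer.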

In the embodiment of the invention, in each privacy semantic interaction operation, the feature interaction variable is determined flexibly and in real time from the loaded plurality of active text frequent description items, and those items are interacted according to that variable. The feature interaction variable is therefore better adapted to the plurality of active text frequent description items, and scene feature interaction among them is performed more effectively.

In each privacy semantic interaction operation, the interaction processing on the loaded plurality of active text frequent description items (determining the feature interaction variable according to those items and performing the interaction operation on them according to that variable) may be executed once, or may be executed cyclically multiple times. In the cyclic case, each round of interaction processing generates a derived feature interaction variable from the plurality of alternative (candidate) privacy semantic interaction frequent items obtained in the previous round and performs the interaction operation on those items according to the derived feature interaction variable, until the plurality of privacy semantic interaction frequent items to be processed are output by the last round of interaction processing.

For example, when obtaining a plurality of derived frequent description items of the active text according to a plurality of frequent items of the privacy semantic interaction to be processed, the target big data desensitization decision network may perform feature reversible processing (such as convolution operation) on the plurality of frequent items of the privacy semantic interaction to be processed to obtain a plurality of derived frequent description items of the active text.

Step 203, invoking the target big data desensitization decision network, and performing information desensitization decision analysis on the digital service user activity text to be subjected to interaction information desensitization processing according to the plurality of target privacy semantic interaction frequent items, to obtain an information desensitization decision tag of that digital service user activity text.

For step 203, the target big data desensitization decision network may obtain a text feature from the plurality of target privacy semantic interaction frequent items and perform information desensitization decision analysis on the digital service user activity text according to that text feature. The analysis yields the confidence level of the digital service user activity text for each preset desensitization decision view (candidate information desensitization decision tag), and the desensitization decision view with the highest confidence level is taken as the information desensitization decision tag. The preset desensitization decision views are the views that the target big data desensitization decision network can output; for example, they may include views corresponding to different desensitization processing strategies.

In some example embodiments, step 203 may include step 2031 and step 2032.

Step 2031, a target big data desensitization decision network is called, and feature optimization is carried out on a plurality of target privacy semantic interaction frequent items to obtain privacy semantic optimization frequent items.

In the embodiment of the invention, the target big data desensitization decision network can perform feature optimization on the plurality of target privacy semantic interaction frequent items through an average pooling component. For example, if the plurality of target privacy semantic interaction frequent items are u-dimensional text features, feature optimization yields one privacy semantic optimization frequent item (which can be understood as a privacy semantic pooling feature) that is also a u-dimensional text feature.
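Average pooling over several u-dimensional features can be sketched as follows. The example assumes the pooling averages across items dimension by dimension, which keeps the result u-dimensional as the text states; the values and the name `average_pool` are illustrative only.

```python
def average_pool(items):
    """Average several u-dimensional features into a single u-dimensional feature."""
    count = len(items)
    return [sum(column) / count for column in zip(*items)]


# Two hypothetical target privacy semantic interaction frequent items with u = 3
targets = [[1.0, 2.0, 3.0],
           [3.0, 4.0, 5.0]]
pooled = average_pool(targets)
```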

Step 2032, invoking a target big data desensitization decision network, and performing information desensitization decision analysis on the digital service user activity text to be subjected to interactive information desensitization processing according to the privacy semantic optimization frequent item to obtain an information desensitization decision tag of the digital service user activity text to be subjected to interactive information desensitization processing.

For step 2032, the target big data desensitization decision network may input the privacy semantic optimization frequent item into a semantic integrated processing layer, which outputs a text feature whose dimension equals the number of desensitization decision views. The classification processing component then outputs the confidence level of each desensitization decision view, and the desensitization decision view with the highest confidence level is taken as the information desensitization decision tag. The classification processing component may transform each dimension of the text feature into a likelihood score between 0 and 1, such that the likelihood scores of all dimensions sum to 1.

For example, the privacy semantic optimization frequent item is a u-dimensional text feature, the number of desensitization decision views is 24, the text feature is input into a semantic integrated processing layer, the text feature with the dimension of 24 is output, the confidence level of each of the 24 desensitization decision views is output through a classification processing component, and the desensitization decision view with the largest confidence level is used as an information desensitization decision tag.
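Steps 2031 and 2032 can be illustrated with a minimal Python sketch. It assumes that the average pooling component takes the element-wise mean of the items, that the semantic integrated processing layer behaves like a fully connected layer, and that the classification processing component is a softmax, which is consistent with the stated property that the scores lie between 0 and 1 and sum to 1. The function names, the value u = 8 and the random weights are illustrative assumptions and are not taken from the disclosure.

```python
import math
import random
from typing import List, Tuple

NUM_VIEWS = 24  # number of desensitization decision views in the example


def average_pool(items: List[List[float]]) -> List[float]:
    """Step 2031 (assumed form): element-wise mean of several u-dim items."""
    u = len(items[0])
    if any(len(item) != u for item in items):
        raise ValueError("all items must share the same dimension u")
    return [sum(item[d] for item in items) / len(items) for d in range(u)]


def linear(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Fully connected layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decide(items, weights, bias) -> Tuple[int, List[float]]:
    """Step 2032 (assumed form): return (index of the best view, confidences)."""
    pooled = average_pool(items)
    confidences = softmax(linear(pooled, weights, bias))
    tag = max(range(len(confidences)), key=confidences.__getitem__)
    return tag, confidences


rng = random.Random(0)
U = 8  # hypothetical feature dimension u
items = [[rng.uniform(-1, 1) for _ in range(U)] for _ in range(5)]
weights = [[rng.uniform(-1, 1) for _ in range(U)] for _ in range(NUM_VIEWS)]
bias = [0.0] * NUM_VIEWS
tag, confidences = decide(items, weights, bias)
```

Here `tag` is the index of the desensitization decision view with the highest confidence, which would serve as the information desensitization decision tag.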

In the embodiment of the invention, in each privacy semantic interaction operation the feature interaction variable can be determined flexibly and in real time according to the loaded plurality of active text frequent description items, and the interaction operation on those items is performed using that variable. This improves the privacy semantic interaction quality of the target big data desensitization decision network and allows it to handle the various digital service user activity texts to be subjected to interactive information desensitization processing. The precision and pertinence of the network's information desensitization decision analysis are thereby improved, so that the information desensitization decision tag can accurately and effectively guide the privacy protection processing of the digital service user activity text.

For some possible design considerations, step 202 may include steps 2021-2023 in conjunction with the target big data desensitization decision network described above.

Step 2021, the following steps 2022-2023 are performed repeatedly under the repeated feedback rule, one branch model after another, until the plurality of target privacy semantic interaction frequent items output last are obtained.

Step 2022, determining, by at least one feature interaction unit of the first branch model, a feature interaction variable according to the loaded plurality of active text frequent description items, and outputting a plurality of to-be-processed privacy semantic interaction frequent items according to the feature interaction variable and the plurality of active text frequent description items.

Step 2023, inputting the plurality of to-be-processed privacy semantic interaction frequent items into the reversible unit of the first branch model, outputting a plurality of derived active text frequent description items, and taking the plurality of derived active text frequent description items as the input of the next branch model.
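The control flow of steps 2021-2023 can be sketched as a loop over branch models. This is only an illustration under stated assumptions: `BranchModel`, `run_branches`, `toy_interaction` and `toy_reversible` are hypothetical names, the toy interaction unit derives its variable (an element-wise mean) from whatever items it is given, and the toy reversible unit simply averages adjacent pairs of items. The disclosure does not specify these operations.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Items = List[List[float]]
Unit = Callable[[Items], Items]


@dataclass
class BranchModel:
    interaction_units: Sequence[Unit]  # at least one feature interaction unit
    reversible_unit: Unit              # stand-in for the reversible unit


def run_branches(branches: Sequence[BranchModel], items: Items) -> Items:
    """Apply steps 2022 and 2023 branch by branch (step 2021)."""
    for branch in branches:
        pending = items
        for unit in branch.interaction_units:    # step 2022
            pending = unit(pending)
        items = branch.reversible_unit(pending)  # step 2023: input of next branch
    return items


def toy_interaction(items: Items) -> Items:
    # The "variable" is determined from the loaded items themselves.
    u = len(items[0])
    mean = [sum(item[d] for item in items) / len(items) for d in range(u)]
    return [[v + m for v, m in zip(item, mean)] for item in items]


def toy_reversible(items: Items) -> Items:
    # Hypothetical stand-in: average adjacent pairs, halving the item count.
    return [
        [(a + b) / 2 for a, b in zip(items[i], items[i + 1])]
        for i in range(0, len(items) - 1, 2)
    ]


branches = [
    BranchModel([toy_interaction], toy_reversible),
    BranchModel([toy_interaction], toy_reversible),
]
result = run_branches(branches, [[1.0, 0.0], [3.0, 0.0], [5.0, 0.0], [7.0, 0.0]])
```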

In some exemplary embodiments, when a branch model includes one feature interaction unit, determining in step 2022, by the feature interaction unit of the branch model, the feature interaction variable according to the loaded plurality of active text frequent description items, and outputting the plurality of to-be-processed privacy semantic interaction frequent items according to the feature interaction variable and the plurality of active text frequent description items, may include the following.

S1, sequentially passing the loaded plurality of active text frequent description items through a first-level numerical mapping unit (normalization unit) and a scene feature interaction unit of the feature interaction unit, and performing scene feature interaction on the plurality of active text frequent description items to obtain a plurality of privacy semantic relation frequent items.

The first-level numerical mapping unit may perform numerical mapping transformation on the plurality of active text frequent description items. The scene feature interaction unit then determines the feature interaction variable according to the processed active text frequent description items, and uses the feature interaction variable to perform scene feature interaction on the processed active text frequent description items, obtaining the plurality of privacy semantic relation frequent items.

S2, sequentially inputting the plurality of privacy semantic relation frequent items into a second-level numerical mapping unit and a feature description interaction unit, and performing feature description interaction on the plurality of privacy semantic relation frequent items to obtain the plurality of to-be-processed privacy semantic interaction frequent items.

The second-level numerical mapping unit may perform numerical mapping transformation on the plurality of privacy semantic relation frequent items, and the feature description interaction unit then performs feature description interaction on the processed items to obtain the plurality of to-be-processed privacy semantic interaction frequent items. In addition, addition operations may be applied. The plurality of privacy semantic relation frequent items may be added to the loaded plurality of active text frequent description items, that is, each privacy semantic relation frequent item is added to its corresponding active text frequent description item, to obtain a plurality of processed privacy semantic relation frequent items. The processed privacy semantic relation frequent items are sequentially input into the second-level numerical mapping unit and the feature description interaction unit, and the resulting channel privacy semantic interaction frequent items are then added to the loaded privacy semantic relation frequent items, that is, each channel privacy semantic interaction frequent item is added to its corresponding privacy semantic relation frequent item, to obtain the plurality of to-be-processed privacy semantic interaction frequent items.
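The data flow of S1 and S2, including the two optional addition operations, can be sketched as follows. The sketch assumes that each numerical mapping unit is a per-item normalization to zero mean and unit variance, and it takes the scene feature interaction and feature description interaction as caller-supplied functions, since their internals are described separately. All function names are hypothetical.

```python
import math
from typing import Callable, List

Items = List[List[float]]
Interaction = Callable[[Items], Items]


def normalize(items: Items, eps: float = 1e-6) -> Items:
    """Numerical mapping unit, assumed here to be a per-item normalization."""
    out = []
    for item in items:
        mean = sum(item) / len(item)
        var = sum((v - mean) ** 2 for v in item) / len(item)
        out.append([(v - mean) / math.sqrt(var + eps) for v in item])
    return out


def add(a: Items, b: Items) -> Items:
    """Item-by-item, element-wise addition."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def feature_interaction_unit(
    items: Items,
    scene_interaction: Interaction,
    description_interaction: Interaction,
) -> Items:
    # S1: first-level numerical mapping, then scene feature interaction.
    relation = scene_interaction(normalize(items))
    # Optional addition with the loaded active text frequent description items.
    relation = add(relation, items)
    # S2: second-level numerical mapping, then feature description interaction.
    channel = description_interaction(normalize(relation))
    # Optional addition with the privacy semantic relation frequent items.
    return add(channel, relation)


def zero(xs: Items) -> Items:
    """Degenerate interaction that outputs zeros, used to show the additions."""
    return [[0.0] * len(x) for x in xs]


items = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
passthrough = feature_interaction_unit(items, zero, zero)
```

With interactions that output zeros, the two additions carry the input through unchanged, which is what the addition operations contribute to the unit.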

Under other design ideas, when a branch model includes a plurality of feature interaction units, determining in step 2022, by the plurality of feature interaction units of the branch model, the feature interaction variable according to the loaded plurality of active text frequent description items, and outputting the plurality of to-be-processed privacy semantic interaction frequent items according to the feature interaction variable and the plurality of active text frequent description items, may include the following steps.

1) Determining, by the first feature interaction unit, a feature interaction variable according to the loaded plurality of active text frequent description items, and combining the feature interaction variable with the plurality of active text frequent description items to obtain a plurality of alternative privacy semantic interaction frequent items.

The plurality of active text frequent description items may be processed to obtain the feature interaction variable. For example, an interactive frequent item list (interactive feature matrix) may be generated according to all of the active text frequent description items, or according to a portion of them.

2) Performing, by the remaining feature interaction units and under the repeated feedback rule, the following processing on the plurality of alternative privacy semantic interaction frequent items: generating, by one feature interaction unit, a derived feature interaction variable according to the loaded plurality of alternative privacy semantic interaction frequent items, obtaining a derived plurality of alternative privacy semantic interaction frequent items according to the derived feature interaction variable and the loaded items, and inputting the derived items into the next feature interaction unit.

The derived feature interaction variable and the derived plurality of alternative privacy semantic interaction frequent items may be obtained in the same manner as in step 1) above.

3) Taking the plurality of alternative privacy semantic interaction frequent items output by the last feature interaction unit as the plurality of to-be-processed privacy semantic interaction frequent items.

For some possible design ideas, in view of the fact that the information amount of the plurality of active text frequent description items may be relatively large, in order to improve the interaction efficiency and the privacy semantic interaction quality of the plurality of active text frequent description items, the plurality of active text frequent description items may be disassembled into a plurality of active text frequent description item sets, and then the plurality of active text frequent description item sets are respectively interacted.

In step 2022 above, determining the feature interaction variable according to the plurality of active text frequent description items may, for example, include the following.

1. Decomposing the plurality of active text frequent description items into at least one active text frequent description item set, wherein different active text frequent description item sets contain the same number of active text frequent description items.

The plurality of active text frequent description items may be arranged in a grid list form (such as a row-column form) in which the first constraint value (number of rows) and the second constraint value (number of columns) are equal. When the digital service user activity text to be subjected to interactive information desensitization processing is disassembled into a plurality of user activity text sets at a fine text granularity, the resulting user activity text sets are arranged in grid list form, so that after the active text frequent description item of each user activity text set is mined, active text frequent description items in grid list form are obtained. The grid can therefore be decomposed by rows and by columns. Illustratively, the active text frequent description items of each row are disassembled into one horizontal active text frequent description item set, and the active text frequent description items of each column are disassembled into one vertical active text frequent description item set, so as to obtain a plurality of active text frequent description item sets.

For example, the plurality of active text frequent description items are arranged as a 16×16 grid; the 16 active text frequent description items in each row are disassembled into one horizontal active text frequent description item set, and the 16 active text frequent description items in each column are disassembled into one vertical active text frequent description item set, so that 16 horizontal sets and 16 vertical sets are obtained, each containing 16 active text frequent description items.

2. For the at least one active text frequent description item set, performing the following processing on each set respectively: generating an interactive frequent item list according to the active text frequent description item set.

Optionally, each active text frequent description item corresponds to a linear array including a plurality of characterization dimensions, and generating the interactive frequent item list according to one active text frequent description item set in step 2 may include the following steps.

NODE01, performing a feature reduction operation on each active text frequent description item in the active text frequent description item set to obtain a plurality of corresponding first reduced linear arrays.

Here, the plurality of characterization dimensions of each active text frequent description item may be reduced to a smaller number of characterization dimensions, yielding the plurality of first reduced linear arrays.

NODE02, combining the plurality of first reduced linear arrays into a first spliced linear array, and transforming the first spliced linear array into an interactive frequent item list.

The plurality of first reduced linear arrays may be combined into one row to obtain the first spliced linear array; the first spliced linear array is then passed through a semantic comprehensive processing layer to generate a feature of a set size, from which the interactive frequent item list is determined.

NODE03, taking the interaction frequent item list corresponding to each of the at least one active text frequent description item set as the feature interaction variable.

Generating an interactive frequent item list according to each active text frequent description item set therefore improves interaction efficiency. It also lets the interactive frequent item list adapt to the differently distributed active text frequent description items of the digital service user activity texts to be subjected to interactive information desensitization processing, which improves the privacy semantic interaction quality of the active text frequent description items and the desensitization decision precision of the big data desensitization decision network.
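The decomposition into row and column sets and the generation of one interactive frequent item list per set can be sketched in Python. The sketch makes several assumptions that the text does not state: the feature reduction is a linear projection from 96 to 4 dimensions, the semantic comprehensive processing layer is a bias-free fully connected layer, the "set size" is 16×16 so that the list can later be multiplied with a set of 16 items, and the same randomly initialised weights are reused for every set for brevity. All names are hypothetical.

```python
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def split_rows_and_columns(grid: List[List[Vector]]) -> Tuple[List[List[Vector]], List[List[Vector]]]:
    """Step 1: one horizontal set per row and one vertical set per column."""
    row_sets = [list(row) for row in grid]
    col_sets = [list(col) for col in zip(*grid)]
    return row_sets, col_sets


def linear(x: Vector, weights: Matrix) -> Vector:
    """Bias-free fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def make_weights(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def generate_interaction_list(item_set: List[Vector], reduce_w: Matrix, generate_w: Matrix) -> Matrix:
    w = len(item_set)
    reduced = [linear(item, reduce_w) for item in item_set]  # NODE01: reduce
    spliced = [v for array in reduced for v in array]        # NODE02: splice into one row
    flat = linear(spliced, generate_w)                       # NODE02: feature of set size w*w
    return [flat[i * w:(i + 1) * w] for i in range(w)]       # reshape to w x w


rng = random.Random(0)
N, C, K = 16, 96, 4  # grid side, characterization dimensions, reduced dimensions (K assumed)
grid = [[[rng.random() for _ in range(C)] for _ in range(N)] for _ in range(N)]
row_sets, col_sets = split_rows_and_columns(grid)
reduce_w = make_weights(K, C, rng)
generate_w = make_weights(N * N, N * K, rng)
# NODE03: the lists of all sets together act as the feature interaction variable.
feature_interaction_variable = [
    generate_interaction_list(item_set, reduce_w, generate_w)
    for item_set in row_sets + col_sets
]
```

Because each list is computed from the items of its own set, different sets yield different lists even though the weights are shared, which is one way to read the "flexible" determination of the variable.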

Further, after obtaining the interaction frequent item list corresponding to each of the at least one active text frequent description item set, the target big data desensitization decision network may perform scene feature interaction on each active text frequent description item set through the scene feature interaction unit, and perform feature description interaction on the resulting scene feature interaction results through the feature description interaction unit.

Optionally, in step 1) above, combining the feature interaction variable with the plurality of active text frequent description items to obtain a plurality of alternative privacy semantic interaction frequent items may include the following steps.

Step 11), obtaining a plurality of corresponding privacy semantic relation frequent items according to at least one active text frequent description item set and the respective interaction frequent item list.

Each active text frequent description item set may be multiplied by its corresponding interaction frequent item list to obtain the privacy semantic relation frequent items of that set; the privacy semantic relation frequent items of all sets together form the plurality of privacy semantic relation frequent items.

Step 12), outputting a plurality of alternative privacy semantic interaction frequent items through a semantic comprehensive processing layer according to the plurality of privacy semantic relationship frequent items.

Feature description interaction may be performed on the plurality of privacy semantic relation frequent items through the semantic comprehensive processing layer to obtain the plurality of alternative privacy semantic interaction frequent items. Alternatively, each privacy semantic relation frequent item may first be added to its corresponding initially loaded active text frequent description item, after which feature description interaction is performed through the semantic comprehensive processing layer.

For some possible design ideas, each active text frequent description item corresponds to a linear array containing a plurality of characterization dimensions. To improve feature richness, each active text frequent description item may be disassembled into a plurality of description intervals (feature segmentation). The same description interval of different active text frequent description items is used jointly to generate a local interaction frequent item list for scene feature interaction. The variables used to generate the local interaction frequent item lists may be shared across the different description intervals, yet different local interaction frequent item lists are still generated because the loaded interval frequent items differ.

Based on this, generating the interactive frequent item list according to one active text frequent description item set in step 2 may further include the following steps.

STEP1, for each active text frequent description item in the active text frequent description item set, performing the following processing: disassembling the plurality of characterization dimensions of the active text frequent description item into a plurality of description intervals, and obtaining the interval frequent item corresponding to each description interval. Each description interval corresponds to at least one characterization dimension, and different description intervals contain the same number of characterization dimensions.

For example, take an active text frequent description item that is a 96-dimensional text feature. The 96 dimensions may be disassembled into 4 description intervals, that is, split into 4 segments of 96/4 = 24 dimensions each, and the interval frequent item (a 24-dimensional text feature) corresponding to each description interval is obtained.

STEP2, for the plurality of interval frequent items corresponding to each active text frequent description item, generating a corresponding local interaction frequent item list according to the interval frequent items that correspond to the same description interval, so as to obtain the local interaction frequent item list corresponding to each of the plurality of description intervals.

For each of the plurality of description intervals, the following processing may be performed.

STEP21, performing a feature reduction operation on each interval frequent item corresponding to one description interval to obtain a plurality of corresponding second reduced linear arrays.

Each interval frequent item includes a plurality of characterization dimensions; these are compressed to a smaller number of characterization dimensions to obtain the plurality of second reduced linear arrays.

STEP22, combining the plurality of second reduced linear arrays into a second spliced linear array, and transforming the second spliced linear array into a local interaction frequent item list.

The plurality of second reduced linear arrays may be combined into one row to obtain the second spliced linear array; the second spliced linear array is then passed through a semantic comprehensive processing layer to generate a feature of a set size, from which the local interaction frequent item list is determined.

STEP3, taking the obtained plurality of local interaction frequent item lists as the interaction frequent item list.

In the above steps, the same description interval of different active text frequent description items is used jointly to generate a local interaction frequent item list, which is then used to interact those active text frequent description items. The variables used in generating the local interaction frequent item lists may be shared across the different description intervals, but because the loaded interval frequent items differ, different local interaction frequent item lists are generated. The description intervals of each active text frequent description item can therefore be interacted using different local interaction frequent item lists, which improves feature richness.

For example, when generating an interactive frequent item list according to an active text frequent description item set, each active text frequent description item set includes 16 active text frequent description items, each active text frequent description item includes 96 characterization dimensions, each active text frequent description item in the active text frequent description item set is disassembled into 4 description intervals according to the 96 characterization dimensions, and each description interval corresponds to 24 characterization dimensions. After obtaining 4 interval frequent items of each of the 16 active text frequent description items, generating a local interaction frequent item list by using 16 interval frequent items of the same description interval of the 16 active text frequent description items, and obtaining local interaction frequent item lists corresponding to each of the 4 description intervals.
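STEP1 to STEP3 can be sketched for the 16-item, 96-dimension, 4-interval example. As assumptions for illustration only, the feature reduction is a linear projection from 24 to 4 dimensions, the semantic comprehensive processing layer is a bias-free fully connected layer, each local list has size 16×16, and one set of weights is shared by all description intervals, following the statement that the variables may be shared. Function and variable names are hypothetical.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(x: Vector, weights: Matrix) -> Vector:
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def make_weights(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def split_intervals(item: Vector, n_intervals: int) -> List[Vector]:
    """STEP1: cut one item into equally sized description intervals."""
    size, remainder = divmod(len(item), n_intervals)
    if remainder:
        raise ValueError("dimensions must divide evenly into intervals")
    return [item[i * size:(i + 1) * size] for i in range(n_intervals)]


def local_interaction_lists(
    item_set: List[Vector], n_intervals: int, reduce_w: Matrix, generate_w: Matrix
) -> List[Matrix]:
    w = len(item_set)
    segmented = [split_intervals(item, n_intervals) for item in item_set]
    lists = []
    for j in range(n_intervals):                             # STEP2, per interval
        interval_items = [segments[j] for segments in segmented]
        reduced = [linear(x, reduce_w) for x in interval_items]  # STEP21
        spliced = [v for array in reduced for v in array]        # STEP22: splice
        flat = linear(spliced, generate_w)                       # STEP22: set size
        lists.append([flat[i * w:(i + 1) * w] for i in range(w)])
    return lists                                             # STEP3


rng = random.Random(0)
W, C, N_INTERVALS, K = 16, 96, 4, 4  # K (reduced dimensions) is an assumed value
item_set = [[rng.random() for _ in range(C)] for _ in range(W)]
reduce_w = make_weights(K, C // N_INTERVALS, rng)   # shared across intervals
generate_w = make_weights(W * W, W * K, rng)        # shared across intervals
local_lists = local_interaction_lists(item_set, N_INTERVALS, reduce_w, generate_w)
```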

According to the above embodiment of the invention, the interaction frequent item list includes a plurality of local interaction frequent item lists, each active text frequent description item in each active text frequent description item set includes a plurality of description intervals, and each description interval corresponds to one local interaction frequent item list.

Under some example embodiments, the target big data desensitization decision network may, through the scene feature interaction unit, perform scene feature interaction on each active text frequent description item set using the multi-description-interval flexible interaction approach. Based on this, obtaining the plurality of privacy semantic relation frequent items according to the at least one active text frequent description item set and the respective interaction frequent item lists may include the following steps.

PROCESS1, the following PROCESS11-PROCESS12 are performed for at least one active text frequent description item set, respectively.

PROCESS11, for the plurality of interval frequent items corresponding to each active text frequent description item in one active text frequent description item set, multiplying the interval frequent items that correspond to the same description interval by the corresponding local interaction frequent item list, so as to obtain the description interval privacy semantic interaction frequent items corresponding to each of the plurality of description intervals.

For example, each active text frequent description item is a 96-dimensional text feature that is disassembled into 4 description intervals, each corresponding to a 24-dimensional text feature. The same description interval of different active text frequent description items is interacted jointly: for example, when the 1st description intervals of w active text frequent description items are interacted together, the local interaction frequent item list is generated jointly from the w 1st description intervals. The variables used at this stage may be shared with the variables used when the w 2nd description intervals jointly generate their local interaction frequent item list.

PROCESS12, according to the description interval privacy semantic interaction frequent items corresponding to each of the plurality of description intervals, aggregating the description interval privacy semantic interaction frequent items that correspond to the same active text frequent description item, so as to obtain a plurality of active text aggregation frequent item sets corresponding to the one active text frequent description item set.

PROCESS2, obtaining the plurality of privacy semantic relation frequent items according to the plurality of active text aggregation frequent item sets corresponding to each of the at least one active text frequent description item set.

According to the embodiment of the invention, when the plurality of active text frequent description items are disassembled into at least one active text frequent description item set, they may be disassembled by rows into a plurality of horizontal active text frequent description item sets and by columns into a plurality of vertical active text frequent description item sets.

In that case, obtaining the plurality of privacy semantic relation frequent items according to the plurality of active text aggregation frequent item sets corresponding to each of the at least one active text frequent description item set may include the following steps.

PROCESS21, obtaining a horizontal active text aggregation frequent item set (text row feature aggregation result) according to the plurality of active text aggregation frequent item sets corresponding to the plurality of horizontal active text frequent description item sets, and obtaining a vertical active text aggregation frequent item set (text column feature aggregation result) according to the plurality of active text aggregation frequent item sets corresponding to the plurality of vertical active text frequent description item sets.

PROCESS22, obtaining the plurality of privacy semantic relation frequent items according to the horizontal active text aggregation frequent item set and the vertical active text aggregation frequent item set.
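PROCESS11 and PROCESS12 can be sketched for a single active text frequent description item set. The sketch assumes that "multiplication" means the matrix product of a w×w local interaction frequent item list with the w stacked interval frequent items of one description interval, and that "aggregation" means concatenating the interacted intervals of the same item back into one vector. Both readings, and the names `matmul` and `interact_set`, are assumptions for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of an (m x k) and a (k x n) matrix."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def interact_set(item_set: List[Vector], local_lists: List[Matrix]) -> List[Vector]:
    n_intervals = len(local_lists)
    size = len(item_set[0]) // n_intervals
    per_interval = []
    for j, local_list in enumerate(local_lists):
        # PROCESS11: stack the j-th interval of every item (w x size) and
        # multiply by the local list of that interval (w x w).
        interval_items = [item[j * size:(j + 1) * size] for item in item_set]
        per_interval.append(matmul(local_list, interval_items))
    # PROCESS12: concatenate the interacted intervals belonging to the same item.
    return [
        [v for j in range(n_intervals) for v in per_interval[j][i]]
        for i in range(len(item_set))
    ]


# Tiny example: w = 2 items, 4 dimensions, 2 intervals of 2 dimensions each.
identity = [[1, 0], [0, 1]]
swap = [[0, 1], [1, 0]]
aggregated = interact_set([[1, 2, 3, 4], [5, 6, 7, 8]], [identity, swap])
```

In the tiny example the first interval is left as is and the second interval is exchanged between the two items, so the result is `[[1, 2, 7, 8], [5, 6, 3, 4]]`.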
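The text does not say how the horizontal and vertical aggregation results are combined in PROCESS22. One simple possibility, shown purely as an assumption, is element-wise addition of the two results for the same grid position. In the sketch, `row_results[r][c]` is the vector that the r-th horizontal set produced for position (r, c), and `col_results[c][r]` is the vector that the c-th vertical set produced for the same position; both layouts and the function name are hypothetical.

```python
from typing import List

Vector = List[float]


def combine_row_and_column(
    row_results: List[List[Vector]], col_results: List[List[Vector]]
) -> List[List[Vector]]:
    """Assumed PROCESS22: add row-wise and column-wise results per grid position."""
    n_rows, n_cols = len(row_results), len(col_results)
    return [
        [
            [a + b for a, b in zip(row_results[r][c], col_results[c][r])]
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# 2 x 2 grid with 1-dimensional vectors.
row_results = [[[1], [2]], [[3], [4]]]        # indexed [row][position in row]
col_results = [[[10], [30]], [[20], [40]]]    # indexed [column][position in column]
combined = combine_row_and_column(row_results, col_results)
```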

For example, for a digital service user activity text to be subjected to interactive information desensitization processing, after the text is disassembled into a plurality of user activity text sets at a fine text granularity, a plurality of active text frequent description items can be extracted. In some application scenarios with limited computing power, the multi-description-interval flexible interaction can be performed separately by rows and by columns.

The target big data desensitization decision network comprises a text processing component, 4 branch models and an average pooling component, wherein the first branch model is formed by cascading 4 feature interaction units and one reversible unit, the second branch model is formed by cascading 3 feature interaction units and one reversible unit, the third branch model is formed by cascading 4 feature interaction units and one reversible unit, and the fourth branch model is formed by cascading 3 feature interaction units and one reversible unit.

Each feature interaction unit may adopt the flexible interaction approach, generating an interaction frequent item list according to a loaded group of active text frequent description items and using it to perform the interaction operation on the plurality of active text frequent description items. Alternatively, it may adopt the multi-description-interval flexible interaction approach, generating a local interaction frequent item list according to the same description interval of the loaded group of active text frequent description items and using it to perform the interaction operation on the corresponding interval frequent items. The number of branch models and the number of feature interaction units in each branch model can be adjusted based on actual requirements.
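The example architecture can be written down as a small configuration object, reading the four listed branch models in order as having 4, 3, 4 and 3 feature interaction units, each followed by one reversible unit. The class and field names are hypothetical, and the sketch only records the layout; it does not implement the units.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BranchConfig:
    feature_interaction_units: int  # number of cascaded feature interaction units


@dataclass(frozen=True)
class NetworkConfig:
    # Defaults follow the example: 4 branch models with 4, 3, 4 and 3 units.
    branches: Tuple[BranchConfig, ...] = (
        BranchConfig(4),
        BranchConfig(3),
        BranchConfig(4),
        BranchConfig(3),
    )

    def layer_names(self) -> List[str]:
        """List the components in execution order."""
        names = ["text_processing"]
        for b, branch in enumerate(self.branches, start=1):
            names += [
                f"branch{b}.interaction{i}"
                for i in range(1, branch.feature_interaction_units + 1)
            ]
            names.append(f"branch{b}.reversible")  # one reversible unit per branch
        names.append("average_pooling")
        return names


config = NetworkConfig()
layers = config.layer_names()
```

Since the number of branch models and units is adjustable, a different layout can be expressed by passing another tuple, for example `NetworkConfig(branches=(BranchConfig(2), BranchConfig(2)))`.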

Under some design considerations which can be implemented independently, after obtaining the information desensitization decision tag of the digital service user activity text, the method further comprises: determining a privacy anonymization strategy for the digital service user activity text according to the information desensitization decision tag; and performing privacy anonymization processing on the digital service user activity text using the privacy anonymization strategy to obtain a digital service anonymous text.

In the embodiment of the invention, the privacy anonymization strategy can be used for indicating which part of contents in the digital service user activity text is anonymized, and on the basis, the targeted privacy anonymization of the digital service user activity text can be realized, so that the safety of privacy information in the digital service user activity text is ensured.
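How a decision tag might select an anonymization strategy can be sketched as a lookup from tag to the patterns that should be masked. The tag names, the regular expressions and the `[REDACTED]` placeholder are all invented for illustration; the disclosure does not enumerate the strategies or say how the anonymized content is located.

```python
import re
from typing import Dict, List

PLACEHOLDER = "[REDACTED]"

# Hypothetical mapping from decision tag to the patterns that tag anonymizes.
STRATEGIES: Dict[str, List[re.Pattern]] = {
    "mask_contact": [
        re.compile(r"\b\d{11}\b"),                    # 11-digit phone number
        re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),  # e-mail address
    ],
    "mask_identity": [
        re.compile(r"\b\d{17}[\dXx]\b"),              # 18-character ID number
    ],
    "no_action": [],
}


def anonymize(text: str, decision_tag: str) -> str:
    """Apply the strategy selected by the decision tag to the activity text."""
    if decision_tag not in STRATEGIES:
        raise ValueError(f"unknown decision tag: {decision_tag}")
    for pattern in STRATEGIES[decision_tag]:
        text = pattern.sub(PLACEHOLDER, text)
    return text


anonymous_text = anonymize("call 13800000000 now", "mask_contact")
```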

Under some design ideas which can be implemented independently, after privacy anonymizing processing is performed on the digital service user activity text by using the privacy anonymizing policy, the method further comprises: performing text tree conversion on the digital service anonymous text to obtain an anonymous text tree; storing the anonymous text tree.

It can be appreciated that after obtaining the digital service anonymous text, in order to further implement text privacy protection processing and improve information management efficiency, structured storage of the digital service anonymous text may be implemented. Therefore, on one hand, the protection of privacy information can be further realized through the anonymous text tree, on the other hand, related detail information of the digital service anonymous text can be recorded efficiently through the structured anonymous text tree, and the storage efficiency and the calling and accessing efficiency of the anonymous text tree can be improved.

Under some design ideas which can be implemented independently, the text tree conversion is performed on the digital service anonymous text to obtain an anonymous text tree, which comprises the following steps: obtaining a structured conversion reference text; acquiring an entity detection result in the digital service anonymous text and an entity detection result in the structured conversion reference text; determining entity commonality values of the digital service anonymous text and the structured conversion reference text according to entity detection results in the digital service anonymous text and entity detection results in the structured conversion reference text; determining the text logic characteristics of the digital service anonymous text and the text logic characteristics of the structured conversion reference text; determining a first text relationship commonality value of the digital service anonymous text and the structured conversion reference text according to the text logic characteristics of the digital service anonymous text and the text logic characteristics of the structured conversion reference text; determining a text commonality value between the digital service anonymous text and the structured conversion reference text according to the entity commonality value and the first text relationship commonality value of the digital service anonymous text and the structured conversion reference text; determining a past anonymous text from the structured conversion reference text according to a text commonality value between the digital service anonymous text and the structured conversion reference text; and performing text tree conversion on the digital service anonymous text according to the past anonymous text tree corresponding to the past anonymous text to obtain an anonymous text tree.

Therefore, when the text tree conversion is carried out, a similarity analysis between the structured conversion reference text and the digital service anonymous text can be used to accurately determine the past anonymous text. The text tree conversion of the digital service anonymous text can then be carried out quickly and accurately according to the past anonymous text tree corresponding to the past anonymous text, which improves both the conversion efficiency and the conversion accuracy of the anonymous text tree.
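The selection of the past anonymous text described above can be illustrated with a minimal Python sketch. The description does not specify how the commonality values are computed, so the following are assumptions made purely for illustration: the entity commonality value is taken as the Jaccard overlap of the detected entity sets, the first text relationship commonality value as the cosine similarity of the text logic feature vectors, and the text commonality value as a weighted sum of the two. All function names, the dictionary keys and the `entity_weight` parameter are hypothetical.

```python
from math import sqrt


def entity_commonality(entities_a, entities_b):
    """Entity commonality value, assumed here to be the Jaccard overlap."""
    a, b = set(entities_a), set(entities_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def relationship_commonality(logic_a, logic_b):
    """First text relationship commonality value, assumed here to be the
    cosine similarity of two equal-length text logic feature vectors."""
    dot = sum(x * y for x, y in zip(logic_a, logic_b))
    norm = sqrt(sum(x * x for x in logic_a)) * sqrt(sum(y * y for y in logic_b))
    return dot / norm if norm else 0.0


def text_commonality(entity_value, relationship_value, entity_weight=0.5):
    """Text commonality value, assumed here to be a weighted sum."""
    return entity_weight * entity_value + (1.0 - entity_weight) * relationship_value


def select_past_anonymous_text(anonymous, references, entity_weight=0.5):
    """Return the reference text with the highest text commonality value."""
    def score(reference):
        return text_commonality(
            entity_commonality(anonymous["entities"], reference["entities"]),
            relationship_commonality(anonymous["logic"], reference["logic"]),
            entity_weight,
        )
    return max(references, key=score)


# Hypothetical data: entity detection results and text logic features.
anonymous_text = {"entities": ["order", "user_id", "city"], "logic": [1.0, 0.0, 1.0]}
reference_texts = [
    {"name": "ref_a", "entities": ["order", "user_id"], "logic": [1.0, 0.0, 1.0]},
    {"name": "ref_b", "entities": ["invoice"], "logic": [0.0, 1.0, 0.0]},
]
past_text = select_past_anonymous_text(anonymous_text, reference_texts)
print(past_text["name"])
```

In this sketch the past anonymous text tree associated with the selected reference would then serve as the template for converting the digital service anonymous text; that conversion step is not shown because the description gives no further detail on it.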

Further, there is also provided a readable storage medium having stored thereon a program which, when executed by a processor, implements the above-described method.

It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working procedures of the above-described system and apparatus may refer to the corresponding procedures in the foregoing method embodiments, which are not repeated herein. In the several embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is merely a logical function division, and other manners of division are possible in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interface, device or unit, and may be in electrical, mechanical or other form.

Claims

1. An interactive information desensitization method for focusing big data privacy security, characterized in that the method is applied to an artificial intelligence server and comprises the following steps:

invoking the target big data desensitization decision network, and carrying out information desensitization decision analysis on the digital service user activity text according to the plurality of target privacy semantic interaction frequent items to obtain an information desensitization decision tag of the digital service user activity text;

the determining the feature interaction variable according to the plurality of active text frequent description items currently loaded comprises:

taking the interaction frequent item list corresponding to each of the at least one active text frequent description item set as the characteristic interaction variable;

the outputting the plurality of privacy semantic interaction frequent items to be processed according to the feature interaction variable and the plurality of active text frequent description items comprises:

taking a plurality of alternative privacy semantic interaction frequent items output in the last round as the plurality of privacy semantic interaction frequent items to be processed;

the obtaining a plurality of derived frequent description items of the active text according to the plurality of frequent items of the privacy semantic interaction to be processed comprises:

2. The method of claim 1, wherein each active text frequent description item corresponds to a linear array comprising a plurality of characterization dimensions, and the generating the interaction frequent item list from one active text frequent description item set comprises:

3. The method of claim 1, wherein each active text frequent description item corresponds to a linear array comprising a plurality of characterization dimensions, and the generating the interaction frequent item list from one active text frequent description item set comprises: for each active text frequent description item in the one active text frequent description item set, respectively implementing the following processing: disassembling the plurality of characterization dimensions of one active text frequent description item into a plurality of description intervals, and obtaining interval frequent items respectively corresponding to the plurality of description intervals; wherein each description interval corresponds to at least one characterization dimension, and different description intervals contain the same number of characterization dimensions; for the plurality of interval frequent items corresponding to each active text frequent description item, generating a corresponding local interaction frequent item list according to the interval frequent items corresponding to the same description interval, so as to obtain local interaction frequent item lists respectively corresponding to the plurality of description intervals; and taking the obtained local interaction frequent item lists as the interaction frequent item list;

wherein the generating a corresponding local interaction frequent item list according to the interval frequent items corresponding to the same description interval, so as to obtain local interaction frequent item lists respectively corresponding to the plurality of description intervals, comprises: for each of the plurality of description intervals, respectively implementing the following processing: performing a feature reduction operation on each interval frequent item corresponding to one description interval to obtain a plurality of corresponding second reduced linear arrays; and combining the plurality of second reduced linear arrays into a second spliced linear array, and converting the second spliced linear array into the local interaction frequent item list.

4. The method according to claim 1, wherein the combining the feature interaction variable and the plurality of active text frequent description items to obtain a plurality of alternative privacy semantic interaction frequent items comprises: obtaining a plurality of corresponding privacy semantic relation frequent items through the at least one active text frequent description item set and the respective interaction frequent item list; outputting the plurality of alternative privacy semantic interaction frequent items through a semantic comprehensive processing layer according to the plurality of privacy semantic relationship frequent items;

wherein the interaction frequent item list comprises a plurality of local interaction frequent item lists, each active text frequent description item in each active text frequent description item set comprises interval frequent items respectively corresponding to a plurality of description intervals, and each description interval corresponds to one local interaction frequent item list; the obtaining a plurality of privacy semantic relation frequent items according to the at least one active text frequent description item set and the respective interaction frequent item list comprises: respectively implementing the following processing for the at least one active text frequent description item set: for the plurality of interval frequent items corresponding to each active text frequent description item in one active text frequent description item set, multiplying each interval frequent item corresponding to the same description interval by the corresponding local interaction frequent item list, to obtain a plurality of description interval privacy semantic interaction frequent items respectively corresponding to the plurality of description intervals; among the plurality of description interval privacy semantic interaction frequent items respectively corresponding to the plurality of description intervals, aggregating the description interval privacy semantic interaction frequent items corresponding to the same active text frequent description item, to obtain a plurality of active text aggregation frequent item sets corresponding to the one active text frequent description item set; and acquiring the plurality of privacy semantic relation frequent items through the plurality of active text aggregation frequent item sets respectively corresponding to the at least one active text frequent description item set;

wherein the disassembling the plurality of active text frequent description items into at least one active text frequent description item set comprises: disassembling the plurality of active text frequent description items by rows into a plurality of transverse active text frequent description item sets, and by columns into a plurality of longitudinal active text frequent description item sets, so as to obtain the at least one active text frequent description item set; wherein the plurality of active text frequent description items are adjusted to be in a grid list form, and the first constraint value is consistent with the second constraint value; the obtaining the plurality of privacy semantic relation frequent items through the plurality of active text aggregation frequent item sets respectively corresponding to the at least one active text frequent description item set comprises: acquiring a transverse active text aggregation frequent item set through the plurality of active text aggregation frequent item sets respectively corresponding to the plurality of transverse active text frequent description item sets, and acquiring a longitudinal active text aggregation frequent item set through the plurality of active text aggregation frequent item sets respectively corresponding to the plurality of longitudinal active text frequent description item sets; and obtaining the plurality of privacy semantic relation frequent items according to the transverse active text aggregation frequent item set and the longitudinal active text aggregation frequent item set.

5. The method of claim 1, wherein the invoking the target big data desensitization decision network to perform information desensitization decision analysis on the digital service user activity text according to the plurality of target privacy semantic interaction frequent items to obtain an information desensitization decision tag of the digital service user activity text comprises:

6. An artificial intelligence server, comprising a processor and a memory; the processor being communicatively connected to the memory, the processor being adapted to read a computer program from the memory and execute it to carry out the method of any of the preceding claims 1-5.

7. A computer readable storage medium, characterized in that a computer program is stored thereon, and the computer program, when run, implements the method of any of the preceding claims 1-5.
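As a non-limiting illustration of the interval-wise processing recited in claims 3 and 4, the following Python sketch splits each linear array into equal-sized description intervals, derives one local interaction list per interval, multiplies the interval items by that list, and aggregates the results per item. The claims do not fix the concrete operations, so several choices here are assumptions: the feature reduction is taken as the mean of each interval item, the spliced array is turned into a square weight list by a row-wise softmax over pairwise products, and aggregation is plain concatenation. All function names are hypothetical.

```python
from math import exp


def split_into_intervals(item, num_intervals):
    """Split one linear array into equal-sized description intervals."""
    if len(item) % num_intervals != 0:
        raise ValueError("dimensions must divide evenly into intervals")
    size = len(item) // num_intervals
    return [item[k * size:(k + 1) * size] for k in range(num_intervals)]


def local_interaction_list(interval_items):
    """Build an N x N local interaction list for one description interval."""
    # Feature reduction (assumed): each interval item becomes a 1-element array.
    reduced = [[sum(values) / len(values)] for values in interval_items]
    # Splice the reduced arrays into a single linear array of length N.
    spliced = [x for array in reduced for x in array]
    # Convert the spliced array into a weight list (assumed: row-wise softmax
    # over pairwise products, so every row sums to 1).
    rows = []
    for a in spliced:
        logits = [a * b for b in spliced]
        peak = max(logits)
        exps = [exp(value - peak) for value in logits]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


def interact(items, num_intervals):
    """Multiply interval items by their local list and aggregate per item."""
    per_item = [split_into_intervals(item, num_intervals) for item in items]
    outputs = [[] for _ in items]
    for k in range(num_intervals):
        interval_items = [parts[k] for parts in per_item]
        weights = local_interaction_list(interval_items)
        size = len(interval_items[0])
        for i, row in enumerate(weights):
            mixed = [
                sum(w * values[d] for w, values in zip(row, interval_items))
                for d in range(size)
            ]
            # Aggregation (assumed): concatenate the interval results per item.
            outputs[i].extend(mixed)
    return outputs


# Hypothetical input: two items with four characterization dimensions each.
example_items = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
example_output = interact(example_items, num_intervals=2)
print(example_output)
```

Because the local interaction list is recomputed from the items currently loaded, the weights in this sketch change with the input, which mirrors the idea of determining the feature interaction variable from the loaded items rather than fixing it in advance.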