CN117272381A - Desensitization method, device, equipment and medium for sensitive data - Google Patents

Publication number
CN117272381A
Authority
CN
China
Prior art keywords
character
desensitization
data
current
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311257397.8A
Other languages
Chinese (zh)
Inventor
李刚
陈锐
程强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruian Technology Co Ltd
Original Assignee
Beijing Ruian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ruian Technology Co Ltd
Priority to CN202311257397.8A
Publication of CN117272381A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

The embodiment of the invention discloses a method, a device, equipment and a medium for desensitizing sensitive data. The method comprises: acquiring data to be desensitized and a desensitization offset parameter; determining a current desensitization character in the data to be desensitized and an offset parameter character in the desensitization offset parameter; determining a target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character; determining target desensitization data corresponding to the data to be desensitized according to each target desensitization character; and desensitizing the data to be desensitized according to the target desensitization data. With this scheme, sensitive data is desensitized, the security of the data to be desensitized is improved, and leakage of the data to be desensitized is avoided.

Description

Desensitization method, device, equipment and medium for sensitive data
Technical Field
The embodiment of the invention relates to the technical field of data processing, in particular to a method, a device, equipment and a medium for desensitizing sensitive data.
Background
Currently, in the context of the digital economy, large amounts of data are stored and processed on a variety of information systems, including large amounts of valuable sensitive data. Sensitive data mainly comprises personal privacy data and business data, and also includes core data assigned a very high security level after data classification. However, during data collection, transmission, exchange and sharing, data is not well protected, which results in data leakage. Therefore, desensitizing sensitive data is of great importance.
Disclosure of Invention
The invention provides a desensitization method, device, equipment and medium for sensitive data, so as to realize desensitization of the sensitive data.
According to an aspect of the present invention, there is provided a method of desensitizing sensitive data, comprising:
acquiring data to be desensitized and desensitization offset parameters;
determining a current desensitization character in the data to be desensitized, and an offset parameter character in the desensitization offset parameter;
determining a target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character;
determining target desensitization data corresponding to the data to be desensitized according to each target desensitization character;
and desensitizing the data to be desensitized according to the target desensitization data.
According to another aspect of the present invention, there is provided a desensitizing apparatus for sensitive data, characterized by comprising:
the data acquisition module is used for acquiring data to be desensitized and desensitization offset parameters;
the character acquisition module is used for determining the current desensitization character in the data to be desensitized and the offset parameter character in the desensitization offset parameter;
the target desensitization character determining module is used for determining a target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character;
the target desensitization data determining module is used for determining target desensitization data corresponding to the data to be desensitized according to each target desensitization character;
and the data desensitization module is used for realizing desensitization of the data to be desensitized according to the target desensitization data.
According to another aspect of the present invention, there is provided an electronic apparatus including:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform any one of the methods for desensitizing sensitive data provided by embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions that, when executed, cause a processor to implement any one of the methods for desensitizing sensitive data provided by the embodiments of the present invention.
In the scheme for desensitizing sensitive data provided by the embodiments of the invention, a desensitization offset parameter is introduced, each current desensitization character in the data to be desensitized is desensitized according to the corresponding offset parameter character in the desensitization offset parameter to obtain a target desensitization character, and the target desensitization data is obtained from the target desensitization characters. Sensitive data is thereby desensitized, the security of the data to be desensitized is improved, and leakage of the data to be desensitized is avoided.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the invention will become readily understood from the following description.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for desensitizing sensitive data according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a method for desensitizing sensitive data according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a sensitive data desensitizing device according to a fourth embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device for implementing a desensitization method of sensitive data according to a fifth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Data desensitization is a common term in the field of data science, and refers to a technical measure for processing sensitive fields in original data on the premise of not affecting the accuracy of data analysis results, so as to reduce data sensitivity and personal privacy risks.
Modulo arithmetic refers to dividing one integer by another and returning the remainder portion.
Example 1
FIG. 1 is a flowchart of a method for desensitizing sensitive data according to a first embodiment of the present invention. This embodiment is applicable to desensitizing sensitive data in order to ensure its security. The method may be performed by a device for desensitizing sensitive data; the device may be implemented in software and/or hardware and may be configured in an electronic device that carries a sensitive data desensitization function.
Referring to FIG. 1, the method for desensitizing sensitive data comprises:
S110, acquiring data to be desensitized and desensitization offset parameters.
The data to be desensitized refers to sensitive data that needs to be desensitized. The type of the data to be desensitized is not limited in the embodiments of the present invention and may be chosen by a technician as needed. Illustratively, the data to be desensitized may consist of numeric characters, English characters, or both; for example, the data to be desensitized may be "112aBc", "112345", or "achDby".
The desensitization offset parameter refers to parameter data used for desensitizing the data to be desensitized. Illustratively, the desensitization offset parameter may consist of numeric characters, English characters, or both; for example, the desensitization offset parameter may be "test", "26788", or "test127". The number of characters in the desensitization offset parameter is not limited in the embodiments of the present invention and may be set by a technician based on experience or as needed.
S120, determining the current desensitization character in the data to be desensitized and the offset parameter character in the desensitization offset parameter.
The current desensitization character refers to any character in the data to be desensitized; for example, if the data to be desensitized is "1234", the current desensitization character may be any one of 1, 2, 3 and 4. The offset parameter character refers to the character in the desensitization offset parameter that corresponds to the current desensitization character.
It should be noted that, in the embodiment of the present invention, the current desensitization character and the offset parameter character correspond by position: the character at the corresponding position in the desensitization offset parameter is used as the offset parameter character for the current desensitization character, according to the position of the current desensitization character in the data to be desensitized. For example, if the data to be desensitized is "234A" and the desensitization offset parameter is "tesy", then if the current desensitization character is 2, the corresponding offset parameter character is t; if the current desensitization character is 3, the corresponding offset parameter character is e; if the current desensitization character is 4, the corresponding offset parameter character is s; and if the current desensitization character is A, the corresponding offset parameter character is y.
If the number of characters in the desensitization offset parameter is smaller than the number of characters in the data to be desensitized, the desensitization offset parameter is traversed again from its beginning to determine the offset parameter characters for the remaining current desensitization characters. For example, if the data to be desensitized is "12345" and the desensitization offset parameter is "test", the desensitization offset parameter has fewer characters than the data to be desensitized, so when the current desensitization character is 5 the desensitization offset parameter is traversed again, i.e. the first t in the desensitization offset parameter is taken as the offset parameter character of the current desensitization character 5.
Specifically, a current desensitization character is determined in the data to be desensitized, and the offset parameter character corresponding to the current desensitization character is determined according to the position of the current desensitization character in the data to be desensitized and the number of characters in the desensitization offset parameter.
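The positional correspondence described above, including the wrap-around when the desensitization offset parameter is shorter than the data, can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `offset_param_char` is hypothetical, positions are assumed to be zero-based, and rejecting an empty offset parameter is an assumption not stated in the text.

```python
def offset_param_char(position: int, param: str) -> str:
    """Return the offset parameter character paired with the data character
    at the given zero-based position, wrapping around when param is shorter."""
    if not param:
        # Assumption: an empty offset parameter is treated as an error.
        raise ValueError("the desensitization offset parameter must not be empty")
    return param[position % len(param)]


# Pair every character of the data with its offset parameter character.
data, param = "12345", "test"
pairs = [(ch, offset_param_char(i, param)) for i, ch in enumerate(data)]
```

With the data "12345" and the offset parameter "test", the fifth character 5 is paired with the first t again, which matches the example given in the text.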
S130, determining a target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character.
The target desensitization character refers to the character obtained by desensitizing the current desensitization character. Specifically, each current desensitization character corresponds to one target desensitization character.
S140, determining target desensitization data corresponding to the data to be desensitized according to the target desensitization characters.
The target desensitization data refers to data after desensitization of the data to be desensitized. Specifically, there is a correspondence between target desensitization data and data to be desensitized.
Specifically, the target desensitization characters are ordered according to the positions of the corresponding current desensitization characters in the data to be desensitized, so that the target desensitization data are obtained.
S150, desensitizing the data to be desensitized according to the target desensitization data.
In the scheme for desensitizing sensitive data provided by the embodiments of the invention, a desensitization offset parameter is introduced, each current desensitization character in the data to be desensitized is desensitized according to the corresponding offset parameter character in the desensitization offset parameter to obtain a target desensitization character, and the target desensitization data is obtained from the target desensitization characters. Sensitive data is thereby desensitized, the security of the data to be desensitized is improved, and leakage of the data to be desensitized is avoided.
Example 2
FIG. 2 is a flow chart of a method for desensitizing sensitive data according to a second embodiment of the present invention. On the basis of the above embodiments, this embodiment further refines the operation of "determining a target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character" into: determining the character category of the current desensitization character, wherein the character categories include a numeric character category and an English character category; determining the standard offset character of the current desensitization character according to the character category; and determining the target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current desensitization character and the offset parameter character, so as to refine the mechanism for determining the target desensitization character. For parts of this embodiment that are not described in detail, reference may be made to the descriptions of the other embodiments.
Referring to FIG. 2, the method for desensitizing sensitive data comprises:
S210, acquiring data to be desensitized and desensitization offset parameters.
S220, determining the current desensitization character in the data to be desensitized and the offset parameter character in the desensitization offset parameter.
S230, determining the character type of the current desensitization character.
The character category refers to the category of the current desensitization character. Illustratively, the character categories include a numeric character category and an English character category. The numeric character category means that the current desensitization character is a digit; the English character category means that the current desensitization character is an English letter. In the embodiment of the invention, the English character category does not distinguish case: the current desensitization character may be an uppercase or a lowercase English letter, and in either case belongs to the English character category.
In an alternative embodiment, if the character category is the English character category, then after determining the character category of the current desensitization character, the method further comprises: performing case format conversion on the current desensitization character.
Specifically, if at least two English characters exist in the data to be desensitized, the case format of the English characters is unified. For example, each English character in the data to be desensitized may be converted to uppercase.
It can be appreciated that performing case format conversion on the current desensitization character makes it easier to process and improves the efficiency of subsequent processing of the current desensitization character.
S240, determining standard offset characters of the current desensitization characters according to the character types.
The standard offset character refers to the character used as the reference point for determining the offset of the current desensitization character. For example, if the character category is the numeric character category, the standard offset character of the current desensitization character is "0"; if the character category is the English character category, the standard offset character of the current desensitization character is "A".
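Steps S230 and S240 can be illustrated with a short sketch. The names `char_category`, `STANDARD_OFFSET_CHAR` and `standard_offset_char` are hypothetical, as are the category labels "digit" and "english". The text does not say how characters outside the two categories are handled, so raising an error for them is purely an assumption of this sketch.

```python
def char_category(ch: str) -> str:
    """Classify a single character as 'digit' or 'english' (case-insensitive)."""
    if ch.isascii() and ch.isdigit():
        return "digit"
    if ch.isascii() and ch.isalpha():
        return "english"
    # Assumption: anything else is outside the scope of the method.
    raise ValueError(f"unsupported character: {ch!r}")


# Standard offset character per category, as given in the text.
STANDARD_OFFSET_CHAR = {"digit": "0", "english": "A"}


def standard_offset_char(ch: str) -> str:
    """Return the standard offset character for the category of ch."""
    return STANDARD_OFFSET_CHAR[char_category(ch)]
```

Both uppercase and lowercase letters fall into the same category here, which mirrors the statement that the English character category does not distinguish case.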
S250, determining a target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current desensitization character and the offset parameter character.
In an alternative embodiment, determining the target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current desensitization character and the offset parameter character comprises: determining current coding data of the current desensitization character according to the character category and the offset parameter character; determining the character offset of the current desensitization character according to the standard offset character and the current desensitization character; and determining a target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current coding data and the character offset.
The current encoded data refers to the encoded data corresponding to the current desensitization character. The embodiment of the invention places no limitation on the current encoded data, which may be set by a technician based on experience. Optionally, the current encoded data corresponding to the current desensitization character may be determined according to ASCII (American Standard Code for Information Interchange). Alternatively, the current encoded data corresponding to different current desensitization characters may be set by a technician based on experience or as needed.
Wherein the character offset is used to quantify the offset size between the current desensitization character and the standard offset character. By way of example, the character offset may be determined by the following formula:
$O(i) = D(i) - D_{\text{standard}}$
where $O(i)$ represents the character offset of the current desensitization character; $D(i)$ represents the current desensitization character; and $D_{\text{standard}}$ represents the standard offset character, the subtraction being taken over the encoded values of the two characters (for example, their ASCII codes). If the character category of the current desensitization character is the numeric character category, the standard offset character $D_{\text{standard}}$ is "0"; if the character category of the current desensitization character is the English character category, the standard offset character $D_{\text{standard}}$ is "A".
It will be appreciated that by introducing the current encoded data and character offset, the accuracy of the determined target desensitized character is improved.
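As a small illustration of the character offset, the sketch below computes O(i) under two assumptions that the text presents only as options: characters are encoded with ASCII (via Python's `ord`), and English letters are converted to uppercase first. The function name `character_offset` is hypothetical.

```python
def character_offset(ch: str) -> int:
    """Return O(i): the distance of ch from its standard offset character."""
    if ch.isascii() and ch.isdigit():
        return ord(ch) - ord("0")          # digits are measured from "0"
    if ch.isascii() and ch.isalpha():
        return ord(ch.upper()) - ord("A")  # letters are measured from "A"
    # Assumption: other characters are not supported by this sketch.
    raise ValueError(f"unsupported character: {ch!r}")
```

For instance, the digit 2 and the letter c both have an offset of 2, since each sits two positions after its standard offset character.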
In an alternative embodiment, determining current encoded data for a current desensitization character based on the character class and the offset parameter character comprises: determining a character interval threshold according to the character category; determining offset coding data of offset parameter characters; and determining the current coding data of the current desensitization character according to the character interval threshold value and the offset coding data.
The character interval threshold is a value used to process the current encoded data. Illustratively, the character interval threshold limits the magnitude of the current encoded data. For example, if the character category is the numeric character category, the character interval threshold is 10; if the character category is the English character category, the character interval threshold is 26.
The offset coded data refers to coded data corresponding to offset parameter characters. For example, the offset encoded data may be determined from ASCII.
Illustratively, the current encoded data for the current desensitization character may be determined by the following formula:
$M(j) = \mathrm{Mod}(A1(j), \Delta s)$
where $M(j)$ represents the current encoded data of the current desensitization character; $\mathrm{Mod}$ represents the modulo operation; $A1(j)$ represents the offset encoded data; and $\Delta s$ represents the character interval threshold. If the character category is the numeric character category, the character interval threshold $\Delta s$ is 10; if the character category is the English character category, the character interval threshold $\Delta s$ is 26.
It can be understood that introducing the character interval threshold bounds the value of the current encoded data, which avoids the case where a large encoded value causes the current encoded data for a single current desensitization character to span at least two digits, and improves the accuracy of the correspondence between the current desensitization character and the current encoded data. Determining the current encoded data of the current desensitization character from the character interval threshold and the offset encoded data likewise improves the accuracy of the determined current encoded data.
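The computation of the current encoded data can be sketched as below, assuming the ASCII option mentioned in the text for the offset encoded data A1(j). The names `INTERVAL_THRESHOLD` and `current_encoded_data` and the category labels are hypothetical.

```python
# Character interval threshold per category of the current desensitization character.
INTERVAL_THRESHOLD = {"digit": 10, "english": 26}


def current_encoded_data(offset_char: str, category: str) -> int:
    """Return M(j) = Mod(A1(j), delta_s), with A1(j) taken as the ASCII code
    of the offset parameter character (an assumption of this sketch)."""
    return ord(offset_char) % INTERVAL_THRESHOLD[category]
```

Note that the threshold is selected by the category of the current desensitization character, not by the category of the offset parameter character: the same offset parameter character t (ASCII 116) gives 6 when paired with a digit and 12 when paired with a letter.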
In an alternative embodiment, determining the target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current encoded data and the character offset comprises: determining reference coding data according to a character interval threshold value corresponding to the character category, current coding data and a character offset; determining standard coding data of standard offset characters; determining target coding data according to the reference coding data and the standard coding data; and determining corresponding target desensitization characters according to the target coding data.
Wherein the reference encoded data refers to intermediate data for determining target encoded data.
In an alternative embodiment, determining the reference encoded data according to the character interval threshold corresponding to the character category, the current encoded data, and the character offset includes: determining the data sum value of the current encoded data and the character offset; and determining the reference encoded data as the remainder of dividing the data sum value by the character interval threshold.
Specifically, the current encoded data and the character offset are added to obtain the data sum value; a modulo operation is then performed on the data sum value with the character interval threshold to obtain the reference encoded data, i.e. the remainder of dividing the data sum value by the character interval threshold is taken as the reference encoded data.
By way of example, the reference encoded data may be determined by the following formula:
$A_{\text{reference}} = \mathrm{Mod}(O(i) + M(j), \Delta s)$
where $A_{\text{reference}}$ represents the reference encoded data.
It can be appreciated that determining the reference encoded data from the data sum value and the character interval threshold improves the accuracy of the determined reference encoded data.
By way of example, the target encoded data may be determined by the following formula:
$A2(i) = A_{\text{reference}} + A_{\text{standard}}$
where $A2(i)$ represents the target encoded data and $A_{\text{standard}}$ represents the standard encoded data. If the character category of the current desensitization character is the numeric character category, the standard encoded data $A_{\text{standard}}$ is 48 (the ASCII code value of the digit "0"); if the character category of the current desensitization character is the English character category, the standard encoded data $A_{\text{standard}}$ is 65 (the ASCII code value of the character "A").
Further, the character obtained by desensitizing the current desensitization character, namely the target desensitization character, is generated from the target encoded data.
It will be appreciated that by introducing reference encoded data and standard encoded data, the accuracy of the determined target encoded data and thus the accuracy of the determined target desensitized character is improved.
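Putting the formulas of this embodiment together, one character can be desensitized roughly as follows. This is a sketch under stated assumptions, not the patent's reference implementation: ASCII is used for all encoded data, English letters are converted to uppercase, unsupported characters raise an error, and the function name `target_char` is hypothetical.

```python
def target_char(ch: str, offset_char: str) -> str:
    """Desensitize one character with its paired offset parameter character."""
    if ch.isascii() and ch.isdigit():
        standard, threshold = "0", 10
    elif ch.isascii() and ch.isalpha():
        ch = ch.upper()                    # unify the case format first
        standard, threshold = "A", 26
    else:
        # Assumption: other characters are outside the scope of this sketch.
        raise ValueError(f"unsupported character: {ch!r}")

    o = ord(ch) - ord(standard)            # character offset O(i)
    m = ord(offset_char) % threshold       # current encoded data M(j)
    reference = (o + m) % threshold        # reference encoded data
    return chr(reference + ord(standard))  # target encoded data A2(i) as a character
```

Because the reference encoded data is always smaller than the threshold, the result stays within "0" to "9" for digits and within "A" to "Z" for letters, so a digit is always replaced by a digit and a letter by a letter.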
S260, determining target desensitization data corresponding to the data to be desensitized according to the target desensitization characters.
S270, desensitizing the data to be desensitized according to the target desensitization data.
The embodiment of the invention provides a desensitization scheme of sensitive data, which realizes the differential determination of standard offset characters according to character types by introducing character types and standard offset characters, and improves the accuracy of the determined target desensitization characters.
On the basis of the above technical scheme, the embodiment of the invention provides a method for restoring sensitive data, in which the sensitive data is restored by reversing the desensitization method. Illustratively, the restoration comprises: obtaining the target desensitization data corresponding to the data to be restored; determining a current restoring character in the target desensitization data and the target encoded data corresponding to the current restoring character; determining the reference encoded data according to the difference between the target encoded data and the standard encoded data; determining the offset parameter character in the desensitization offset parameter corresponding to the current restoring character; determining the offset encoded data according to the offset parameter character; determining the character interval threshold and the standard offset character according to the character category of the current restoring character; determining the character offset according to the reference encoded data, the offset encoded data and the character interval threshold; determining the original character corresponding to the current restoring character according to the character offset and the standard offset character; and determining the data to be restored according to each original character.
The data to be restored refers to desensitized data that needs to be restored. The current restoring character refers to the character in the target desensitization data that is currently being restored. The original character refers to the corresponding character of the data to be restored before desensitization.
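The restoration steps amount to subtracting the current encoded data instead of adding it. The sketch below illustrates this under the same assumptions as the forward direction (ASCII encoding, uppercase letters, wrap-around of the offset parameter); `restore_char` and `restore` are hypothetical names.

```python
def restore_char(masked: str, offset_char: str) -> str:
    """Recover one original character from a desensitized character."""
    if masked.isascii() and masked.isdigit():
        standard, threshold = "0", 10
    elif masked.isascii() and masked.isalpha():
        masked = masked.upper()
        standard, threshold = "A", 26
    else:
        # Assumption: other characters are outside the scope of this sketch.
        raise ValueError(f"unsupported character: {masked!r}")

    reference = ord(masked) - ord(standard)  # target code minus standard code
    m = ord(offset_char) % threshold         # same M(j) as during desensitization
    o = (reference - m) % threshold          # Python's % is non-negative here
    return chr(o + ord(standard))


def restore(masked_data: str, param: str) -> str:
    """Restore a whole string, reusing the offset parameter cyclically."""
    return "".join(
        restore_char(ch, param[i % len(param)]) for i, ch in enumerate(masked_data)
    )
```

Under these assumptions, `restore("727MNZ", "test")` returns "112ABC". If letters were converted to uppercase during desensitization, the original case is not recoverable by this sketch; the text does not describe how case would be restored.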
Example 3
On the basis of the technical scheme, the embodiment of the invention provides an optional desensitization method for sensitive data. In the portions of the embodiments of the present invention that are not described in detail, reference may be made to the descriptions of other embodiments.
In the embodiment of the invention, data to be desensitized and a desensitization offset parameter Param are set; initializing a Data reading position, namely setting an initial reading position i=0 of Data to be desensitized, setting an initial reading position j=0 of a desensitization offset parameter Param, and sequentially reading the Data according to characters; reading the ith character of the Data to be desensitized as a current desensitization character D (i); the j-th character P (j) in the desensitization offset parameter Param is read as an offset parameter character, and offset coded data A1 (j) of the offset parameter character P (j) is acquired.
Optionally, if the character class of the current desensitization character D (i) is a digital character class, calculating a character offset O (i) of the current desensitization character D (i) from the standard offset character "0" (i.e., how many offset positions the current desensitization character has from the standard offset character 0); determining offset coding data A1 (j) of an offset parameter character P (j), and taking a module of a character interval threshold 10 to obtain current coding data M (j); and generating target code data A2 (i) after desensitization by taking the modulus of 10 by using the sum of O (i) and M (j), and generating the target desensitization character after the ith desensitization according to the value A2 (i).
Alternatively, if the character class of the current desensitization character D(i) is the English character class, the character offset O(i) of D(i) from the standard offset character "a" is calculated (i.e., how many positions D(i) lies from the standard offset character "a"); the offset coded data A1(j) of the offset parameter character P(j) is determined and taken modulo the character interval threshold 26 to obtain the current coded data M(j); the sum of O(i) and M(j) is then taken modulo 26 to generate the desensitized target coded data A2(i), and the i-th target desensitization character is generated from the value A2(i).
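The two cases above share one form and can be summarized as follows. The symbols s (standard offset character), T (character interval threshold) and code(·) (the coded value of a character) are shorthand introduced here and do not appear in the original text; treating the coded value as the character's encoding value is an assumption.

```latex
\begin{aligned}
O(i)  &= \operatorname{code}\bigl(D(i)\bigr) - \operatorname{code}(s) \\
M(j)  &= A1(j) \bmod T \\
A2(i) &= \bigl(O(i) + M(j)\bigr) \bmod T
\end{aligned}
\qquad
(s, T) =
\begin{cases}
(\texttt{0}, 10) & \text{numeric character class} \\
(\texttt{a}, 26) & \text{English character class}
\end{cases}
```

Because O(i) and M(j) both lie in the range 0 to T - 1, A2(i) also lies in that range, so the target desensitization character stays within the same character class as D(i).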
Further, if the data to be desensitized has not been completely processed, the read position of the data to be desensitized is advanced to i = i + 1 and the read position of the desensitization offset parameter is advanced to j = j + 1; if j then exceeds the last character position of the desensitization offset parameter, j is reset to 0; the above steps are executed again from the new read positions. Once all of the data to be desensitized has been processed, the generated target desensitization characters are concatenated in order to form the complete desensitized character string, namely the target desensitization data.
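The loop described above may be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `desensitize` is hypothetical, the offset coded data A1(j) is assumed to be the Unicode code point of P(j), letters are assumed to be lower-cased before the offset from "a" is measured with the original case restored on output, and characters that are neither digits nor English letters are assumed to be copied unchanged.

```python
def desensitize(data: str, param: str) -> str:
    """Shift each digit or English letter of `data` by the matching character of `param`."""
    if not param:
        raise ValueError("param must not be empty")
    out = []
    j = 0  # read position in the desensitization offset parameter
    for ch in data:  # the read position i advances one character at a time
        a1 = ord(param[j])  # assumption: A1(j) is the code point of P(j)
        if "0" <= ch <= "9":
            base, threshold = "0", 10      # numeric character class
        elif ch.isascii() and ch.isalpha():
            base, threshold = "a", 26      # English character class
        else:
            base, threshold = None, 0
        if base is None:
            out.append(ch)  # assumption: other characters are copied unchanged
        else:
            offset = ord(ch.lower()) - ord(base)      # O(i)
            current = a1 % threshold                  # M(j)
            target = (offset + current) % threshold   # A2(i)
            shifted = chr(ord(base) + target)
            # assumption: the case of the input letter is kept on output
            out.append(shifted.upper() if ch.isupper() else shifted)
        j = (j + 1) % len(param)  # wrap to 0 after the last parameter character
    return "".join(out)
```

The source does not state how A1(j) is obtained or how letter case is handled, so the output of this sketch need not coincide with the example result given in this embodiment.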
For example, if the data to be desensitized is Data = "112aBc" and the desensitization offset parameter is Param = "test", the target desensitization data generated after desensitization is "863FKJ".
In this embodiment of the invention, the input data to be desensitized is processed by an offset algorithm: each character is shifted, position by position, according to the set desensitization offset parameter; a modulo operation is applied when the result exceeds the range (of digits or of letters); and the reverse operation is performed during restoration.
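The reverse operation mentioned above can be illustrated by subtracting the same per-character offset modulo the character interval threshold. The sketch below is hedged in the same way as any reading of this embodiment must be: the function name `restore` is hypothetical, A1(j) is assumed to be the Unicode code point of the offset parameter character, letter case is assumed to be preserved, and characters outside the two classes are assumed never to have been shifted.

```python
def restore(masked: str, param: str) -> str:
    """Undo the per-character shift by subtracting the same offsets modulo the threshold."""
    if not param:
        raise ValueError("param must not be empty")
    out = []
    for i, ch in enumerate(masked):
        a1 = ord(param[i % len(param)])  # assumption: A1(j) is the code point of P(j)
        if "0" <= ch <= "9":
            base, threshold = "0", 10
        elif ch.isascii() and ch.isalpha():
            base, threshold = "a", 26
        else:
            out.append(ch)  # assumption: other characters were never shifted
            continue
        shifted = ord(ch.lower()) - ord(base)
        # Python's % yields a non-negative result for a positive modulus,
        # so the subtraction wraps around correctly.
        original = (shifted - a1 % threshold) % threshold
        plain = chr(ord(base) + original)
        out.append(plain.upper() if ch.isupper() else plain)
    return "".join(out)
```

Restoration only succeeds when the same desensitization offset parameter is supplied as was used for desensitization, since every subtracted offset is derived from it.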
The desensitization method for sensitive data provided by this embodiment of the invention desensitizes sensitive data consisting of digits and English characters by offset calculation; the desensitized data retains the characteristics of the original data and can be restored. Because the desensitized content keeps the original data characteristics and business rules, the data remains consistent and valid before and after desensitization, so that development, testing, and data analysis services are not affected by desensitization.
Example IV
Fig. 3 is a schematic diagram of a sensitive data desensitizing apparatus according to a fourth embodiment of the present invention. This embodiment is applicable to desensitizing sensitive data so as to ensure the security of the sensitive data. The method may be performed by the sensitive data desensitizing apparatus, which may be implemented in software and/or hardware and may be configured in an electronic device that carries a sensitive data desensitization function.
As shown in fig. 3, the apparatus includes: a data acquisition module 310, a character acquisition module 320, a target desensitization character determination module 330, a target desensitization data determination module 340, and a data desensitization module 350. Wherein,
a data acquisition module 310, configured to acquire data to be desensitized and desensitized offset parameters;
a character acquisition module 320, configured to determine a current desensitization character in the data to be desensitized, and an offset parameter character in the desensitization offset parameter;
a target desensitization character determining module 330, configured to determine a target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character;
a target desensitization data determining module 340, configured to determine target desensitization data corresponding to the data to be desensitized according to each target desensitization character;
a data desensitization module 350, configured to implement desensitization of the data to be desensitized according to the target desensitization data.
According to the sensitive data desensitization scheme provided by this embodiment of the invention, a desensitization offset parameter is introduced, each current desensitization character in the data to be desensitized is desensitized according to the offset parameter character in the desensitization offset parameter to obtain a target desensitization character, and the target desensitization data is obtained from the target desensitization characters. Desensitization of the sensitive data is thereby realized, the security of the data to be desensitized is improved, and leakage of the data to be desensitized is avoided.
Optionally, the target desensitization character determination module 330 includes:
a character category determining unit configured to determine a character category of the current desensitized character; wherein the character categories include a numeric character category and an English character category;
a standard offset character determining unit, configured to determine a standard offset character of the current desensitization character according to the character class;
and the target desensitization character determining unit is used for determining a target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current desensitization character and the offset parameter character.
Optionally, the target desensitization character determination unit includes:
a current coding data determining subunit, configured to determine current coding data of the current desensitization character according to the character category and the offset parameter character;
a character offset determining subunit, configured to determine, according to the standard offset character and the current desensitization character, a character offset of the current desensitization character;
and the target desensitization character determining subunit is used for determining a target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current coding data and the character offset.
Optionally, the current encoded data determining subunit is specifically configured to:
determining a character interval threshold according to the character category;
determining offset coded data of the offset parameter character;
and determining the current coding data of the current desensitization character according to the character interval threshold value and the offset coding data.
Optionally, the target desensitization character determination unit includes:
the reference coding data determining subunit is used for determining reference coding data according to the character interval threshold value corresponding to the character category, the current coding data and the character offset;
a standard code data determination subunit configured to determine standard code data of the standard offset character;
a target encoded data determination subunit configured to determine target encoded data according to the reference encoded data and the standard encoded data;
and the target desensitization character determining subunit is used for determining corresponding target desensitization characters according to the target coding data.
Optionally, the reference coding data determining subunit is specifically configured to:
determining a data sum value of the current encoded data and the character offset;
and determining the reference coded data according to the ratio between the data sum value and the character interval threshold value.
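The chain of subunits above (data sum value, reference coded data, standard coded data, target coded data, target desensitization character) can be traced step by step in a short sketch. The function name and parameter names are hypothetical. Two readings are assumptions: the step described as a ratio against the character interval threshold is taken to be the remainder, as in the modulo operation of Example III, and the target coded data is taken to be the standard coded data plus the reference coded data.

```python
def target_character(current_coded: int, char_offset: int,
                     standard_char: str, threshold: int) -> str:
    """Derive one target desensitization character from the intermediate values."""
    data_sum = current_coded + char_offset       # data sum value
    reference_coded = data_sum % threshold       # assumption: remainder, as in Example III
    standard_coded = ord(standard_char)          # assumption: code point of the standard offset character
    target_coded = standard_coded + reference_coded  # assumption: simple addition
    return chr(target_coded)
```

For instance, with current coded data 6, character offset 1, standard offset character "0" and threshold 10, the data sum value is 7, the reference coded data is 7, and the resulting character is "7".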
Optionally, if the character class is an english character class, the apparatus further includes:
and the format conversion unit is used for carrying out case format conversion on the current desensitization character.
The sensitive data desensitizing device provided by the embodiment of the invention can execute the sensitive data desensitizing method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the sensitive data desensitizing method.
In the technical solution of the invention, the collection, storage, use, processing, transmission, provision, disclosure and other handling of the data to be desensitized, the desensitization offset parameters, the standard offset characters and the like all comply with the relevant laws and regulations and do not violate public order and good customs.
Example V
Fig. 4 is a schematic diagram of an electronic device implementing a method for desensitizing sensitive data according to a fifth embodiment of the present invention. The electronic device 410 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 4, the electronic device 410 includes at least one processor 411, and a memory, such as a Read Only Memory (ROM) 412, a Random Access Memory (RAM) 413, etc., communicatively connected to the at least one processor 411, wherein the memory stores computer programs executable by the at least one processor, and the processor 411 may perform various suitable actions and processes according to the computer programs stored in the Read Only Memory (ROM) 412 or the computer programs loaded from the storage unit 418 into the Random Access Memory (RAM) 413. In the RAM 413, various programs and data required for the operation of the electronic device 410 may also be stored. The processor 411, the ROM 412, and the RAM 413 are connected to each other through a bus 414. An input/output (I/O) interface 415 is also connected to bus 414.
Various components in the electronic device 410 are connected to the I/O interface 415, including: an input unit 416 such as a keyboard, a mouse, etc.; an output unit 417 such as various types of displays, speakers, and the like; a storage unit 418, such as a magnetic disk, optical disk, or the like; and a communication unit 419 such as a network card, modem, wireless communication transceiver, etc. The communication unit 419 allows the electronic device 410 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The processor 411 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 411 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 411 performs the various methods and processes described above, such as the desensitization method of sensitive data.
In some embodiments, the method of desensitizing sensitive data may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 418. In some embodiments, some or all of the computer program may be loaded and/or installed onto the electronic device 410 via the ROM 412 and/or the communication unit 419. When the computer program is loaded into RAM 413 and executed by processor 411, one or more steps of the method of desensitizing sensitive data described above may be performed. Alternatively, in other embodiments, the processor 411 may be configured to perform the desensitization method of sensitive data in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of desensitizing sensitive data, comprising:
acquiring data to be desensitized and desensitization offset parameters;
determining a current desensitization character in the data to be desensitized and an offset parameter character in the desensitization offset parameter;
determining a target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character;
determining target desensitization data corresponding to the data to be desensitized according to each target desensitization character;
and according to the target desensitization data, desensitizing the data to be desensitized is realized.
2. The method of claim 1, wherein the determining the target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character comprises:
determining a character class of the current desensitization character; wherein the character categories include a numeric character category and an English character category;
determining standard offset characters of the current desensitization character according to the character category;
and determining a target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current desensitization character and the offset parameter character.
3. The method of claim 2, wherein the determining the target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current desensitization character, and the offset parameter character comprises:
determining current coding data of the current desensitization character according to the character category and the offset parameter character;
determining the character offset of the current desensitization character according to the standard offset character and the current desensitization character;
and determining a target desensitization character corresponding to the current desensitization character according to the character category, the standard offset character, the current coding data and the character offset.
4. A method according to claim 3, wherein said determining current encoded data for said current desensitization character based on said character categories and said offset parameter characters comprises:
determining a character interval threshold according to the character category;
determining offset coded data of the offset parameter character;
and determining the current coding data of the current desensitization character according to the character interval threshold value and the offset coding data.
5. The method of claim 4, wherein determining a target desensitization character corresponding to the current desensitization character based on the character category, the standard offset character, the current encoded data, and the character offset comprises:
determining reference coded data according to a character interval threshold value corresponding to the character category, the current coded data and the character offset;
determining standard coding data of the standard offset character;
determining target coded data according to the reference coded data and the standard coded data;
and determining corresponding target desensitization characters according to the target coding data.
6. The method of claim 5, wherein the determining the reference encoded data based on the character interval threshold corresponding to the character class, the current encoded data, and the character offset comprises:
determining a data sum value of the current encoded data and the character offset;
and determining the reference coded data according to the ratio between the data sum value and the character interval threshold value.
7. The method of any of claims 2-6, wherein if the character class is an english character class, the method further comprises:
and performing case format conversion on the current desensitization character.
8. A device for desensitizing sensitive data, comprising:
the data acquisition module is used for acquiring data to be desensitized and desensitization offset parameters;
the character acquisition module is used for determining the current desensitization character in the data to be desensitized and the offset parameter character in the desensitization offset parameter;
the target desensitization character determining module is used for determining a target desensitization character corresponding to the current desensitization character according to the current desensitization character and the offset parameter character;
the target desensitization data determining module is used for determining target desensitization data corresponding to the data to be desensitized according to each target desensitization character;
and the data desensitization module is used for realizing desensitization of the data to be desensitized according to the target desensitization data.
9. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement a method of desensitizing sensitive data according to any of claims 1-7.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements a method of desensitizing sensitive data according to any of claims 1-7.
CN202311257397.8A 2023-09-26 2023-09-26 Desensitization method, device, equipment and medium for sensitive data Pending CN117272381A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311257397.8A CN117272381A (en) 2023-09-26 2023-09-26 Desensitization method, device, equipment and medium for sensitive data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311257397.8A CN117272381A (en) 2023-09-26 2023-09-26 Desensitization method, device, equipment and medium for sensitive data

Publications (1)

Publication Number Publication Date
CN117272381A true CN117272381A (en) 2023-12-22

Family

ID=89208836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311257397.8A Pending CN117272381A (en) 2023-09-26 2023-09-26 Desensitization method, device, equipment and medium for sensitive data

Country Status (1)

Country Link
CN (1) CN117272381A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination