CN113779616A - Method and apparatus for identifying data - Google Patents
Method and apparatus for identifying data
- Publication number
- CN113779616A CN202110180902.8A
- Authority
- CN
- China
- Prior art keywords
- data
- service application
- application
- sensitive data
- service
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2141—Access rights, e.g. capability lists, access control lists, access tables, access matrices
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Bioethics (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Storage Device Security (AREA)
Abstract
The application discloses a method and an apparatus for identifying data. The specific implementation scheme is as follows: in response to receiving a user access request sent by a client, analyzing the user access request to obtain a user identifier, an access address, and each service application provided for the request; generating a link tracking identifier based on the access address and each service application; obtaining application information of each service application from a database based on the link tracking identifier, wherein the application information comprises interface information and execution code, and the database is obtained by updating in advance using the link tracking identifier; generating an authority resource code of each service application based on the user identifier and the interface information of each service application; and identifying the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data. The scheme thereby automatically identifies the sensitive data represented by an authority resource code and the type of that sensitive data.
Description
Technical Field
Embodiments of the present application relate to the field of computer technologies, specifically to the field of data processing technologies, and in particular to a method and an apparatus for identifying data.
Background
A permission platform is a standard role-based access control (RBAC) permission management platform. A service that requires permission control registers with the permission platform and applies for an authority resource code, and the permissions on that resource code are then managed through roles.
At present, a permission platform cannot identify which sensitive data lie behind an authority resource code. A permission approver therefore has no basis for approval and can judge only from the application reason written by the applicant and from personal experience, which makes permission management and examination difficult. Moreover, from HTTP message traffic data alone it is not possible to know which back-end application is associated with the URL (that is, the access address) in the traffic, or which authority resource code corresponds to that back-end application. Assets are therefore difficult to locate, the location of sensitive data is unclear, and focused protection is difficult to achieve.
Disclosure of Invention
A method, apparatus, device, and storage medium for identifying data are provided.
According to a first aspect of the present application, there is provided a method for identifying data, the method comprising: in response to receiving a user access request sent by a client, analyzing the user access request to obtain a user identifier corresponding to the request, an access address, and each service application provided for the request; generating, based on the access address and each service application, a link tracking identifier corresponding to the access address and each service application, wherein the link tracking identifier represents the association between the access address and each service application; obtaining application information of each service application from a database based on the link tracking identifier, wherein the application information comprises interface information and execution code, and the database is obtained by updating in advance using the link tracking identifier; generating, based on the user identifier and the interface information of each service application, an authority resource code of each service application corresponding to each piece of interface information, wherein the authority resource code represents the resource information used for performing authority verification on the request; and identifying the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, wherein the identification compares the execution code of each service application with the classification data for sensitive data in the metadata of the database.
In some embodiments, the update process for the database is as follows: binding the link tracking identification with the obtained application information corresponding to each service application, and generating interface information corresponding to each bound service application and an execution code corresponding to each bound service application; and updating the database based on the interface information corresponding to each bound service application and the execution code corresponding to each bound service application.
In some embodiments, binding the link tracking identifier with the obtained application information corresponding to each service application, and generating interface information corresponding to each bound service application and execution code corresponding to each bound service application, includes: based on a link tracking technology, binding the link tracking identifier with the application information corresponding to each service application, and generating interface information corresponding to each bound service application and execution code corresponding to each bound service application, wherein the link tracking technology embeds tracking points (instrumentation) at the corresponding positions of the application information corresponding to each service application.
In some embodiments, identifying the authority resource code of each service application to obtain each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data includes: extracting the execution codes of all the service applications to obtain a characteristic data set corresponding to the execution codes of all the service applications; and identifying the authority resource codes of each service application to obtain each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data, wherein the identification is used for comparing the characteristic data in each characteristic data set with the classification data of the database metadata aiming at the sensitive data.
In some embodiments, identifying the authority resource code of each service application to obtain each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data includes: and inputting the authority resource codes of each service application into the trained data identification model, and generating each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data, wherein the data identification model is used for representing and judging whether the data represented by the authority resource codes have the sensitive data and the data type of the sensitive data.
In some embodiments, the method further comprises: and sending the link tracking identification to the client.
In some embodiments, the method further comprises: and optimizing the permission examination strategy based on the relevance of each sensitive data represented by each permission resource code and the permission examination.
According to a second aspect of the present application, there is provided an apparatus for identifying data, the apparatus comprising: a first obtaining unit configured to, in response to receiving a user access request sent by a client, obtain a user identifier corresponding to the request, an access address, and each service application provided for the request; a first generation unit configured to generate, based on the access address and each service application, a link tracking identifier corresponding to the access address and each service application, wherein the link tracking identifier represents the association between the access address and each service application; a second obtaining unit configured to obtain application information of each service application from a database based on the link tracking identifier, wherein the application information comprises interface information and execution code, and the database is obtained by updating in advance using the link tracking identifier; a second generation unit configured to generate, based on the user identifier and the interface information of each service application, an authority resource code of each service application corresponding to each piece of interface information, wherein the authority resource code represents the resource information used for performing authority verification on the request; and a data identification unit configured to identify the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, wherein the identification compares the execution code of each service application with the classification data for sensitive data in the metadata of the database.
In some embodiments, the update process of the database is accomplished by: the generation module is configured to bind the link tracking identifier with the acquired application information corresponding to each service application, and generate interface information corresponding to each bound service application and an execution code corresponding to each bound service application; and the updating module is configured to update the database based on the interface information corresponding to the bound service applications and the execution codes corresponding to the bound service applications.
In some embodiments, the generation module is further configured to bind the link tracking identifier with the application information corresponding to each service application based on a link tracking technology, wherein the link tracking technology embeds tracking points (instrumentation) at the corresponding positions of the application information corresponding to each service application.
In some embodiments, a data identification unit, comprising: the extraction module is configured to extract the execution codes of the service applications to obtain feature data sets corresponding to the execution codes of the service applications; and the identification module is configured to identify the authority resource codes of each service application to obtain each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data, wherein the identification is used for comparing the characteristic data in each characteristic data set with the database metadata aiming at the classified data of the sensitive data.
In some embodiments, the data identification unit is further configured to input the authority resource codes of each service application into a trained data identification model, and generate each piece of sensitive data represented by each authority resource code and a data type corresponding to each piece of sensitive data, where the data identification model is used for characterizing whether sensitive data and the data type of the sensitive data exist in the data represented by the authority resource codes.
In some embodiments, the apparatus further comprises: a sending unit configured to send the link trace identifier to the client.
In some embodiments, the apparatus further comprises: and the optimization unit is configured to optimize the permission examination strategy based on the correlation between each sensitive data represented by each permission resource code and the permission examination.
According to a third aspect of the present application, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in any one of the implementations of the first aspect.
According to a fourth aspect of the present application, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions, wherein the computer instructions are for causing a computer to perform the method as described in any one of the implementations of the first aspect.
According to the technology of the present application, in response to receiving a user access request sent by a client, the user access request is analyzed to obtain a user identifier corresponding to the request, an access address, and each service application provided for the request. A link tracking identifier corresponding to the access address and each service application is generated based on the access address and each service application, and application information of each service application is obtained from a database based on the link tracking identifier, wherein the application information comprises interface information and execution code, and the database is obtained by updating in advance using the link tracking identifier. An authority resource code of each service application corresponding to each piece of interface information is generated based on the user identifier and the interface information of each service application, and the authority resource code of each service application is identified to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, wherein the identification compares the execution code of each service application with the classification data for sensitive data in the metadata of the database. This addresses the problems in the prior art that it is unknown which sensitive data an authority resource code represents, that a permission approver has no basis for approval, and that permission management and examination are difficult. It also avoids the problems that HTTP message traffic data alone cannot reveal which back-end application is associated with the access address in the traffic or which authority resource code corresponds to that back-end application, that assets are difficult to locate, and that focused protection is difficult to achieve because the location of sensitive data is unclear.
By generating the link tracking identifier and updating the corresponding application information in the database, in combination with the traffic data perspective, a data lineage (blood relationship) across the whole data chain is established: from the user to the front-end application, from the front-end application to the back-end application, between back-end services, and from the back-end application to the database. Based on this data lineage, the sensitive data represented by an authority resource code, and the type of that sensitive data, are identified automatically.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application.
FIG. 1 is a schematic diagram of a first embodiment of a method for identifying data according to the present application;
FIG. 2 is a diagram of a scenario in which a method for identifying data may implement an embodiment of the present application;
FIG. 3 is a schematic diagram of a second embodiment of a method for identifying data according to the present application;
FIG. 4 is a schematic block diagram illustrating one embodiment of an apparatus for identifying data according to the present application;
FIG. 5 is a block diagram of an electronic device for implementing a method for identifying data according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows a schematic diagram 100 of a first embodiment of a method for identifying data according to the present application. The method for identifying data comprises the following steps:
Step 101, in response to receiving a user access request sent by a client, analyzing the user access request to obtain a user identifier corresponding to the request, an access address, and each service application provided for the request.
In this embodiment, when an execution subject (for example, a rights management platform) receives a user access request sent by a client, the execution subject may analyze the user access request to obtain the user identifier corresponding to the request, the access address, and each service application provided for the request. The access address may be access information such as a URL. A service application denotes the service software provided for the request, that is, the application of each service that handles the request.
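The parsing described for this step can be sketched as follows. This is a minimal illustration only: the request layout, the `X-User-Id` header, and the `ROUTES` table that maps URL path prefixes to service applications are assumptions introduced for the example and are not specified by the application.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ParsedRequest:
    user_id: str
    access_address: str
    service_apps: tuple


# Hypothetical routing table: URL path prefix -> back-end service applications.
ROUTES = {
    "/orders": ("order-service", "inventory-service"),
    "/users": ("user-service",),
}


def parse_access_request(request: dict) -> ParsedRequest:
    """Extract the user identifier, access address and serving applications."""
    url = request["url"]
    path = urlsplit(url).path
    apps = ()
    for prefix, names in ROUTES.items():
        if path.startswith(prefix):
            apps = names
            break
    # Assumption: the user identifier travels in a request header.
    return ParsedRequest(request["headers"]["X-User-Id"], url, apps)


parsed = parse_access_request(
    {"url": "https://example.com/orders/42", "headers": {"X-User-Id": "u-1001"}}
)
```

In a real deployment the mapping from access address to service applications would more plausibly come from a gateway or service registry than from a static table.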
Step 102, generating, based on the access address and each service application, a link tracking identifier corresponding to the access address and each service application.
In this embodiment, the execution subject may randomly generate a unique link tracking identifier corresponding to the access address and each service application, according to the access address and each service application acquired in step 101. The link tracking identifier represents the association between the access address and each service application; that is, through the link tracking identifier the client can see which back-end service applications it has called.
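One possible way to generate such an identifier is sketched below. The use of a random UUID and the in-memory `registry` dictionary that records the association are assumptions made for illustration; the application only requires that the identifier be randomly generated, unique, and tied to the access address and the service applications.

```python
import uuid


def generate_trace_id(access_address, service_apps, registry):
    """Create a random link tracking identifier and record what it links."""
    trace_id = uuid.uuid4().hex  # random and, for practical purposes, unique
    # The registry entry is the association between address and applications.
    registry[trace_id] = {
        "access_address": access_address,
        "service_apps": list(service_apps),
    }
    return trace_id


registry = {}
tid = generate_trace_id(
    "https://example.com/orders/42",
    ["order-service", "inventory-service"],
    registry,
)
other_tid = generate_trace_id("https://example.com/users/7", ["user-service"], registry)
```

Looking up `registry[tid]` then answers the question of which back-end service applications a given access address called.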
Step 103, acquiring application information of each service application from the database based on the link tracking identifier.
In this embodiment, the execution subject may obtain the application information of each service application from the updated database based on the link tracking identifier generated in step 102. The application information includes at least interface information and execution code, and the database is obtained by updating in advance using the link tracking identifier. The execution code may be SQL code.
Further, the database is updated as follows: binding the link tracking identifier with the obtained application information corresponding to each service application, and generating interface information corresponding to each bound service application and execution code corresponding to each bound service application; and updating the database based on the interface information corresponding to each bound service application and the execution code corresponding to each bound service application. Querying the corresponding application information through the link tracking identifier establishes the data lineage (blood relationship) from the back-end application to the database, which completes the data lineage of the whole data chain.
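The update and lookup described above can be sketched with an in-memory SQLite database. The table name `app_info`, its columns, and the sample rows are assumptions chosen for the example; the application does not specify a schema.

```python
import sqlite3


def update_database(conn, trace_id, app_infos):
    """Bind the link tracking identifier to each application's information."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS app_info "
        "(trace_id TEXT, app_name TEXT, interface TEXT, execution_code TEXT)"
    )
    conn.executemany(
        "INSERT INTO app_info VALUES (?, ?, ?, ?)",
        [(trace_id, a["app"], a["interface"], a["sql"]) for a in app_infos],
    )
    conn.commit()


def get_app_info(conn, trace_id):
    """Fetch interface information and execution code by tracking identifier."""
    rows = conn.execute(
        "SELECT app_name, interface, execution_code FROM app_info "
        "WHERE trace_id = ? ORDER BY app_name",
        (trace_id,),
    ).fetchall()
    return [{"app": r[0], "interface": r[1], "sql": r[2]} for r in rows]


conn = sqlite3.connect(":memory:")
update_database(
    conn,
    "trace-abc",
    [
        {"app": "order-service", "interface": "GET /orders/{id}",
         "sql": "SELECT id, address FROM orders WHERE id = ?"},
        {"app": "inventory-service", "interface": "GET /stock/{sku}",
         "sql": "SELECT sku, quantity FROM stock WHERE sku = ?"},
    ],
)
info = get_app_info(conn, "trace-abc")
```

Because every row carries the tracking identifier, a single query recovers the interface and the SQL executed by each application on the chain.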
Step 104, generating, based on the user identifier and the interface information of each service application, an authority resource code of each service application corresponding to each piece of interface information.
In this embodiment, the execution subject may generate, according to a resource code generation rule and based on the user identifier acquired in step 101 and the interface information of each service application, an authority resource code of each service application corresponding to each piece of interface information. The authority resource code represents the resource information used for performing authority verification on the request. The resource code associates the user with the back-end service application through role information.
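The application does not disclose the resource code generation rule, so the sketch below uses a purely hypothetical one: the code is composed of the user's role, the application name, and a normalized form of the interface. The `USER_ROLES` mapping and the `role:app:INTERFACE` layout are assumptions for illustration only.

```python
import re

# Hypothetical user -> role mapping; a real platform would query its RBAC store.
USER_ROLES = {"u-1001": "order_admin"}


def generate_resource_codes(user_id, app_infos):
    """Derive one authority resource code per service application."""
    role = USER_ROLES.get(user_id, "guest")
    codes = {}
    for info in app_infos:
        # Normalize e.g. "GET /orders/{id}" to "GET_ORDERS_ID".
        normalized = re.sub(r"[^A-Za-z0-9]+", "_", info["interface"])
        normalized = normalized.strip("_").upper()
        codes[info["app"]] = f"{role}:{info['app']}:{normalized}"
    return codes


codes = generate_resource_codes(
    "u-1001",
    [{"app": "order-service", "interface": "GET /orders/{id}"}],
)
```

Whatever rule is actually used, the point illustrated here is that the code is a deterministic function of the user's role and the interface, so it links the user to the back-end service application.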
Step 105, identifying the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data.
In this embodiment, the executing agent may compare the execution code of each service application with the classification data for the sensitive data in the metadata of the database to obtain each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data. The sensitive data may include: the data type of the sensitive data can be obtained by classifying the sensitive data in various ways.
In some optional implementations of this embodiment, identifying the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data includes: extracting features from the execution code of each service application to obtain a feature data set corresponding to the execution code of each service application; and identifying the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, wherein the identification compares the feature data in each feature data set with the classification data for sensitive data in the database metadata. The feature data may include various SQL-related information, such as fields, tables, and database information. This enables accurate and comprehensive data identification.
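The feature extraction and comparison can be sketched as follows. The regular expression handles only a simple single-table SELECT statement, and the `CLASSIFICATION` table standing in for the database metadata is hypothetical; a production system would need a real SQL parser and the actual classification metadata.

```python
import re

# Hypothetical classification metadata: (table, column) -> sensitive data type.
CLASSIFICATION = {
    ("users", "phone"): "contact information",
    ("users", "id_card"): "identity document",
    ("orders", "address"): "contact information",
}


def extract_features(sql):
    """Return (table, columns) for a simple single-table SELECT statement."""
    match = re.match(
        r"\s*SELECT\s+(.+?)\s+FROM\s+(\w+)", sql, re.IGNORECASE | re.DOTALL
    )
    if not match:
        return None, []
    columns = [c.strip().lower() for c in match.group(1).split(",")]
    return match.group(2).lower(), columns


def identify_sensitive(resource_code, sql):
    """Map a resource code to the sensitive columns its SQL touches."""
    table, columns = extract_features(sql)
    found = {
        c: CLASSIFICATION[(table, c)]
        for c in columns
        if (table, c) in CLASSIFICATION
    }
    return {resource_code: found}


result = identify_sensitive(
    "order_admin:user-service:GET_USERS_ID",
    "SELECT name, phone, id_card FROM users WHERE id = ?",
)
```

Here the fields and table extracted from the SQL are the feature data, and the lookup against `CLASSIFICATION` plays the role of the comparison with the classification data in the database metadata.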
With continued reference to fig. 2, the method 200 for identifying data of the present embodiment operates in an electronic device 201. When the electronic device 201 receives a user access request sent by a client, it analyzes the user access request to obtain a user identifier corresponding to the request, an access address, and each service application 202 provided for the request. The electronic device 201 then generates, based on the access address and each service application, a link tracking identifier 203 corresponding to the access address and each service application, and obtains application information 204 of each service application from a database based on the link tracking identifier. Next, the electronic device 201 generates an authority resource code 205 of each service application corresponding to each piece of interface information, based on the user identifier and the interface information of each service application. Finally, the electronic device 201 identifies the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type 206 corresponding to each piece of sensitive data. The link tracking identifier represents the association between the access address and each service application; the application information comprises interface information and execution code; the database is obtained by updating in advance using the link tracking identifier; and the identification compares the execution code of each service application with the classification data for sensitive data in the metadata of the database.
In the method for identifying data provided by the above embodiment of the present application, in response to receiving a user access request sent by a client, the user access request is analyzed to obtain a user identifier corresponding to the request, an access address, and each service application provided for the request. A link tracking identifier corresponding to the access address and each service application is generated based on the access address and each service application, and application information of each service application is obtained from a database based on the link tracking identifier, wherein the application information comprises interface information and execution code, and the database is obtained by updating in advance using the link tracking identifier. An authority resource code of each service application corresponding to each piece of interface information is generated based on the user identifier and the interface information of each service application, and the authority resource code of each service application is identified to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, wherein the identification compares the execution code of each service application with the classification data for sensitive data in the metadata of the database. This addresses the problems in the prior art that it is unknown which sensitive data an authority resource code represents, that a permission approver has no basis for approval, and that permission management and examination are difficult. It also avoids the problems that HTTP message traffic data alone cannot reveal which back-end application is associated with the access address in the traffic or which authority resource code corresponds to that back-end application, that assets are difficult to locate, and that focused protection is difficult to achieve because the location of sensitive data is unclear.
By generating the link tracking identifier and updating the corresponding application information in the database, in combination with the traffic data perspective, a data lineage (blood relationship) across the whole data chain is established: from the user to the front-end application, from the front-end application to the back-end application, between back-end services, and from the back-end application to the database. Based on this data lineage, the sensitive data represented by an authority resource code, and the type of that sensitive data, are identified automatically.
With further reference to fig. 3, a schematic diagram 300 of a second embodiment of a method for identifying data is shown. The process of the method comprises the following steps:
Step 302, generating, based on the access address and each service application, a link tracking identifier corresponding to the access address and each service application, and sending the link tracking identifier to the client.
In this embodiment, the execution subject may randomly generate a unique link tracking identifier corresponding to the access address and each service application, according to the access address and each service application acquired in step 301, and send the link tracking identifier to the client. The link tracking identifier represents the association between the access address and each service application; that is, through the link tracking identifier the client can see which back-end service applications it has called.
Step 303, acquiring application information of each service application from the database based on the link tracking identifier.
In this embodiment, the execution subject may obtain the application information of each service application from the updated database based on the link tracking identifier generated in step 302. The application information at least includes interface information and execution code, and the database is obtained by updating in advance using the link tracking identifier. The database is updated as follows: the link tracking identifier is bound with the acquired application information corresponding to each service application, generating the bound interface information and the bound execution code corresponding to each service application; the database is then updated based on the bound interface information and the bound execution code corresponding to each service application. Querying the corresponding application information through the link tracking identifier establishes the data lineage (data blood relationship) from the back-end applications to the database, which completes the data lineage of the full data chain.
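The update-then-query flow above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite table stands in for the database, and the table name, column names, and function names are hypothetical.

```python
import sqlite3


def open_store():
    # Assumption: a single table keyed by (trace_id, service_app) holds the
    # bound interface information and execution code.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE app_info ("
        " trace_id TEXT, service_app TEXT,"
        " interface_info TEXT, execution_code TEXT,"
        " PRIMARY KEY (trace_id, service_app))"
    )
    return conn


def bind_and_update(conn, trace_id, app_infos):
    """Bind the link tracking identifier to each application's info and store it."""
    rows = [
        (trace_id, app, info["interface"], info["code"])
        for app, info in app_infos.items()
    ]
    conn.executemany("INSERT OR REPLACE INTO app_info VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def lookup(conn, trace_id):
    """Return the application information recorded for one link tracking identifier."""
    cur = conn.execute(
        "SELECT service_app, interface_info, execution_code"
        " FROM app_info WHERE trace_id = ? ORDER BY service_app",
        (trace_id,),
    )
    return {app: {"interface": iface, "code": code} for app, iface, code in cur}


conn = open_store()
bind_and_update(conn, "trace-1", {
    "user-service": {"interface": "GET /user/{id}", "code": "SELECT phone FROM users"},
})
print(lookup(conn, "trace-1"))
```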
In some optional implementations of this embodiment, binding the link tracking identifier with the acquired application information corresponding to each service application, and generating the bound interface information and the bound execution code corresponding to each service application, includes: binding, based on a link tracking technology, the link tracking identifier with the application information corresponding to each service application, and generating the bound interface information and the bound execution code corresponding to each service application, where the link tracking technology uses an instrumentation (tracking-point) technique to insert tracking points at the corresponding positions of the application information corresponding to each service application. The link tracking technology may be a link tracking technology invoked through an SDK-integrated Java Agent. This makes information binding simple and fast.
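The tracking-point idea can be illustrated in Python, although the application itself mentions a Java Agent. The sketch below is an assumption-laden stand-in: a decorator plays the role of the tracking point, a context variable carries the current link tracking identifier, and all names (`traced`, `recorded_spans`, `get_user`) are hypothetical.

```python
import contextvars
import functools

# Carries the link tracking identifier of the request currently being handled.
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)

# Hypothetical in-memory sink; a real system would write to the database.
recorded_spans = []


def traced(service_app, interface_info):
    """Tracking point: record which interface ran under which trace identifier."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            recorded_spans.append({
                "trace_id": current_trace_id.get(),
                "service_app": service_app,
                "interface": interface_info,
            })
            return func(*args, **kwargs)
        return wrapper
    return decorator


@traced("user-service", "GET /user/{id}")
def get_user(user_id):
    return {"id": user_id}


token = current_trace_id.set("trace-123")
try:
    get_user(7)
finally:
    current_trace_id.reset(token)
print(recorded_spans)
```

A decorator keeps the business function unchanged, which is the same motivation behind agent-based instrumentation: the binding happens at the call boundary rather than inside the application logic.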
Step 304, generating an authority resource code of each service application corresponding to each piece of interface information based on the user identifier and the interface information of each service application.
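The application does not specify the format of an authority resource code. As a purely illustrative sketch, the code below assumes a code is derived deterministically from the user identifier, the service application name, and a short hash of the interface information; the format and both function names are hypothetical.

```python
import hashlib


def permission_resource_code(user_id, service_app, interface_info):
    # Assumption: a short SHA-256 digest of the interface identifies the resource;
    # the application does not define the actual encoding.
    digest = hashlib.sha256(
        f"{service_app}|{interface_info}".encode("utf-8")
    ).hexdigest()[:12]
    return f"{user_id}:{service_app}:{digest}"


def codes_for_request(user_id, interfaces):
    """Build one authority resource code per service application interface."""
    return {
        app: permission_resource_code(user_id, app, iface)
        for app, iface in interfaces.items()
    }


print(codes_for_request("u1001", {
    "user-service": "GET /user/{id}",
    "order-service": "GET /order/{id}",
}))
```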
In this embodiment, the execution subject may input the authority resource code of each service application into the trained data identification model to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data. The data identification model is used to determine whether sensitive data exists in the data represented by an authority resource code and to determine the data type of that sensitive data. The data identification model is pre-trained on historical data and may be constructed based on a convolutional neural network.
In some optional implementations of this embodiment, the method further includes: optimizing the permission review policy based on the correlation between each piece of sensitive data represented by each authority resource code and the permission review. Using the data represented by a permission as the basis for permission review enables better permission review; traffic anomaly detection can further be performed on URLs associated with sensitive data, which reduces the false alarm rate.
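One way to picture such a policy is a mapping from identified sensitive data types to review strictness. The sketch below is an assumption, not the method of the application: the review levels, the threshold, and all names are hypothetical.

```python
# Hypothetical review tiers per sensitive data type; not defined by the application.
REVIEW_LEVEL = {"identity": 3, "financial": 3, "contact": 2}


def review_policy(code_to_types):
    """Map each authority resource code to the strictest review level it needs.

    Level 0 means no sensitive data was identified; unknown types default to 1.
    """
    policy = {}
    for code, data_types in code_to_types.items():
        levels = [REVIEW_LEVEL.get(t, 1) for t in data_types]
        policy[code] = max(levels, default=0)
    return policy


def urls_to_monitor(url_to_code, policy, threshold=2):
    """Select URLs whose authority resource code warrants traffic anomaly detection."""
    return sorted(
        url for url, code in url_to_code.items()
        if policy.get(code, 0) >= threshold
    )


policy = review_policy({"code-a": ["identity", "contact"], "code-b": []})
print(policy, urls_to_monitor({"/user": "code-a", "/health": "code-b"}, policy))
```

Restricting anomaly detection to URLs above a threshold reflects the stated goal of lowering the false alarm rate by focusing on traffic that actually touches sensitive data.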
It should be noted that the structure of the convolutional neural network and the model training process are well-known techniques that are widely researched and applied at present, and are not described again here.
In this embodiment, the specific operations of steps 301 to 304 are substantially the same as the operations of steps 101 to 104 in the embodiment shown in fig. 1, and are not described again here.
As can be seen from fig. 3, compared with the embodiment corresponding to fig. 1, the schematic diagram 300 of the method for identifying data in this embodiment generates, based on the access address and each service application, a link tracking identifier corresponding to the access identifier and each service application, and sends the link tracking identifier to the client. Combined with a traffic-data view, this establishes the data lineage of the full data chain: from user to front-end application, from front-end application to back-end application, between back-end services, and from back-end application to database. The authority resource codes of each service application are input into the trained data identification model to generate each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, which makes data identification accurate, comprehensive, and more widely applicable.
With further reference to fig. 4, as an implementation of the method shown in fig. 1 to 3, the present application provides an embodiment of an apparatus for identifying data, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 1, and the apparatus may be applied to various electronic devices.
As shown in fig. 4, the apparatus 400 for identifying data of the present embodiment includes: a first obtaining unit 401, a first generating unit 402, a second obtaining unit 403, a second generating unit 404, and a data identifying unit 405. The first obtaining unit is configured to, in response to receiving a user access request sent by a client, obtain the user identifier and access address corresponding to the request and each service application serving the request. The first generating unit is configured to generate a link tracking identifier corresponding to the access identifier and each service application based on the access address and each service application, where the link tracking identifier characterizes the association relationship between the access address and each service application. The second obtaining unit is configured to obtain application information of each service application from the database based on the link tracking identifier, where the application information includes interface information and execution code, and the database is obtained by updating in advance using the link tracking identifier. The second generating unit is configured to generate an authority resource code of each service application corresponding to each piece of interface information based on the user identifier and the interface information of each service application, where the authority resource code represents the resource information used for performing authority verification on the request. The data identifying unit is configured to identify the authority resource codes of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, where the identification compares the execution code of each service application with the classification data for sensitive data in the database metadata.
In this embodiment, specific processes of the first obtaining unit 401, the first generating unit 402, the second obtaining unit 403, the second generating unit 404, and the data identifying unit 405 of the apparatus 400 for identifying data and technical effects thereof may respectively refer to the related descriptions of step 101 to step 105 in the embodiment corresponding to fig. 1, and are not repeated herein.
In some optional implementations of this embodiment, the update process of the database is completed through the following modules: the generation module is configured to bind the link tracking identifier with the acquired application information corresponding to each service application, and generate interface information corresponding to each bound service application and an execution code corresponding to each bound service application; and the updating module is configured to update the database based on the interface information corresponding to the bound service applications and the execution codes corresponding to the bound service applications.
In some optional implementations of this embodiment, the generation module is further configured to bind the link tracking identifier with the application information corresponding to each service application based on a link tracking technology, where the link tracking technology uses an instrumentation (tracking-point) technique to insert tracking points at the corresponding positions of the application information corresponding to each service application.
In some optional implementations of this embodiment, the data identification unit includes: an extraction module configured to extract features from the execution code of each service application to obtain a feature data set corresponding to the execution code of each service application; and an identification module configured to identify the authority resource codes of each service application to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, where the identification compares the feature data in each feature data set with the classification data for sensitive data in the database metadata.
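The extraction and comparison performed by these two modules can be sketched under simple assumptions: the feature data are the identifiers appearing in the execution code, and the classification data in the database metadata is a mapping from column names to sensitive data types. The column names, types, and function names below are hypothetical.

```python
import re

# Hypothetical classification data held in database metadata: column -> data type.
SENSITIVE_COLUMNS = {
    "phone": "contact",
    "id_card_no": "identity",
    "bank_account": "financial",
}

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def extract_features(execution_code):
    """Extraction module: collect the lower-cased identifiers used in the code."""
    return {token.lower() for token in IDENTIFIER.findall(execution_code)}


def identify_sensitive(execution_code, classification=SENSITIVE_COLUMNS):
    """Identification module: keep the classified columns that the code touches."""
    features = extract_features(execution_code)
    return {col: dtype for col, dtype in classification.items() if col in features}


code = 'db.query("SELECT name, phone, id_card_no FROM users WHERE id = ?")'
print(identify_sensitive(code))
```

Token matching of this kind is deliberately crude; it illustrates the comparison step only and would miss sensitive fields that are accessed indirectly or renamed in the code.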
In some optional implementation manners of this embodiment, the data identification unit is further configured to input the authority resource code of each service application to the trained data identification model, and generate each piece of sensitive data represented by each authority resource code and a data type corresponding to each piece of sensitive data, where the data identification model is used to characterize whether there is sensitive data and a data type of the sensitive data in the data represented by the authority resource code.
In some optional implementations of this embodiment, the apparatus further includes: a sending unit configured to send the link trace identifier to the client.
In some optional implementations of this embodiment, the apparatus further includes: and the optimization unit is configured to optimize the permission examination strategy based on the correlation between each sensitive data represented by each permission resource code and the permission examination.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device for the method of identifying data according to an embodiment of the application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 501, memory 502, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories and types of memory, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 501 is taken as an example.
The memory 502, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the method for identifying data in the embodiment of the present application (for example, the first acquisition unit 401, the first generation unit 402, the second acquisition unit 403, the second generation unit 404, and the data identification unit 405 shown in fig. 4). The processor 501 executes various functional applications of the server and data processing, i.e., implements the method for identifying data in the above-described method embodiments, by executing non-transitory software programs, instructions, and modules stored in the memory 502.
The memory 502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device for identifying data, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 502 optionally includes memory located remotely from processor 501, which may be connected to an electronic device for identifying data over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the method for identifying data may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The input device 503, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, or a joystick, may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for identifying data. The output device 504 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, in response to receiving a user access request sent by a client, the user access request is parsed to obtain the user identifier and access address corresponding to the request and each service application serving the request. A link tracking identifier corresponding to the access identifier and each service application is generated based on the access address and each service application, and application information of each service application, including interface information and execution code, is acquired from a database based on the link tracking identifier, the database having been updated in advance using the link tracking identifier. An authority resource code of each service application corresponding to each piece of interface information is generated based on the user identifier and the interface information of each service application. The authority resource codes of each service application are then identified to obtain each piece of sensitive data represented by each authority resource code and the data type corresponding to each piece of sensitive data, where the identification compares the execution code of each service application with the classification data for sensitive data in the database metadata. This addresses the problems in the prior art that it is unclear which sensitive data an authority resource code represents, that an authority approver therefore has no basis for approval, and that authority management and review are difficult. It also avoids the problems that HTTP message traffic data alone cannot reveal which back-end applications are associated with an access address in the traffic or which authority resource codes correspond to those back-end applications, that asset positioning is difficult, and that focused protection is hard to achieve because the location of sensitive data is unclear.
By generating the link tracking identifier and updating the corresponding application information in the database, and by combining this with a traffic-data view, the data lineage (data blood relationship) of the complete data chain is established: from user to front-end application, from front-end application to back-end application, between back-end services, and from back-end application to database. On the basis of this data lineage, the method automatically identifies the sensitive data represented by an authority resource code and the type of that sensitive data.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved, and the present invention is not limited herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Claims (16)
1. A method for identifying data, the method comprising:
responding to a received user access request sent by a client, analyzing the user access request, and obtaining the user identification, the access address and each service application provided for the request corresponding to the request;
generating a link tracking identifier corresponding to the access identifier and each service application based on the access address and each service application, wherein the link tracking identifier is used for representing an association relationship between the access address and each service application;
acquiring application information of each service application from a database based on the link tracking identifier, wherein the application information comprises interface information and execution code, and the database is obtained by updating in advance by using the link tracking identifier;
generating an authority resource code of each service application corresponding to each interface information based on the user identifier and the interface information of each service application, wherein the authority resource code is used for representing resource information used for performing authority verification on the request;
and identifying the authority resource codes of each service application to obtain each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data, wherein the identification is used for comparing the execution codes of each service application with the classification data aiming at the sensitive data in the database metadata.
2. The method of claim 1, wherein the database update procedure is as follows:
binding the link tracking identifier with the obtained application information corresponding to each service application, and generating interface information corresponding to each bound service application and an execution code corresponding to each bound service application;
and updating the database based on the bound interface information corresponding to each service application and the bound execution code corresponding to each service application.
3. The method of claim 2, wherein the binding the link trace identifier with the obtained application information corresponding to each service application, and generating interface information corresponding to each bound service application and an execution code corresponding to each bound service application includes:
and binding the link tracking identifier with the application information corresponding to each service application based on a link tracking technology, and generating interface information corresponding to each bound service application and an execution code corresponding to each bound service application, wherein the link tracking technology uses an instrumentation (tracking-point) technique to insert tracking points at a corresponding position of the application information corresponding to each service application.
4. The method of claim 1, wherein the identifying the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and a data type corresponding to each piece of sensitive data comprises:
extracting the execution codes of the service applications to obtain a feature data set corresponding to the execution codes of the service applications;
and identifying the authority resource codes of each service application to obtain each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data, wherein the identification is used for comparing the characteristic data in each characteristic data set with the classification data of the database metadata aiming at the sensitive data.
5. The method of claim 1, wherein the identifying the authority resource code of each service application to obtain each piece of sensitive data represented by each authority resource code and a data type corresponding to each piece of sensitive data comprises:
and inputting the authority resource codes of each service application into a trained data identification model, and generating each sensitive data represented by each authority resource code and a data type corresponding to each sensitive data, wherein the data identification model is used for determining whether sensitive data exist in the data represented by the authority resource codes and for determining the data type of the sensitive data.
6. The method of claim 1, further comprising:
and sending the link tracking identifier to the client.
7. The method of claim 1, further comprising:
and optimizing the permission examination strategy based on the correlation between each sensitive data represented by each permission resource code and the permission examination.
8. An apparatus for identifying data, the apparatus comprising:
the first obtaining unit is configured to respond to a received user access request sent by a client, obtain the user identification, an access address and each service application provided for the request corresponding to the request;
a first generating unit configured to generate, based on the access address and the respective service applications, link tracking identifiers corresponding to the access identifiers and the respective service applications, wherein the link tracking identifiers are used for characterizing association relationships between the access address and the respective service applications;
a second obtaining unit, configured to obtain application information of the service applications from a database based on the link tracking identifier, wherein the application information comprises interface information and execution code, and the database is obtained by updating in advance by using the link tracking identifier;
a second generating unit, configured to generate, based on the user identifier and interface information of each service application, an authority resource code of each service application corresponding to each interface information, where the authority resource code is used to represent resource information used for performing authority check on the request;
and the data identification unit is configured to identify the authority resource codes of each service application to obtain each piece of sensitive data represented by each authority resource code and a data type corresponding to each piece of sensitive data, wherein the identification is used for comparing the execution codes of each service application with classification data aiming at the sensitive data in the metadata of the database.
9. The apparatus of claim 8, wherein the database update process is accomplished by:
a generating module configured to bind the link tracking identifier with the acquired application information corresponding to each service application, and generate interface information corresponding to each bound service application and an execution code corresponding to each bound service application;
an updating module configured to update the database based on the bound interface information corresponding to each service application and the bound execution code corresponding to each service application.
10. The apparatus of claim 9, wherein the generating module is further configured to bind the link tracking identifier with the application information corresponding to the respective service application based on a link tracking technology, wherein the link tracking technology uses an instrumentation (tracking-point) technique to insert tracking points at a corresponding location of the application information corresponding to the respective service application.
11. The apparatus of claim 8, wherein the data identification unit comprises:
the extraction module is configured to extract the execution codes of the service applications to obtain feature data sets corresponding to the execution codes of the service applications;
and the identification module is configured to identify the authority resource codes of the service applications to obtain each piece of sensitive data represented by each authority resource code and a data type corresponding to each piece of sensitive data, wherein the identification is used for comparing the feature data in each feature data set with the classification data of the database metadata for the sensitive data.
12. The apparatus of claim 8, wherein the data recognition unit is further configured to input the authority resource codes of the respective service applications into a trained data recognition model, and generate respective sensitive data characterized by the authority resource codes and a data type corresponding to the respective sensitive data, wherein the data recognition model is used for characterizing whether sensitive data and the data type of the sensitive data exist in the data characterized by the authority resource codes.
13. The apparatus of claim 8, further comprising:
a sending unit configured to send the link trace identifier to the client.
14. The apparatus of claim 8, further comprising:
and the optimization unit is configured to optimize the permission examination strategy based on the correlation between each piece of sensitive data represented by each permission resource code and the permission examination.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110180902.8A CN113779616B (en) | 2021-02-08 | 2021-02-08 | Method and device for identifying data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110180902.8A CN113779616B (en) | 2021-02-08 | 2021-02-08 | Method and device for identifying data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113779616A (en) | 2021-12-10 |
CN113779616B (en) | 2024-04-05 |
Family
ID=78835697
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110180902.8A Active CN113779616B (en) | 2021-02-08 | 2021-02-08 | Method and device for identifying data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113779616B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114726596A (en) * | 2022-03-25 | 2022-07-08 | 北京沃东天骏信息技术有限公司 | Sensitive data processing method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103078859A (en) * | 2012-12-31 | 2013-05-01 | 普天新能源有限责任公司 | Service system authority management method, equipment and system |
CN110602046A (en) * | 2019-08-13 | 2019-12-20 | 上海陆家嘴国际金融资产交易市场股份有限公司 | Data monitoring processing method and device, computer equipment and storage medium |
US10515212B1 (en) * | 2016-06-22 | 2019-12-24 | Amazon Technologies, Inc. | Tracking sensitive data in a distributed computing environment |
CN111367983A (en) * | 2020-03-10 | 2020-07-03 | 中国联合网络通信集团有限公司 | Database access method, system, device and storage medium |
Non-Patent Citations (1)
Title |
---|
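Chen Shifeng (陈士沣); Liao Tai'an (廖泰安): "Material information service system based on a dynamic tracking mechanism" (基于动态追踪机制的物资信息服务系统), Computer Engineering and Design (计算机工程与设计), no. 03, 16 March 2011 (2011-03-16) * |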
Also Published As
Publication number | Publication date |
---|---|
CN113779616B (en) | 2024-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20180349257A1 (en) | Systems and methods for test prediction in continuous integration environments | |
CN110543506B (en) | Data analysis method and device, electronic equipment and storage medium | |
US11609804B2 (en) | Flexible event ingestion framework in an event processing system | |
CN113867913B (en) | Micro-service-oriented business request processing method, device, equipment and storage medium | |
CN112104734B (en) | Method, device, equipment and storage medium for pushing information | |
CN111639027B (en) | Test method and device and electronic equipment | |
CN111752843A (en) | Method, device, electronic equipment and readable storage medium for determining influence surface | |
CN111582477A (en) | Training method and device of neural network model | |
CN110619002A (en) | Data processing method, device and storage medium | |
CN113342946B (en) | Model training method and device for customer service robot, electronic equipment and medium | |
CN111930346B (en) | Artificial intelligence information processing method and device, electronic equipment and storage medium | |
CN112487973A (en) | User image recognition model updating method and device | |
CN111783427B (en) | Method, device, equipment and storage medium for training model and outputting information | |
CN111581518A (en) | Information pushing method and device | |
WO2022100075A1 (en) | Method and apparatus for performance test, electronic device and computer-readable medium | |
CN113779616A (en) | Method and apparatus for identifying data | |
CN111753330B (en) | Determination method, apparatus, device and readable storage medium for data leakage main body | |
CN111832070B (en) | Data masking method, device, electronic equipment and storage medium | |
CN112382292A (en) | Voice-based control method and device | |
US20210168105A1 (en) | Message normalization engine for multi-cloud messaging systems | |
US11838294B2 (en) | Method for identifying user, storage medium, and electronic device | |
CN111177352B (en) | Complaint information processing method and device, electronic equipment and readable storage medium | |
CN113903033A (en) | Identification and processing method and device and intelligent detection system | |
CN112822302B (en) | Data normalization method and device, electronic equipment and storage medium | |
CN112307372A (en) | Data processing method and device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |