CN114679301B - Method and system for accessing data of data lake by utilizing safe sandbox - Google Patents


Info

Publication number
CN114679301B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210195287.2A
Other languages
Chinese (zh)
Other versions
CN114679301A (en
Inventor
谢少飞
董晓斌
喻波
王志海
安鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wondersoft Technology Co Ltd
Original Assignee
Beijing Wondersoft Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wondersoft Technology Co Ltd
Priority to CN202210195287.2A
Publication of CN114679301A
Application granted
Publication of CN114679301B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/53 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0807 Network architectures or network communication protocols for network security for authentication of entities using tickets, e.g. Kerberos

Abstract

The application provides a method for accessing data in a data lake by using a secure sandbox. The secure sandbox provides secure storage of, and secure access to, the data in the data lake. It comprises an access agent module, a data storage module, a rights management module, an authorization tracking module and an access audit module, wherein the access agent module comprises a request authentication module, a parameter control module and a data receipt module, and the data storage module comprises a format conversion module, a data landing module and a data interface module.

Description

Method and system for accessing data of data lake by utilizing safe sandbox
Technical Field
The application belongs to the field of data storage and access, and particularly relates to a method and a system for accessing data in a data lake by using a secure sandbox.
Background
A data lake stores information from multiple data sources and can hold both structured and unstructured data. The data stored in a data lake, in particular unstructured data such as text files, pictures, video and audio, is stored in an easily readable form. If the security of one repository is compromised, an unknown party may gain access to the entire data lake. The data in the database is accessed through an interface and delivered to the requester. If the corresponding access agent does not control the data acquisition mechanism in this scenario, an illegitimate access request may obtain all of the data.
Disclosure of Invention
In order to solve the above technical problems, the application provides a scheme for accessing data in a data lake by using a secure sandbox.
The first aspect of the application discloses a method for accessing data in a data lake by using a secure sandbox. The secure sandbox provides secure storage of, and secure access to, the data in the data lake. It comprises an access agent module, a data storage module, a rights management module, an authorization tracking module and an access audit module, wherein the access agent module comprises a request authentication module, a parameter control module and a data receipt module, and the data storage module comprises a format conversion module, a data landing module and a data interface module. The method specifically comprises the following steps:
when storing data to the data lake:
after the secure sandbox receives a storage request sent by a requester, the request authentication module in the access agent module performs authentication processing on the storage request to determine whether the storage request is legitimate;
if yes, the storage request is sent to the data storage module to obtain the data to be stored, the format conversion module in the data storage module performs format conversion on the data to be stored, and the data landing module in the data storage module stores the converted data to the data lake;
if not, the storage of the data to be stored is prohibited, and a message that authentication does not pass is returned to the requester;
when data is acquired from the data lake:
after the secure sandbox receives the acquisition request sent by the requester, the request authentication module in the access agent module executes authentication processing on the acquisition request to judge whether the acquisition request is legal or not;
if yes, the acquisition request is sent to the parameter control module to acquire a data range which can be accessed by the requester in the data lake, the data receipt module acquires the data which is requested by the requester and is in the data range through the data interface module in the data storage module, and the acquired data is sent back to the requester;
if not, prohibiting the requester from accessing the data in the data lake, and returning a message that authentication is not passed to the requester.
According to the method of the first aspect of the present application, the authentication processing performed by the request authentication module on the storage request specifically includes:
checking whether a storage application identifier is present in the storage request:
if not, returning a message that the authentication does not pass;
if yes, checking the storage application identifier to determine whether the storage application is an allowed application:
if not, returning a message that the authentication does not pass;
if yes, further checking, by means of the storage application identifier, whether the token information in the storage request is legitimate:
if not, returning a message that the authentication does not pass;
if yes, passing the authentication processing.
According to the method of the first aspect of the present application, the format conversion performed by the format conversion module on the data to be stored specifically includes:
invoking a structured data component and an unstructured data component in the format conversion module;
when the data to be stored is structured data, converting the data to be stored, by means of the structured data component, into the structured format defined by that component and storing it in encrypted form, while also recording the storage application identifier and the time and path of the encrypted storage;
when the data to be stored is unstructured data, converting the data to be stored, by means of the unstructured data component, into the unstructured format defined by that component and storing it in encrypted form, while also recording the storage application identifier and the time and path of the encrypted storage.
According to the method of the first aspect of the present application, the acquisition request includes a request application identifier, an authentication token, a data type to be acquired and a data retrieval condition, wherein:
the request application identifier and the authentication token are used for authentication processing of the acquisition request, and the request application identifier is also used for determining the data range which the requester can access in the data lake;
the data type to be acquired is used for acquiring structured data or unstructured data corresponding to the data type from the data lake;
the data retrieval conditions are used for carrying out conditional query on the data to be acquired so as to acquire the data corresponding to the query result.
According to the method of the first aspect of the application, after obtaining the data range accessible to the requester in the data lake, the parameter control module determines whether the data to be acquired by the requester falls within the accessible data range and sends the determination result to the data receipt module; wherein:
when the determination result is yes, the corresponding data is extracted from the data lake through the data interface module, converted by format conversion from the format conforming to the storage rules of the secure sandbox back into its original format, and returned to the requester by the data receipt module as the acquired data;
when the determination result is no, the data receipt module sends the requester a message indicating that the request exceeds the accessible range.
In a second aspect, the application discloses a system for accessing data in a data lake using a secure sandbox. The secure sandbox provides secure storage of, and secure access to, the data in the data lake. It comprises an access agent module, a data storage module, a rights management module, an authorization tracking module and an access audit module, wherein the access agent module comprises a request authentication module, a parameter control module and a data receipt module, and the data storage module comprises a format conversion module, a data landing module and a data interface module. The system specifically comprises:
a first processing unit configured to, when storing data to the data lake:
after detecting that the secure sandbox receives a storage request sent by a requester, call the request authentication module in the access agent module to perform authentication processing on the storage request to determine whether the storage request is legitimate;
if yes, call the request authentication module to send the storage request to the data storage module to obtain the data to be stored, call the format conversion module in the data storage module to convert the format of the data to be stored, and call the data landing module in the data storage module to store the converted data to the data lake;
if not, prohibit the storage of the data to be stored, and call the request authentication module to return a message that authentication does not pass to the requester;
a second processing unit configured to, when data is acquired from the data lake:
after detecting that the secure sandbox receives an acquisition request sent by the requester, invoking the request authentication module in the access agent module to execute authentication processing on the acquisition request so as to judge whether the acquisition request is legal or not;
if yes, the request authentication module is called to send the acquisition request to the parameter control module so as to acquire a data range which can be accessed by the requester in the data lake, the data receipt module is called to acquire the data which is requested by the requester and is in the data range through the data interface module in the data storage module, and the acquired data is sent back to the requester;
if not, prohibiting the requester from accessing the data in the data lake, and calling the request authentication module to return a message that authentication is not passed to the requester.
According to the system of the second aspect of the present application, the authentication processing performed by the request authentication module on the storage request specifically includes:
checking whether a storage application identifier is present in the storage request:
if not, returning a message that the authentication does not pass;
if yes, checking the storage application identifier to determine whether the storage application is an allowed application:
if not, returning a message that the authentication does not pass;
if yes, further checking, by means of the storage application identifier, whether the token information in the storage request is legitimate:
if not, returning a message that the authentication does not pass;
if yes, passing the authentication processing.
According to the system of the second aspect of the present application, the format conversion performed by the format conversion module on the data to be stored specifically includes:
invoking a structured data component and an unstructured data component in the format conversion module;
when the data to be stored is structured data, converting the data to be stored, by means of the structured data component, into the structured format defined by that component and storing it in encrypted form, while also recording the storage application identifier and the time and path of the encrypted storage;
when the data to be stored is unstructured data, converting the data to be stored, by means of the unstructured data component, into the unstructured format defined by that component and storing it in encrypted form, while also recording the storage application identifier and the time and path of the encrypted storage.
According to the system of the second aspect of the present application, the acquisition request includes a request application identifier, an authentication token, a data type to be acquired, and a data retrieval condition, wherein:
the request application identifier and the authentication token are used for authentication processing of the acquisition request, and the request application identifier is also used for determining the data range which the requester can access in the data lake;
the data type to be acquired is used for acquiring structured data or unstructured data corresponding to the data type from the data lake;
the data retrieval conditions are used for carrying out conditional query on the data to be acquired so as to acquire the data corresponding to the query result.
According to the system of the second aspect of the present application, the second processing unit is specifically configured to:
after the parameter control module obtains the accessible data range of the requester in the data lake, judging whether the data to be obtained by the requester is in the accessible data range or not, and sending a judging result to the data receipt module; wherein:
when the determination result is yes, the corresponding data is extracted from the data lake through the data interface module, converted by format conversion from the format conforming to the storage rules of the secure sandbox back into its original format, and returned to the requester by the data receipt module as the acquired data;
when the determination result is no, the data receipt module sends the requester a message indicating that the request exceeds the accessible range.
A third aspect of the application discloses an electronic device. The electronic device comprises a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the method for accessing data in a data lake using a secure sandbox according to any one of the first aspects of the present disclosure.
A fourth aspect of the application discloses a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method for accessing data in a data lake using a secure sandbox according to any one of the first aspects of the present disclosure.
In summary, the technical scheme provided by the application ensures the security of the data stored in the data lake by combining a sandbox with an access agent. Specifically, the sandbox controls the file form in which the data is finally landed and stored, the access agent checks and authenticates data requests, and together they provide an overall security solution for the data in the data lake.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings which are required in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are some embodiments of the application and that other drawings may be obtained from these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a secure sandbox in accordance with an embodiment of the present application;
FIG. 2 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The first aspect of the application discloses a method for accessing data in a data lake by using a secure sandbox. FIG. 1 is a schematic diagram of a secure sandbox in accordance with an embodiment of the present application. As shown in FIG. 1, the secure sandbox provides secure storage of, and secure access to, the data in the data lake. It comprises an access agent module, a data storage module, a rights management module, an authorization tracking module and an access audit module, wherein the access agent module comprises a request authentication module, a parameter control module and a data receipt module, and the data storage module comprises a format conversion module, a data landing module and a data interface module.
The method specifically comprises the following steps:
when storing data to the data lake:
after the secure sandbox receives a storage request sent by a requester, the request authentication module in the access agent module performs authentication processing on the storage request to determine whether the storage request is legitimate;
if yes, the storage request is sent to the data storage module to obtain the data to be stored, the format conversion module in the data storage module performs format conversion on the data to be stored, and the data landing module in the data storage module stores the converted data to the data lake;
if not, the storage of the data to be stored is prohibited, and a message that authentication does not pass is returned to the requester.
The method further comprises the following steps:
when data is acquired from the data lake:
after the secure sandbox receives the acquisition request sent by the requester, the request authentication module in the access agent module executes authentication processing on the acquisition request to judge whether the acquisition request is legal or not;
if yes, the acquisition request is sent to the parameter control module to acquire a data range which can be accessed by the requester in the data lake, the data receipt module acquires the data which is requested by the requester and is in the data range through the data interface module in the data storage module, and the acquired data is sent back to the requester;
if not, prohibiting the requester from accessing the data in the data lake, and returning a message that authentication is not passed to the requester.
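The storage and acquisition flows described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `SecureSandbox`, the request fields `app_id`, `token`, `key` and `data`, and the four injected callables are all assumptions introduced here to stand in for the request authentication, format conversion and parameter control modules, and a plain dictionary stands in for the data lake.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class SecureSandbox:
    """Toy wiring of the modules; every collaborator is a hypothetical stand-in."""
    authenticate: Callable[[dict], bool]      # request authentication module
    to_sandbox: Callable[[bytes], bytes]      # format conversion on the store path
    from_sandbox: Callable[[bytes], bytes]    # format conversion on the return path
    scope_of: Callable[[str], Set[str]]       # parameter control module
    lake: Dict[str, bytes] = field(default_factory=dict)  # stand-in data lake

    def store(self, request: dict) -> dict:
        # Authentication comes first; a failed check stops the storage process.
        if not self.authenticate(request):
            return {"ok": False, "message": "authentication failed"}
        # Data landing: only the converted form is ever written to the lake.
        self.lake[request["key"]] = self.to_sandbox(request["data"])
        return {"ok": True, "message": "stored"}

    def acquire(self, request: dict) -> dict:
        if not self.authenticate(request):
            return {"ok": False, "message": "authentication failed"}
        key = request["key"]
        # Only data inside the requester's accessible range may be returned.
        if key not in self.scope_of(request["app_id"]):
            return {"ok": False, "message": "outside accessible range"}
        if key not in self.lake:
            return {"ok": False, "message": "not found"}
        # Data receipt: convert back to the original format before returning.
        return {"ok": True, "data": self.from_sandbox(self.lake[key])}
```

Passing the modules in as callables keeps the sketch independent of any particular authentication or conversion scheme; a real deployment would replace each of them with the corresponding module.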
In some embodiments, the authentication processing performed by the request authentication module on the storage request specifically includes:
checking whether a storage application identifier is present in the storage request:
if not, returning a message that the authentication does not pass;
if yes, checking the storage application identifier to determine whether the storage application is an allowed application:
if not, returning a message that the authentication does not pass;
if yes, further checking, by means of the storage application identifier, whether the token information in the storage request is legitimate:
if not, returning a message that the authentication does not pass;
if yes, passing the authentication processing.
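The three checks above (identifier present, application allowed, token valid for that identifier) could be sketched as below. The registry layout, the field names `app_id` and `token`, and the returned reason strings are assumptions made for illustration; the source only specifies that a failed check returns a message that authentication does not pass.

```python
import hmac
from typing import Mapping, Tuple

# Hypothetical registry shape: application identifier -> (enabled flag, issued token).
Registry = Mapping[str, Tuple[bool, str]]


def authenticate(request: Mapping[str, str], registry: Registry) -> Tuple[bool, str]:
    """Return (passed, reason); the reason string is illustrative only."""
    # Step 1: the application identifier must be present in the request.
    app_id = request.get("app_id")
    if not app_id:
        return False, "missing identifier"

    # Step 2: the identifier must belong to a registered, enabled application.
    entry = registry.get(app_id)
    if entry is None or not entry[0]:
        return False, "application not allowed"

    # Step 3: the token must match the one issued for this identifier.
    # compare_digest avoids leaking the match position through timing.
    presented = (request.get("token") or "").encode("utf-8")
    if not hmac.compare_digest(presented, entry[1].encode("utf-8")):
        return False, "invalid token"

    return True, "ok"
```

The checks are ordered from cheapest to most specific, so a request that lacks an identifier is rejected before any registry or token lookup takes place.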
In some embodiments, the format conversion performed by the format conversion module on the data to be stored specifically includes:
invoking a structured data component and an unstructured data component in the format conversion module;
when the data to be stored is structured data, converting the data to be stored, by means of the structured data component, into the structured format defined by that component and storing it in encrypted form, while also recording the storage application identifier and the time and path of the encrypted storage;
when the data to be stored is unstructured data, converting the data to be stored, by means of the unstructured data component, into the unstructured format defined by that component and storing it in encrypted form, while also recording the storage application identifier and the time and path of the encrypted storage.
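As a rough illustration of this step, the sketch below routes data to a structured or unstructured branch, encrypts the result, and records the application identifier, storage time and storage path alongside it. Everything named here is an assumption: the source does not specify the sandbox format, the cipher, or the path layout. In particular, `toy_cipher` is a reversible XOR placeholder that only makes the stored bytes unreadable at a glance; it is not real encryption and would be replaced by a vetted cipher.

```python
import json
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Union


def toy_cipher(data: bytes, key: bytes = b"demo-key") -> bytes:
    """XOR placeholder (its own inverse). NOT secure; a stand-in for a real cipher."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


@dataclass
class StoredRecord:
    app_id: str        # storage application identifier
    stored_at: float   # time of the encrypted storage
    path: str          # path of the encrypted storage
    payload: bytes     # converted and encrypted data


def convert_and_store(
    app_id: str,
    data: Union[dict, list, bytes],
    lake: Dict[str, StoredRecord],
    encrypt: Callable[[bytes], bytes] = toy_cipher,
) -> StoredRecord:
    if isinstance(data, (dict, list)):
        # Structured branch: canonical JSON is an assumed sandbox format.
        kind, body = "structured", json.dumps(data, sort_keys=True).encode("utf-8")
    else:
        # Unstructured branch: raw bytes are kept as-is before encryption.
        kind, body = "unstructured", bytes(data)
    path = f"{kind}/{app_id}/{uuid.uuid4().hex}"  # hypothetical path layout
    record = StoredRecord(app_id, time.time(), path, encrypt(body))
    lake[path] = record
    return record
```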
In some embodiments, the acquisition request includes a request application identifier, an authentication token, a data type to be acquired, and a data retrieval condition, wherein:
the request application identifier and the authentication token are used for authentication processing of the acquisition request, and the request application identifier is also used for determining the data range which the requester can access in the data lake;
the data type to be acquired is used for acquiring structured data or unstructured data corresponding to the data type from the data lake;
the data retrieval conditions are used for carrying out conditional query on the data to be acquired so as to acquire the data corresponding to the query result.
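One way to picture the four fields of the acquisition request is as a small typed record with basic validation. The field names, the two allowed type strings and the dictionary form of the retrieval conditions are assumptions for illustration; the source names the fields but not their encoding.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping

VALID_TYPES = {"structured", "unstructured"}  # assumed encoding of the data type


@dataclass(frozen=True)
class AcquisitionRequest:
    app_id: str                   # request application identifier
    token: str                    # authentication token
    data_type: str                # data type to be acquired
    conditions: Dict[str, Any] = field(default_factory=dict)  # retrieval conditions


def parse_acquisition_request(raw: Mapping[str, Any]) -> AcquisitionRequest:
    """Validate a raw mapping and build a request; raises ValueError if malformed."""
    missing = [k for k in ("app_id", "token", "data_type") if not raw.get(k)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if raw["data_type"] not in VALID_TYPES:
        raise ValueError("unknown data type: " + str(raw["data_type"]))
    return AcquisitionRequest(
        app_id=raw["app_id"],
        token=raw["token"],
        data_type=raw["data_type"],
        conditions=dict(raw.get("conditions") or {}),
    )
```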
In some embodiments, after obtaining the data range accessible to the requester in the data lake, the parameter control module determines whether the data to be acquired by the requester falls within the accessible data range and sends the determination result to the data receipt module; wherein:
when the determination result is yes, the corresponding data is extracted from the data lake through the data interface module, converted by format conversion from the format conforming to the storage rules of the secure sandbox back into its original format, and returned to the requester by the data receipt module as the acquired data;
when the determination result is no, the data receipt module sends the requester a message indicating that the request exceeds the accessible range.
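The range check and the conditional query can be sketched together as below. The scope table (application identifier mapped to a set of dataset names), the row-oriented lake and the equality-only conditions are simplifying assumptions; the source does not define how an accessible range or a retrieval condition is represented, and the conversion back to the original format is omitted here.

```python
from typing import Any, Dict, List, Mapping, Set


def fetch_in_range(
    app_id: str,
    dataset: str,
    conditions: Mapping[str, Any],
    scopes: Mapping[str, Set[str]],
    lake: Mapping[str, List[Dict[str, Any]]],
) -> Dict[str, Any]:
    # Parameter control: look up the range this application may access.
    if dataset not in scopes.get(app_id, set()):
        # Data receipt: report that the request exceeds the accessible range.
        return {"ok": False, "message": "request exceeds the accessible range"}

    # Data interface: conditional query, here a simple equality filter per field.
    rows = [
        row
        for row in lake.get(dataset, [])
        if all(row.get(name) == value for name, value in conditions.items())
    ]
    return {"ok": True, "data": rows}
```

An application with no entry in the scope table gets an empty range, so unknown requesters are denied by default.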
Specifically, after a storage request reaches the secure sandbox, the request is first checked to determine whether it is legitimate. If authentication fails, storage of the data is prohibited. If authentication succeeds, the request is forwarded to format conversion, where the data is converted according to the sandbox data format. After format conversion, the data is stored in the sandbox format, which is not directly readable, so that data leakage is avoided.
Specifically, after a data request reaches the secure sandbox, the request is authenticated to determine whether it has permission to request the data. If it has no permission, the request is prohibited from accessing the data. If it has permission, the parameters of the requested data are also checked to determine the range of data that the current request may access, and only data within the permitted range may be accessed. The unified interface of the stored data is accessed through the data receipt interface, and the data is returned according to the data standard of the sandbox.
Specifically, the secure sandbox also provides rights management, authorization tracking, access audit and related functions; an administrator can control data storage and request permissions, and the operations are recorded in detail. These functions are as follows:
(1) Rights management module:
The administrator registers data storage applications and data acquisition applications in the system and configures basic information such as application name, identifier and vendor. The administrator generates a unique application identifier for each application. The administrator enables an application, which allows it to send the corresponding requests to the system. If an application needs to be deregistered, the administrator deregisters it in application management, after which no data request from that application is accepted.
(2) Authorization tracking module:
The administrator issues an authentication token for each enabled application. The administrator configures, for each data acquirer, the request scope over the currently available data (including both unstructured and structured data).
(3) Access audit module:
After a data request arrives at the system, the system records the identifier of the requesting application and the request time. The system determines whether the request is a storage request or an acquisition request: for a storage request, the size of the data packet is recorded; for an acquisition request, the request parameters are recorded. The system associates the basic information of the application through the application identifier and persists the record. After logging in, the administrator can audit storage and access requests in detail.
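The three administrative functions could be modelled, under stated assumptions, by a single in-memory helper. The class names `Application` and `SandboxAdmin`, the method names, and the audit entry layout are hypothetical; the source describes the behavior (register, enable, deregister, issue token, configure scope, record requests) but no concrete interface, and a real system would persist these records rather than hold them in memory.

```python
import secrets
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Application:
    name: str
    vendor: str
    app_id: str = field(default_factory=lambda: uuid.uuid4().hex)  # unique identifier
    enabled: bool = False
    token: Optional[str] = None
    scope: Set[str] = field(default_factory=set)


class SandboxAdmin:
    def __init__(self) -> None:
        self.apps: Dict[str, Application] = {}
        self.audit_log: List[dict] = []

    # (1) Rights management: register, enable and deregister applications.
    def register(self, name: str, vendor: str) -> Application:
        app = Application(name=name, vendor=vendor)
        self.apps[app.app_id] = app
        return app

    def enable(self, app_id: str) -> None:
        self.apps[app_id].enabled = True

    def deregister(self, app_id: str) -> None:
        app = self.apps[app_id]
        app.enabled, app.token = False, None  # later requests can no longer authenticate

    # (2) Authorization tracking: issue a token and configure the request scope.
    def issue_token(self, app_id: str, scope: Set[str]) -> str:
        app = self.apps[app_id]
        if not app.enabled:
            raise PermissionError("application is not enabled")
        app.token, app.scope = secrets.token_urlsafe(32), set(scope)
        return app.token

    # (3) Access audit: record who asked, when, and the size or the parameters.
    def record(self, app_id: str, kind: str, *, size: Optional[int] = None,
               params: Optional[dict] = None) -> None:
        entry = {"app_id": app_id, "app_name": self.apps[app_id].name,
                 "time": time.time(), "kind": kind}
        if kind == "store":
            entry["size"] = size        # storage request: packet size
        else:
            entry["params"] = params    # acquisition request: request parameters
        self.audit_log.append(entry)
```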
In a second aspect, the application discloses a system for accessing data in a data lake using a secure sandbox. The secure sandbox provides secure storage of, and secure access to, the data in the data lake. It comprises an access agent module, a data storage module, a rights management module, an authorization tracking module and an access audit module, wherein the access agent module comprises a request authentication module, a parameter control module and a data receipt module, and the data storage module comprises a format conversion module, a data landing module and a data interface module.
The system specifically comprises:
a first processing unit configured to, when storing data to the data lake:
after detecting that the secure sandbox receives a storage request sent by a requester, call the request authentication module in the access agent module to perform authentication processing on the storage request to determine whether the storage request is legitimate;
if yes, call the request authentication module to send the storage request to the data storage module to obtain the data to be stored, call the format conversion module in the data storage module to convert the format of the data to be stored, and call the data landing module in the data storage module to store the converted data to the data lake;
if not, prohibit the storage of the data to be stored, and call the request authentication module to return a message that authentication does not pass to the requester;
a second processing unit configured to, when data is acquired from the data lake:
after detecting that the secure sandbox receives an acquisition request sent by the requester, invoking the request authentication module in the access agent module to execute authentication processing on the acquisition request so as to judge whether the acquisition request is legal or not;
if yes, the request authentication module is called to send the acquisition request to the parameter control module so as to acquire a data range which can be accessed by the requester in the data lake, the data receipt module is called to acquire the data which is requested by the requester and is in the data range through the data interface module in the data storage module, and the acquired data is sent back to the requester;
if not, prohibiting the requester from accessing the data in the data lake, and calling the request authentication module to return a message that authentication is not passed to the requester.
According to the system of the second aspect of the present application, the request authentication module performs authentication processing on the storage request, specifically including:
checking whether a storage application party identifier is present in the storage request:
if not, returning a message that the authentication does not pass;
if yes, checking the storage application party identifier, and judging whether the storage application party is an allowed application:
if not, returning a message that the authentication does not pass;
if yes, further checking, against the storage application party identifier, whether the token information in the storage request is legal:
if not, returning a message that the authentication does not pass;
if yes, passing the authentication processing.
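The three-stage check above can be sketched as below. The field names `app_id` and `token`, the `allowed_apps` mapping from each permitted identifier to its expected token, and the constant-time comparison are assumptions made for illustration; the text does not specify how token legality is established.

```python
import hmac
from typing import Mapping

AUTH_FAILED = "authentication does not pass"
AUTH_OK = "authentication passed"


def authenticate_storage_request(request: Mapping[str, str],
                                 allowed_apps: Mapping[str, str]) -> str:
    """Return AUTH_OK only if all three checks succeed, in order."""
    # 1. The storage application party identifier must be present.
    app_id = request.get("app_id")
    if not app_id:
        return AUTH_FAILED
    # 2. The identifier must belong to an allowed application.
    if app_id not in allowed_apps:
        return AUTH_FAILED
    # 3. The token must be legal for that identifier (assumed here to mean
    #    equality with a registered token, compared in constant time).
    token = request.get("token", "")
    expected = allowed_apps[app_id]
    if not hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8")):
        return AUTH_FAILED
    return AUTH_OK
```

Each failed stage returns the same message, so a caller cannot tell from the reply which check rejected the request.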
According to the system of the second aspect of the present application, the format conversion performed by the format conversion module on the data to be stored specifically includes:
invoking a structured data component and an unstructured data component in the format conversion module;
when the data to be stored is structured data, converting the data to be stored into a structured mode contained in the structured data component by utilizing the structured data component, and carrying out encrypted storage, while also storing the storage application party identifier and the time and path of the encrypted storage;
when the data to be stored is unstructured data, converting the data to be stored into an unstructured mode contained in the unstructured data component by utilizing the unstructured data component, and carrying out encrypted storage, while also storing the storage application party identifier and the time and path of the encrypted storage.
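A minimal sketch of this conversion step follows, under several assumptions that the text does not fix: structured data is modeled as a Python `dict` serialized to canonical JSON, unstructured data as raw bytes, the cipher is supplied by the caller as an `encrypt` callable, and the data lake is a dictionary keyed by storage path. The names `StorageRecord` and `convert_and_store` are hypothetical.

```python
import json
import time
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class StorageRecord:
    app_id: str        # storage application party identifier
    stored_at: float   # time of the encrypted storage (epoch seconds)
    path: str          # path of the encrypted storage
    kind: str          # "structured" or "unstructured"
    ciphertext: bytes


def convert_and_store(data: Union[dict, bytes], app_id: str, path: str,
                      encrypt: Callable[[bytes], bytes],
                      lake: dict) -> StorageRecord:
    if isinstance(data, dict):
        # Structured data component: canonical JSON is an assumed stand-in
        # for the structured mode mentioned in the text.
        kind = "structured"
        normalized = json.dumps(data, sort_keys=True).encode("utf-8")
    else:
        # Unstructured data component: bytes are kept as-is in this sketch.
        kind = "unstructured"
        normalized = bytes(data)
    record = StorageRecord(app_id, time.time(), path, kind, encrypt(normalized))
    lake[path] = record  # identifier, time and path are stored with the data
    return record
```

In practice `encrypt` would wrap a vetted cipher from an established cryptography library; it is left as a parameter here so that the sketch does not imply any particular algorithm.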
According to the system of the second aspect of the present application, the acquisition request includes a requesting application party identifier, an authentication token, a data type to be acquired, and data retrieval conditions, wherein:
the requesting application party identifier and the authentication token are used for authentication processing of the acquisition request, and the requesting application party identifier is also used for determining a data range which can be accessed by the requester in the data lake;
the data type to be acquired is used for acquiring structured data or unstructured data corresponding to the data type from the data lake;
the data retrieval conditions are used for carrying out conditional query on the data to be acquired so as to acquire the data corresponding to the query result.
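The four fields of the acquisition request and their roles can be sketched as a small data structure. The class name `AcquisitionRequest`, the field names, and the equality-only matching in `select` are assumptions for illustration; the text does not define a query language for the data retrieval conditions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class AcquisitionRequest:
    app_id: str      # requesting application party identifier
    token: str       # authentication token
    data_type: str   # "structured" or "unstructured"
    conditions: Dict[str, Any] = field(default_factory=dict)  # retrieval conditions


def select(records: List[dict], request: AcquisitionRequest) -> List[dict]:
    """Filter by data type first, then apply the retrieval conditions."""
    return [
        r for r in records
        if r.get("type") == request.data_type
        and all(r.get(k) == v for k, v in request.conditions.items())
    ]
```

For example, a request with `data_type="structured"` and `conditions={"owner": "hr"}` would select only structured records whose `owner` field equals `"hr"`.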
According to the system of the second aspect of the present application, the second processing unit is specifically configured to:
after the parameter control module obtains the accessible data range of the requester in the data lake, judging whether the data to be obtained by the requester is in the accessible data range or not, and sending a judging result to the data receipt module; wherein:
when the judgment result is yes, extracting corresponding data from the data lake through the data interface module, converting the corresponding data from a format conforming to the storage rule of the secure sandbox into an original format through format conversion, and returning the corresponding data to the requester from the data receipt module as the acquired data;
and when the judging result is negative, the data receipt module sends a message exceeding the access range to the requesting party.
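The range judgment and the two outcomes above can be sketched as follows. The model is an assumption: the accessible range is a set of storage paths granted per requesting application party identifier, and the sandbox storage format is UTF-8 JSON bytes that are decoded back to the original object. The names `accessible_range`, `fetch`, and `grants` are hypothetical.

```python
import json
from typing import Dict, Set, Tuple

OUT_OF_RANGE = "requested data exceeds the access range"


def accessible_range(app_id: str, grants: Dict[str, Set[str]]) -> Set[str]:
    """Parameter control module: look up what this requester may read."""
    return grants.get(app_id, set())


def fetch(app_id: str, path: str,
          grants: Dict[str, Set[str]],
          lake: Dict[str, bytes]) -> Tuple[bool, object]:
    # Judgment result is negative: reply with an out-of-range message.
    if path not in accessible_range(app_id, grants):
        return False, OUT_OF_RANGE
    # Judgment result is yes: read through the data interface and restore
    # the original format. This sketch assumes every granted path exists.
    stored = lake[path]
    return True, json.loads(stored.decode("utf-8"))
```

A requester with no entry in `grants` gets an empty range, so every request from it is answered with the out-of-range message.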
A third aspect of the application discloses an electronic device. The electronic device comprises a memory and a processor; the memory stores a computer program, and the processor, when executing the computer program, implements the steps of the method for accessing data in a data lake using a secure sandbox according to any one of the first aspect of the present disclosure.
Fig. 2 is a block diagram of an electronic device according to an embodiment of the present application. As shown in Fig. 2, the electronic device includes a processor, a memory, a communication interface, a display screen, and an input device connected through a system bus. The processor of the electronic device provides computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the electronic device is used for wired or wireless communication with an external terminal; the wireless communication can be achieved through Wi-Fi, an operator network, Near Field Communication (NFC), or other technologies. The display screen of the electronic device can be a liquid crystal display screen or an electronic ink display screen. The input device of the electronic device can be a touch layer covering the display screen, a key, a trackball, or a touchpad arranged on the housing of the electronic device, or an external keyboard, touchpad, or mouse.
It will be appreciated by those skilled in the art that the structure shown in Fig. 2 is merely a block diagram of the portion related to the technical solution of the present disclosure and does not limit the electronic device to which the technical solution is applied; a specific electronic device may include more or fewer components than those shown in the drawing, may combine some components, or may have a different component arrangement.
A fourth aspect of the application discloses a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of a method for accessing data of a data lake using a secure sandbox as described in any of the first aspects of the present disclosure.
In summary, the technical scheme provided by the application ensures the security of the data stored in the data lake by combining a sandbox with an access agent. Specifically, the sandbox controls the file form in which the data is finally landed in the data lake, and the access agent checks and authenticates data requests; together they provide an overall security solution for the data of the data lake.
Note that the technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination involves no contradiction, it should be regarded as within the scope of this description. The above examples illustrate only a few embodiments of the application; they are described in detail but are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of protection of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (10)

1. A method for accessing data in a data lake by utilizing a secure sandbox, characterized in that the secure sandbox is used for realizing secure storage and secure access of the data in the data lake, and comprises an access agent module, a data storage module, a permission management module, an authorization tracking module and an access audit module, wherein the access agent module comprises a request authentication module, a parameter control module and a data receipt module, and the data storage module comprises a format conversion module, a data landing module and a data interface module; the method specifically comprises the following steps:
when storing data to the data lake:
after receiving a storage request sent by a requester, the security sandbox executes authentication processing on the storage request by the request authentication module in the access agent module so as to judge whether the storage request is legal or not;
if yes, the storage request is sent to the data storage module to acquire data to be stored, the format conversion module in the data storage module performs format conversion on the data to be stored, and the data to be stored after format conversion is stored to the data lake by the data landing module in the data storage module;
if not, prohibiting the storage process of the data to be stored, and returning a message that authentication does not pass to the requesting party;
when data is acquired from the data lake:
after the secure sandbox receives the acquisition request sent by the requester, the request authentication module in the access agent module executes authentication processing on the acquisition request to judge whether the acquisition request is legal or not;
if yes, the acquisition request is sent to the parameter control module to acquire a data range which can be accessed by the requester in the data lake, the data receipt module acquires the data which is requested by the requester and is in the data range through the data interface module in the data storage module, and the acquired data is sent back to the requester;
if not, prohibiting the requester from accessing the data in the data lake, and returning a message that authentication is not passed to the requester.
2. The method for accessing data in a data lake by using a secure sandbox according to claim 1, wherein the request authentication module performs an authentication process of the storage request, specifically comprising:
checking whether a storage application party identifier is present in the storage request:
if not, returning a message that the authentication does not pass;
if yes, checking the storage application party identifier, and judging whether the storage application party is an allowed application:
if not, returning a message that the authentication does not pass;
if yes, further checking, against the storage application party identifier, whether the token information in the storage request is legal:
if not, returning a message that the authentication does not pass;
if yes, passing the authentication processing.
3. The method for accessing data in a data lake by using a secure sandbox according to claim 2, wherein the format conversion performed by the format conversion module on the data to be stored specifically includes:
invoking a structured data component and an unstructured data component in the format conversion module;
when the data to be stored is structured data, converting the data to be stored into a structured mode contained in the structured data component by utilizing the structured data component, and carrying out encrypted storage, while also storing the storage application party identifier and the time and path of the encrypted storage;
when the data to be stored is unstructured data, converting the data to be stored into an unstructured mode contained in the unstructured data component by utilizing the unstructured data component, and carrying out encrypted storage, while also storing the storage application party identifier and the time and path of the encrypted storage.
4. The method for accessing data in a data lake by using a secure sandbox according to claim 3, wherein the acquisition request comprises a requesting application party identifier, an authentication token, a data type to be acquired, and data retrieval conditions, wherein:
the requesting application party identifier and the authentication token are used for authentication processing of the acquisition request, and the requesting application party identifier is also used for determining a data range which can be accessed by the requester in the data lake;
the data type to be acquired is used for acquiring structured data or unstructured data corresponding to the data type from the data lake;
the data retrieval conditions are used for carrying out conditional query on the data to be acquired so as to acquire the data corresponding to the query result.
5. The method for accessing data in a data lake by using a secure sandbox according to claim 4, wherein the parameter control module determines whether the data to be acquired by the requester is in the accessible data range after acquiring the accessible data range of the requester in the data lake, and sends the determination result to the data receipt module; wherein:
when the judgment result is yes, extracting corresponding data from the data lake through the data interface module, converting the corresponding data from a format conforming to the storage rule of the secure sandbox into an original format through format conversion, and returning the corresponding data to the requester from the data receipt module as the acquired data;
and when the judging result is negative, the data receipt module sends a message exceeding the access range to the requesting party.
6. A system for accessing data in a data lake by utilizing a secure sandbox, wherein the secure sandbox is used for realizing secure storage and secure access of the data in the data lake, and comprises an access agent module, a data storage module, a permission management module, an authorization tracking module and an access audit module, wherein the access agent module comprises a request authentication module, a parameter control module and a data receipt module, and the data storage module comprises a format conversion module, a data landing module and a data interface module; the system specifically comprises:
a first processing unit configured to, when storing data to the data lake:
after detecting that the security sandbox receives a storage request sent by a requester, calling the request authentication module in the access agent module to execute authentication processing on the storage request so as to judge whether the storage request is legal or not;
if yes, the request authentication module is called to send the storage request to the data storage module to acquire the data to be stored, the format conversion module in the data storage module is called to convert the format of the data to be stored, and the data landing module in the data storage module is called to store the format-converted data to the data lake;
if not, prohibiting a storage process of the data to be stored, and calling the request authentication module to return a message that authentication does not pass to the requesting party;
a second processing unit configured to, when data is acquired from the data lake:
after detecting that the secure sandbox receives an acquisition request sent by the requester, invoking the request authentication module in the access agent module to execute authentication processing on the acquisition request so as to judge whether the acquisition request is legal or not;
if yes, the request authentication module is called to send the acquisition request to the parameter control module so as to acquire a data range which can be accessed by the requester in the data lake, the data receipt module is called to acquire the data which is requested by the requester and is in the data range through the data interface module in the data storage module, and the acquired data is sent back to the requester;
if not, prohibiting the requester from accessing the data in the data lake, and calling the request authentication module to return a message that authentication is not passed to the requester.
7. The system for accessing data in a data lake using a secure sandbox of claim 6, wherein the acquisition request comprises a requesting application party identifier, an authentication token, a data type to be acquired, and data retrieval conditions, wherein:
the requesting application party identifier and the authentication token are used for authentication processing of the acquisition request, and the requesting application party identifier is also used for determining a data range which can be accessed by the requester in the data lake;
the data type to be acquired is used for acquiring structured data or unstructured data corresponding to the data type from the data lake;
the data retrieval conditions are used for carrying out conditional query on the data to be acquired so as to acquire the data corresponding to the query result.
8. The system for accessing data in a data lake using a secure sandbox of claim 7, wherein the second processing unit is specifically configured to:
after the parameter control module obtains the accessible data range of the requester in the data lake, judging whether the data to be obtained by the requester is in the accessible data range or not, and sending a judging result to the data receipt module; wherein:
when the judgment result is yes, extracting corresponding data from the data lake through the data interface module, converting the corresponding data from a format conforming to the storage rule of the secure sandbox into an original format through format conversion, and returning the corresponding data to the requester from the data receipt module as the acquired data;
and when the judging result is negative, the data receipt module sends a message exceeding the access range to the requesting party.
9. An electronic device comprising a memory storing a computer program and a processor implementing the steps of a method for accessing data in a data lake using a secure sandbox as claimed in any one of claims 1 to 5 when the computer program is executed by the processor.
10. A computer readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the steps of a method for accessing data of a data lake using a secure sandbox according to any one of claims 1 to 5.
CN202210195287.2A 2022-03-01 2022-03-01 Method and system for accessing data of data lake by utilizing safe sandbox Active CN114679301B (en)

Publications (2)

Publication Number Publication Date
CN114679301A CN114679301A (en) 2022-06-28
CN114679301B CN114679301B (en) 2023-10-20
