WO2020000716A1 - Big data analysis system, server, data processing method, program, and storage medium - Google Patents

Big data analysis system, server, data processing method, program, and storage medium

Publication number
WO2020000716A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
data processing
user
called
processing engine
Prior art date
Application number
PCT/CN2018/107487
Other languages
English (en)
Chinese (zh)
Inventor
蒋英明
冯朝阁
贺波
邓杰
唐浚洲
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020000716A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2141 Access rights, e.g. capability lists, access control lists, access tables, access matrices
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to the field of computer technology, and in particular, to a big data analysis system, a server, a data processing method, a program, and a storage medium.
  • a storage engine is used for distributed storage of data, which offers good scalability and high fault tolerance, and a calculation engine is used for parallel computing, which improves calculation speed and performance.
  • Each storage engine or computing engine provides a separate access portal.
  • accessing the entrances of multiple storage engines or computing engines one by one is often required, which is cumbersome.
  • the main purpose of this application is to provide a big data analysis system, server, data processing method, program and storage medium, which aim to solve the problem of how to build a big data analysis platform that integrates multiple computing engines and storage engines and provides users with a unified interactive portal.
  • the big data analysis system includes a client, a server, and multiple data processing engines.
  • the server is communicatively connected to the client and to each of the data processing engines, where:
  • the client is configured to send an operation request carrying preset type information to the server, and receive response result data for the operation request returned by the server;
  • the server includes a memory and a processor, and a data processing program is stored on the memory; when executed by the processor, the data processing program implements the following steps:
  • Receiving step: receiving an operation request sent by the client and carrying preset type information, where the preset type information includes at least one information fragment;
  • Reading step: reading the information fragment in the preset type information, and identifying whether the information fragment contains call identification information;
  • First execution step: when the call identification information is identified in an information fragment, sending the information fragment to the corresponding data processing engine to be called for data processing, and obtaining first response result data corresponding to the information fragment;
  • Second execution step: when the call identification information is not identified in an information fragment, performing an operation corresponding to the operation request based on the information fragment, and using the operation result as second response result data corresponding to the information fragment;
  • Feedback step: returning the first response result data and/or the second response result data corresponding to all information fragments in the preset type information to the client as the response result data corresponding to the operation request;
  • the data processing engine is configured to receive the information fragment sent by the server, perform data processing according to the information fragment, and return a data processing result to the server.
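The receiving, reading, execution, and feedback steps above can be sketched as a small dispatch loop. This is a minimal illustration under stated assumptions, not the patented implementation: the `Fragment` type, the `ENGINES` registry, the engine names, and `run_locally` are all hypothetical names introduced here, and each fragment is assumed to arrive with its call identification information already extracted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Fragment:
    """One information fragment of the preset type information (hypothetical type)."""
    body: str
    call_id: Optional[str] = None  # call identification information, if present


# Hypothetical registry mapping call identification information to engines.
# Each stand-in engine takes a fragment body and returns a processing result.
ENGINES: Dict[str, Callable[[str], str]] = {
    "spark": lambda body: f"spark ran: {body}",
    "hive": lambda body: f"hive ran: {body}",
}


def run_locally(operation: str, body: str) -> str:
    """Second execution step: the server performs the operation itself."""
    return f"server performed {operation}: {body}"


def handle_request(operation: str, fragments: List[Fragment]) -> List[str]:
    """Return one response result per fragment, in request order."""
    results: List[str] = []
    for fragment in fragments:
        if fragment.call_id is not None:
            # First execution step: forward the fragment to the engine to be called.
            # An unknown call_id raises KeyError in this sketch.
            engine = ENGINES[fragment.call_id]
            results.append(engine(fragment.body))
        else:
            results.append(run_locally(operation, fragment.body))
    # Feedback step: the collected results form the response to the client.
    return results
```

For example, `handle_request("query", [Fragment("select 1", "hive"), Fragment("report.csv")])` routes the first fragment to the stand-in Hive engine and handles the second on the server.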
  • the present application also proposes a data processing method, which includes:
  • Receiving step: receiving an operation request sent by a client and carrying preset type information, where the preset type information includes at least one information fragment;
  • Reading step: reading the information fragment in the preset type information, and identifying whether the information fragment contains call identification information;
  • First execution step: when the call identification information is identified in an information fragment, sending the information fragment to the corresponding data processing engine to be called for data processing, and obtaining first response result data corresponding to the information fragment;
  • Second execution step: when the call identification information is not identified in an information fragment, performing an operation corresponding to the operation request based on the information fragment, and using the operation result as second response result data corresponding to the information fragment;
  • Feedback step: returning the first response result data and/or the second response result data corresponding to all information fragments in the preset type information to the client as the response result data corresponding to the operation request.
  • the present application also proposes a server.
  • the server includes a memory and a processor, and the memory stores a data processing program.
  • Receiving step: receiving an operation request sent by the client and carrying preset type information, where the preset type information includes at least one information fragment;
  • Reading step: reading the information fragment in the preset type information, and identifying whether the information fragment contains call identification information;
  • First execution step: when the call identification information is identified in an information fragment, sending the information fragment to the corresponding data processing engine to be called for data processing, and obtaining first response result data corresponding to the information fragment;
  • Second execution step: when the call identification information is not identified in an information fragment, performing an operation corresponding to the operation request based on the information fragment, and using the operation result as second response result data corresponding to the information fragment;
  • Feedback step: returning the first response result data and/or the second response result data corresponding to all information fragments in the preset type information to the client as the response result data corresponding to the operation request.
  • the present application also proposes a data processing program, which includes: a receiving module, configured to receive an operation request sent by the client and carrying preset type information, where the preset type information includes at least one information fragment;
  • a reading module, configured to read an information fragment in the preset type information and identify whether the information fragment contains call identification information;
  • a first execution module, configured to: when the call identification information is identified in an information fragment, send the information fragment to the corresponding data processing engine to be called for data processing, and obtain first response result data corresponding to the information fragment;
  • a second execution module, configured to: when the call identification information is not identified in an information fragment, execute an operation corresponding to the operation request based on the information fragment, and use the operation result as second response result data corresponding to the information fragment;
  • a feedback module, configured to return the first response result data and/or the second response result data corresponding to all information fragments in the preset type information to the client as the response result data corresponding to the operation request.
  • the present application also proposes a computer-readable storage medium, where the computer-readable storage medium stores a data processing program, and the data processing program may be executed by at least one processor, so that the at least one processor performs the following steps:
  • Receiving step: receiving an operation request sent by the client and carrying preset type information, where the preset type information includes at least one information fragment;
  • Reading step: reading the information fragment in the preset type information, and identifying whether the information fragment contains call identification information;
  • First execution step: when the call identification information is identified in an information fragment, sending the information fragment to the corresponding data processing engine to be called for data processing, and obtaining first response result data corresponding to the information fragment;
  • Second execution step: when the call identification information is not identified in an information fragment, performing an operation corresponding to the operation request based on the information fragment, and using the operation result as second response result data corresponding to the information fragment;
  • Feedback step: returning the first response result data and/or the second response result data corresponding to all information fragments in the preset type information to the client as the response result data corresponding to the operation request.
  • the big data analysis system of the present application includes a client, a server, and multiple data processing engines.
  • the server is used to receive the operation request sent by the client; read the information fragment in the preset type information and identify whether the information fragment contains the call identification information; when the call identification information is identified, determine the data processing engine to be called that corresponds to the identified call identification information, send the information fragment corresponding to the call identification information to that data processing engine for data processing, and receive the data processing result returned by that engine as the first response result data corresponding to the information fragment; when the call identification information is not identified, perform the operation corresponding to the operation request based on the information fragment, and use the operation result as the second response result data corresponding to the information fragment; and return the response result data corresponding to the operation request to the client.
  • the big data analysis system of the present application integrates multiple data processing engines to provide users with a unified interaction portal. By logging in to the interaction portal through a client, users can call multiple data processing engines for big data analysis and processing, which simplifies user operations and improves the efficiency of big data analysis and processing.
  • FIG. 1 is a schematic diagram of an optional system architecture of the big data analysis system of the present application
  • FIG. 2 is a schematic diagram of an operating environment of the first, second, and third embodiments of a data processing program of this application;
  • FIG. 3 is a program module diagram of a first embodiment of a data processing program of this application.
  • FIG. 4 is a program module diagram of a second embodiment of a data processing program of this application.
  • FIG. 5 is a program module diagram of a third embodiment of a data processing program of the present application.
  • FIG. 6 is a schematic flowchart of a first embodiment of a data processing method of this application.
  • FIG. 7 is a schematic flowchart of a second embodiment of a data processing method of this application.
  • FIG. 8 is a schematic flowchart of a third embodiment of a data processing method of this application.
  • This application proposes a big data analysis system.
  • FIG. 1 is a schematic diagram of an optional system architecture of a big data analysis system of the present application.
  • the big data analysis system includes a server 1, a client 2, a plurality of data processing engines 3, and a file system 4. The server 1 and the client 2 are communicatively connected, and the server 1, the multiple data processing engines 3, and the file system 4 are all communicatively connected to each other, where:
  • the client 2 is used to provide a user operation interface for the user to initiate an operation request carrying preset type information through the operation interface.
  • the client 2 is further configured to send an operation request carrying preset type information to the server 1, and receive response result data for the operation request returned by the server 1;
  • the server 1 includes a memory and a processor, and a data processing program is stored on the memory. The server 1 is configured to receive an operation request sent by the client 2 and carrying preset type information, and to perform a corresponding operation for the operation request and/or call the data processing engine 3 for data processing, so as to generate response result data and return it to the client 2.
  • the data processing engine 3 is configured to receive the information segment sent by the server 1, perform data processing according to the information segment, and return a data processing result to the server 1.
  • the data processing engine 3 includes a calculation engine (for example, Shell, Spark, jdbc, Python, R, etc.) and a storage engine (for example, hive, hbase, RMDB, etc.), where:
  • the calculation engine is used to calculate and analyze the data.
  • the storage engine is used to query, read, and write data.
  • the file system 4 may be a local file system or a distributed file system (for example, HDFS).
  • the file system 4 is used for storing big data and provides data processing engine 3 with data for data processing.
  • the file system 4 may be provided in the data processing engine 3, or may be provided separately from the data processing engine 3 and the server 1.
  • This application proposes a data processing program.
  • FIG. 2 is a schematic diagram of an operating environment of the first, second, and third embodiments of the data processing program 10 of the present application.
  • the data processing program 10 is installed and run on the server 1.
  • the server 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a server.
  • the server 1 may include, but is not limited to, a memory 11, a processor 12, and a display 13.
  • FIG. 2 only shows the server 1 with components 11-13, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the memory 11 may be an internal storage unit of the server 1 in some embodiments, such as a hard disk or a memory of the server 1.
  • the memory 11 may also be an external storage device of the server 1 in other embodiments, for example, a distributed storage device. Further, the memory 11 may include both an internal storage unit of the server 1 and an external storage device.
  • the memory 11 is used to store application software installed on the server 1 and various types of data, such as program codes of the data processing program 10.
  • the memory 11 may also be used to temporarily store data that has been output or is to be output.
  • the processor 12 may be a central processing unit (CPU), a microprocessor, or another data processing chip in some embodiments, and is configured to run program code or process data stored in the memory 11, for example, to execute the data processing program 10.
  • the display 13 may be an LED display, a liquid crystal display, a touch-type liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like.
  • the display 13 is used to display information processed in the server 1 and to display a visualized user interface.
  • the components 11-13 of the server 1 communicate with each other through a program bus.
  • FIG. 3 is a program module diagram of the first embodiment of the data processing program 10 of the present application.
  • the data processing program 10 may be divided into one or more modules, and the one or more modules are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to implement this application.
  • the data processing program 10 may be divided into a receiving module 101, a reading module 102, a first execution module 103, a second execution module 104, and a feedback module 105.
  • the module referred to in this application refers to a series of computer program instruction segments capable of performing specific functions, which is more suitable than the program to describe the execution process of the data processing program 10 in the server 1, wherein:
  • the receiving module 101 is configured to receive an operation request sent by the client and carrying preset type information, where the preset type information includes at least one information fragment.
  • the operation request includes a data read operation request, a data write operation request, a data query operation request, a data sharing operation request, a data calculation analysis operation request, and the like.
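  • when the operation request is a data read operation request, the preset type information carried by the operation request includes information on the data to be read (for example, a file name).
  • when the operation request is a data write operation request, the preset type information carried by the operation request includes the data to be written.
  • when the operation request is a data query operation request, the preset type information carried by the operation request includes information on the data to be queried, query conditions, and the like.
  • when the operation request is a data sharing operation request, the preset type information carried by the operation request includes information on the data to be shared and the range of data sharing.
  • when the operation request is a data calculation analysis operation request, the preset type information carried by the operation request includes data calculation analysis code.
  • the preset type information includes at least one information fragment, and the method for dividing the preset type information into information fragments may be set according to the specific application scenario.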
  • the data processing program 10 further includes an identity verification module (not shown in the figure), which is used to:
  • verify the user identity information according to a predetermined identity verification rule, and feed the obtained verification result back to the client.
  • the above user identity information includes user identification information and user identity characteristic information, wherein the user identity characteristic information includes user name information, user password information (the user password information may be held on a storage medium such as a U-shield or an electronic certificate), a dynamic code, and the like.
  • the user identity characteristic information may further include at least one of user biometric information and identity document information.
  • the user biometric information includes fingerprint information, face information, iris information, voiceprint information and other biometric information used to uniquely identify the user.
  • the above identity document information includes an ID number, a passport number, and the like.
  • the above identity verification rule includes:
  • searching for the standard user identity characteristic information corresponding to the user identification information; verifying the user identity characteristic information in the identity information against the standard user identity characteristic information that was found; if the two are consistent, outputting a verification result of success; if they differ, outputting a verification result of failure.
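The verification rule above (look up the standard characteristic information for a user identifier, then compare it with what the user supplied) can be sketched as follows. This is a minimal illustration that assumes the characteristic information is a password; the `STANDARD_INFO` store, the salt, the iteration count, and the sample user are hypothetical, and the source does not say how standard information is stored.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical parameters for deriving a stored credential from a password.
SALT = b"example-salt"
ITERATIONS = 100_000


def derive(password: str) -> bytes:
    """Derive the stored form of a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), SALT, ITERATIONS)


# Hypothetical store: user identification information -> standard characteristic info.
STANDARD_INFO: Dict[str, bytes] = {
    "u001": derive("correct-password"),
}


def verify_identity(user_id: str, password: str) -> bool:
    """Return True for a successful verification, False for a failed one."""
    standard = STANDARD_INFO.get(user_id)
    if standard is None:
        # No standard information found for this identifier: verification fails.
        return False
    # Constant-time comparison of supplied and standard characteristic info.
    return hmac.compare_digest(standard, derive(password))
```

A real deployment would use a per-user random salt; the fixed salt here only keeps the sketch short.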
  • the reading module 102 is configured to read an information segment in the preset type information, and identify whether the information segment includes call identification information.
  • the call identification information includes identification information of a data processing engine to be called (for example, a tag of the data processing engine to be called).
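One way to picture the reading step is a parser that splits the submitted text into fragments and looks for an engine tag at the start of each. The sketch below assumes a purely hypothetical convention that is not stated in the source: fragments are separated by blank lines, and a fragment carrying call identification information begins with `%` followed by the engine name.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical tag syntax: "%<engine>" at the very start of a fragment.
TAG_PATTERN = re.compile(r"^%(\w+)[ \t]*\n?")


def split_fragments(text: str) -> List[str]:
    """Split preset type information into fragments on blank lines."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def identify_call(fragment: str) -> Tuple[Optional[str], str]:
    """Return (engine tag or None, fragment body without the tag)."""
    match = TAG_PATTERN.match(fragment)
    if match is None:
        return None, fragment
    return match.group(1), fragment[match.end():]
```

A fragment without a tag comes back with `None`, which corresponds to the case where no call identification information is identified.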
  • the first execution module 103 is configured to: when the call identification information is identified in an information fragment, send the information fragment to the corresponding data processing engine to be called for data processing, and obtain first response result data corresponding to the information fragment.
  • for example, the preset type information is data calculation analysis code, and the data calculation analysis code includes a plurality of data calculation analysis code fragments.
  • when the call identification information is identified in a data calculation analysis code fragment, the data processing engine corresponding to the identified call identification information is determined, according to a predetermined mapping relationship between call identification information and data processing engines, to be calculation engine A. The data calculation analysis code fragment corresponding to the call identification information is sent to calculation engine A, which parses the fragment, performs data calculation and analysis according to it (the data may be sourced from a memory of the calculation engine, such as a local file system or a distributed file system), and returns the result of the calculation and analysis to the first execution module 103.
  • as another example, the preset type information is data query code, and the data query code includes a plurality of data query code fragments.
  • when the call identification information is identified in a data query code fragment, the data processing engine corresponding to the identified call identification information is determined to be storage engine C. The data query code fragment corresponding to the call identification information is sent to storage engine C, which parses the fragment, performs the data query according to it, and returns the query result to the first execution module 103.
  • the second execution module 104 is configured to: when the call identification information is not identified in an information fragment, execute an operation corresponding to the operation request based on the information fragment, and use the operation result as second response result data corresponding to the information fragment.
  • the operation corresponding to the operation request is determined according to a predetermined mapping relationship between the operation request and the operation.
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: reading the data to be read according to data information to be read in the information segment.
  • the corresponding operation is writing data to a memory (for example, a file system).
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: writing data to be written in the information segment into a memory.
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: querying the data to be queried according to data information to be queried in the information segment, a query condition, and the like.
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: generating a sharing link path for the data to be shared according to the information on the data to be shared in the information segment (the sharing link path indicates the storage address of the data to be shared); then determining, according to the predetermined mapping relationship between sharing ranges and sharing interfaces, the sharing interface corresponding to the sharing range (for example, sharing to everyone or sharing to a user group) of the data to be shared in the preset type information; and adding the generated sharing link path to that sharing interface.
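The sharing operation described above (build a link that points at the storage address, pick the sharing interface for the requested range, and add the link to it) can be sketched as below. The link format, the range names, and the representation of a sharing interface as a list of links are all assumptions made for illustration.

```python
from typing import Dict, List
from urllib.parse import quote


def share_data(
    storage_address: str,
    share_range: str,
    interfaces: Dict[str, List[str]],
) -> str:
    """Generate a sharing link and add it to the interface for share_range."""
    if share_range not in interfaces:
        # No sharing interface is mapped to this range.
        raise ValueError(f"unknown sharing range: {share_range}")
    # Hypothetical link format that encodes the storage address of the data.
    link = f"/share?path={quote(storage_address, safe='')}"
    interfaces[share_range].append(link)
    return link
```

The `interfaces` dictionary stands in for the predetermined mapping between sharing ranges and sharing interfaces, for example `{"everyone": [], "user_group": []}`.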
  • the corresponding operation is to perform calculation and / or analysis on the data.
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: analyzing the data calculation analysis code segment in the information segment, and performing data calculation and / or analysis processing according to the calculation analysis code segment.
  • the feedback module 105 is configured to return the first response result data and / or the second response result data corresponding to all pieces of information in the preset type information to the client as response result data corresponding to the operation request.
  • the big data analysis system of the present application includes a client, a server, and multiple data processing engines.
  • the server is used to receive the operation request sent by the client; read the information fragment in the preset type information, and identify whether the information fragment contains the call identification information; when the call identification information is identified, send the information fragment to the corresponding data processing engine to be called for data processing, and obtain first response result data corresponding to the information fragment; when the call identification information is not identified, perform an operation corresponding to the operation request based on the information fragment, and use the operation result as second response result data corresponding to the information fragment; and return the response result data corresponding to the operation request to the client.
  • the big data analysis system of the present application integrates multiple data processing engines to provide users with a unified interaction portal. By logging in to the interaction portal through a client, users can call multiple data processing engines for big data analysis and processing, which simplifies user operations and improves the efficiency of big data analysis and processing.
  • the data processing program 10 further includes a review module (not shown in the figure), and the review module is configured to:
  • record user behavior data in real time, and save the user behavior data to a user behavior log.
  • obtain, from a log review instruction, the identification information of the user behavior log to be reviewed, and find the corresponding user behavior log according to the identification information.
  • the definition of the above abnormal behavior data may be set according to a specific application scenario.
  • the user behavior log is reviewed, so that the abnormal behavior of the user can be detected in time, and a prompt message is issued to further improve the security of the big data analysis system.
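The review module's behavior (record user actions to a log, look a log up by its identifier, and raise a prompt for abnormal entries) can be sketched as follows. The class and field names are hypothetical, and the abnormality rule is passed in as a function because the source leaves its definition to the application scenario.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BehaviorRecord:
    """One entry of user behavior data (hypothetical shape)."""
    user_id: str
    action: str


class ReviewModule:
    def __init__(self, is_abnormal: Callable[[BehaviorRecord], bool]) -> None:
        self._logs: Dict[str, List[BehaviorRecord]] = {}
        self._is_abnormal = is_abnormal  # scenario-specific abnormality rule

    def record(self, log_id: str, record: BehaviorRecord) -> None:
        """Append a behavior record to the user behavior log named log_id."""
        self._logs.setdefault(log_id, []).append(record)

    def review(self, log_id: str) -> List[str]:
        """Return one prompt message per abnormal record in the given log."""
        return [
            f"abnormal behavior by {r.user_id}: {r.action}"
            for r in self._logs.get(log_id, [])
            if self._is_abnormal(r)
        ]
```

For instance, constructing the module with `is_abnormal=lambda r: r.action == "bulk_export"` would flag only bulk exports; an unknown log identifier simply yields no prompts.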
  • FIG. 4 is a program module diagram of a second embodiment of a data processing program 10 of the present application.
  • the data processing program of this embodiment includes modules 101 to 105 of the first embodiment, and the difference between this embodiment and the first embodiment is that in this embodiment, the operation request further includes user identification information.
  • the data processing program 10 of this embodiment further includes a first query module 106, a second query module 107, and a determination module 108, where:
  • the first query module 106 is configured to query the user role information corresponding to the user identification information in the operation request according to a predetermined mapping relationship between user identification information and user role information.
  • the user role information includes a user role identifier, a user role name, or a user role short name.
  • the second query module 107 is configured to query the operation permission set corresponding to the user according to a predetermined mapping relationship between user role information and operation permission sets.
  • permissions can be set for each user role in advance. For example, each user role is given a corresponding operation permission set, the user role information of that role is mapped to the operation permission set, and the mapping relationship between the user role information and the corresponding operation permission set is saved (for example, as a mapping table between user role information and operation permission sets).
  • an operation permission is the right to perform an operation on an operation object, and each operation permission records a mapping relationship between an operation object and an operation. For example, to set the read permission of a file, the mapping between the identification information of the file and the read operation can be stored as an operation permission and added to the corresponding operation permission set. To set the sharing permission of a file, the mapping relationship between the identification information of the file, the file sharing operation, and the sharing scope (for example, within the user group) can be stored as an operation permission and added to the corresponding operation permission set.
  • to set the editing permission for the resource configuration information corresponding to some computing engines (for example, Shell, Python), the mapping relationship between the identification information of the resource configuration information and the editing operation can be stored as an operation permission and added to the corresponding operation permission set.
  • At least one operation authority corresponding to a user role forms an operation authority set of the user role.
  • a user group may also be set, and the user group includes at least one user member.
  • User group information corresponding to each user group is stored.
  • the user group information includes user group identification information, user identification information of user group members, and user role information of user group members.
  • a corresponding group role can be configured for the user, and the operation permission set corresponding to the group role can be set (for example, the operation permission set includes sharing permission within the group, read and write permission within the group, and so on).
  • the group role information may be stored in the user role information, and the operation authority set corresponding to the group role information is also stored in the operation authority set corresponding to the user role.
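  • As an illustration only, an operation authority can be modeled as a record that maps an operation object to an operation (plus an optional sharing scope), and the operation authority set of a role as a set of such records. The sketch below assumes that model; the names `OperationAuthority`, `ROLE_AUTHORITIES`, and `authority_set_for`, as well as the sample roles and files, are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OperationAuthority:
    """One operation authority: an operation object mapped to an operation."""
    object_id: str               # identification information of a file or configuration
    operation: str               # e.g. "read", "share", "edit"
    scope: Optional[str] = None  # sharing scope, only used for sharing authorities


# Hypothetical mapping table: user role information -> operation authority set.
ROLE_AUTHORITIES = {
    "analyst": {
        OperationAuthority("report.csv", "read"),
        OperationAuthority("python-engine-config", "edit"),
    },
    # A group role, stored alongside ordinary user roles.
    "group:risk-team": {
        OperationAuthority("report.csv", "share", scope="user_group"),
    },
}


def authority_set_for(roles):
    """Union the authority sets of a user's roles, group roles included."""
    merged = set()
    for role in roles:
        merged |= ROLE_AUTHORITIES.get(role, set())
    return merged


authorities = authority_set_for(["analyst", "group:risk-team"])
print(OperationAuthority("report.csv", "share", scope="user_group") in authorities)
```

  • Using frozen records makes each authority hashable, so a permission check reduces to a set membership test.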
  • the determination module 108 is configured to determine whether the user has the operation authority for the operation corresponding to the operation request according to the operation authority set corresponding to the user, the operation corresponding to the operation request, and the preset type information.
  • when the user has the operation authority for the operation corresponding to the operation request, the reading module 102 is called.
  • when the user does not have that operation authority, response result data indicating an execution failure is fed back to the client.
  • the operation authority set corresponding to the user is queried for an operation authority matching the operation corresponding to the operation request and the preset type information.
  • if a matching operation authority is found, it is determined that the user has the operation authority for the operation corresponding to the operation request.
  • if no matching operation authority is found, it is determined that the user does not have the operation authority for the operation corresponding to the operation request.
  • the following uses a file sharing permission as an example to explain how to determine whether the operation permission matches the operation and preset type information corresponding to the operation request.
  • the sharing permission of the above file is represented by the mapping relationship between the identification information of the file, the file sharing operation, and the sharing scope.
  • This embodiment sets the corresponding operation authority for the user according to the user role. When the user has the operation authority for the operation corresponding to the operation request, the subsequent steps are performed; otherwise, the subsequent steps are refused.
  • This embodiment implements security management and control of user operations, and improves the security of the big data analysis system.
  • FIG. 5 is a program module diagram of a third embodiment of the data processing program 10 of the present application.
  • the first execution module 103 includes a determining unit 1031, a query unit 1032, a judging unit 1033, and a result output unit 1034. Among them:
  • the determining unit 1031 is configured to, when call identification information is identified in an information segment, determine the data processing engine to be called corresponding to the identified call identification information according to a predetermined mapping relationship between call identification information and data processing engines.
  • the query unit 1032 is configured to query the resource configuration information corresponding to the user identification information in the operation request according to a predetermined mapping relationship between the user identification information and the resource configuration information.
  • the resource configuration information of the user may be set and saved in advance according to the user role information and the user group information of the user, where the setting of the user role and the user group may refer to the second embodiment.
  • the above resource configuration information includes configuration item information corresponding to at least one configuration item.
  • Each configuration item can be configured for a configuration object. Configuration objects include the calling right (for example, the right to call a data processing engine), the number of processes executed when calling a computing engine or storage engine, the authentication information of the computing engine or storage engine, and so on.
  • the above configuration item information includes configuration item identification information (for example, the name or number of the configuration item or an abbreviation, etc.), and the value of the configuration item (the value types here include numeric, text, address, and selection types).
  • Each configuration item can be set for a configuration object (that is, a value is assigned to the configuration item). Configuration objects include callable resources (for example, callable computing engines or storage engines), the number of processes executed when calling a computing engine or storage engine, the authentication information of the computing engine or storage engine, and so on.
  • the user can edit some configuration items in the above resource configuration information based on the operation authority possessed by the user.
  • the judging unit 1033 is configured to judge whether to invoke the data processing engine to be called according to the queried resource configuration information and a predetermined judgment rule.
  • the judging unit 1033 is specifically configured to:
  • the call authority information includes identification information of a callable data processing engine.
  • the identification information of the data processing engine to be called is looked up in the call authority information. If it is found, it is determined that the user has the right to call the data processing engine to be called; otherwise, it is determined that the user does not have that right.
  • the authentication information corresponding to the data processing engine to be called includes identity information of the user on the data processing engine to be called, for example, the user name, user password, and user key registered on the platform of the data processing engine to be called.
  • the result output unit 1034 is configured to: when it is determined to call the data processing engine to be called, send the information segment corresponding to the call identification information to that engine for data processing, and receive the data processing result returned by the engine as the first response result data corresponding to the information segment; or, when it is determined not to call the data processing engine to be called, output an execution failure as the first response result data corresponding to the information segment.
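  • The cooperation of units 1031 to 1034 can be sketched roughly as follows. This is a minimal sketch, not the application's implementation: the judgment rule shown (the engine must appear in the call authority information and must have authentication information configured) is an assumption, and the tags `%spark` and `%hive`, the table names, and the `send` callback are all hypothetical.

```python
# Hypothetical mapping: call identification information (a tag) -> engine name.
TAG_TO_ENGINE = {"%spark": "spark", "%hive": "hive"}

# Hypothetical per-user resource configuration information.
RESOURCE_CONFIG = {
    "user-1": {
        "callable_engines": ["spark"],  # call authority information
        "credentials": {"spark": {"username": "u1", "password": "secret"}},
    },
}

EXECUTION_FAILURE = {"status": "execution failure"}


def should_call(user_id, engine):
    """Assumed judgment rule: engine must be callable and have credentials."""
    config = RESOURCE_CONFIG.get(user_id, {})
    if engine not in config.get("callable_engines", []):
        return False
    return engine in config.get("credentials", {})


def run_segment(user_id, tag, segment, send):
    """Return the first response result data for one information segment."""
    engine = TAG_TO_ENGINE.get(tag)
    if engine is None or not should_call(user_id, engine):
        return EXECUTION_FAILURE
    return {"status": "ok", "result": send(engine, segment)}


def fake_send(engine, segment):
    # Stand-in for the real engine call; it only echoes what it received.
    return f"{engine} processed {segment!r}"


print(run_segment("user-1", "%spark", "df.count()", fake_send))
print(run_segment("user-1", "%hive", "SELECT 1", fake_send))
```

  • Passing the engine call in as a callback keeps the decision logic testable without a running engine.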
  • the data processing program 10 further includes a monitoring module (not shown in the figure), the monitoring module is configured to:
  • a threshold of the number of interactive processes corresponding to the data processing engine to be called is obtained.
  • the number of interactive processes between the server and the data processing engine to be called is obtained in real time or at regular intervals, and it is monitored whether that number is greater than or equal to the threshold of the number of interactive processes. If so, the server stops establishing new interactive processes until the number of interactive processes between the server and the data processing engine to be called is less than the threshold; if not, monitoring of whether the number is greater than or equal to the threshold continues.
  • the present application also proposes a data processing method, which is applied to a server.
  • FIG. 6 is a schematic flowchart of a first embodiment of a data processing method of the present application.
  • the method includes:
  • Step S10: Receive an operation request carrying preset type information sent by the client, where the preset type information includes at least one information segment.
  • the operation request includes a data read operation request, a data write operation request, a data query operation request, a data sharing operation request, a data calculation analysis operation request, and the like.
  • when the operation request is a data read operation request, the preset type information carried by the operation request includes information (for example, a file name) about the data to be read.
  • when the operation request is a data write operation request, the preset type information carried by the operation request includes the data to be written.
  • when the operation request is a data query operation request, the preset type information carried by the operation request includes information about the data to be queried, query conditions, and the like.
  • when the operation request is a data sharing operation request, the preset type information carried by the operation request includes information about the data to be shared and the range of data sharing.
  • when the operation request is a data calculation analysis operation request, the preset type information carried by the operation request includes a data calculation analysis code.
  • the preset type information includes at least one information segment, and the method for dividing it into information segments may be set according to the specific application scenario.
  • before step S10, the method further includes:
  • the user identity information is verified according to a predetermined identity verification rule, and the obtained verification result is fed back to the client.
  • the above user identity information includes user identification information and user identity characteristic information, where the user identity characteristic information includes user name information, user password information (the user password information may use a U shield, an electronic certificate, or the like as its storage medium), a dynamic code, and so on.
  • the user identity characteristic information may further include at least one of user biometric information and identity document information.
  • the user biometric information includes fingerprint information, face information, iris information, voiceprint information and other biometric information used to uniquely identify the user.
  • the above identity document information includes an ID number, a passport number, and so on.
  • the above identity verification rule includes:
  • the standard user identity characteristic information corresponding to the user identification information is looked up, and the user identity characteristic information in the user identity information is verified against it. If the two are the same, the verification result is output as a successful verification; if they differ, the verification result is output as a verification failure.
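  • A minimal sketch of such a verification rule is shown below. It assumes the standard identity characteristic information is kept in a dictionary keyed by user identification information; `STANDARD_FEATURES` and `verify_identity` are hypothetical names, and the plain-text password is for illustration only (a production system would normally store salted hashes).

```python
import hmac

# Hypothetical store: user identification information -> standard identity features.
STANDARD_FEATURES = {
    "user-1": {"username": "alice", "password": "s3cret"},
}


def verify_identity(user_id, submitted):
    """Return 'verification succeeded' only if every stored feature matches."""
    standard = STANDARD_FEATURES.get(user_id)
    if standard is None:
        return "verification failed"
    for name, expected in standard.items():
        provided = submitted.get(name, "")
        # compare_digest compares in constant time to limit timing leaks.
        if not hmac.compare_digest(provided.encode(), expected.encode()):
            return "verification failed"
    return "verification succeeded"


print(verify_identity("user-1", {"username": "alice", "password": "s3cret"}))
```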
  • Step S20: Read each information segment in the preset type information, and identify whether the information segment contains call identification information.
  • the call identification information includes identification information of a data processing engine to be called (for example, a tag of the data processing engine to be called).
  • Step S30: When call identification information is identified in an information segment, the information segment is sent to the corresponding data processing engine to be called for data processing, and the first response result data corresponding to the information segment is obtained.
  • for example, when the preset type information is a data calculation analysis code, the data calculation analysis code includes a plurality of data calculation analysis code segments.
  • when call identification information is identified in a data calculation analysis code segment, the data processing engine to be called corresponding to the identified call identification information is determined, according to the predetermined mapping relationship between call identification information and data processing engines, to be calculation engine A. The data calculation analysis code segment corresponding to the call identification information is sent to calculation engine A, which parses the segment, performs data calculation and analysis according to it (the data may come from the storage of the calculation engine, such as a local file system or a distributed file system), and returns the calculation and analysis results to the server.
  • as another example, when the preset type information is a data query code, the data query code includes a plurality of data query code segments.
  • when call identification information is identified in a data query code segment and the corresponding data processing engine to be called is determined to be storage engine C, the data query code segment corresponding to the call identification information is sent to storage engine C, which parses the segment, performs the data query according to it, and returns the query result to the server.
  • Step S40: When call identification information is not identified in an information segment, the operation corresponding to the operation request is performed based on the information segment, and the operation result is used as the second response result data corresponding to the information segment.
  • the operation corresponding to the operation request is determined according to a predetermined mapping relationship between the operation request and the operation.
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: reading the data to be read according to data information to be read in the information segment.
  • the corresponding operation is writing data to a memory (for example, a file system).
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: writing data to be written in the information segment into a memory.
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: querying the data to be queried according to data information to be queried in the information segment, a query condition, and the like.
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: generating a sharing link path for the data to be shared according to the information about the data to be shared in the information segment (the sharing link path indicates the storage address of the data to be shared); then determining, according to the predetermined mapping relationship between sharing ranges and sharing interfaces, the sharing interface corresponding to the sharing range (for example, shared with everyone or shared with a user group) of the data to be shared in the preset type information; and adding the generated sharing link path to that sharing interface.
  • the corresponding operation is to perform calculation and/or analysis on the data.
  • the step of performing the operation corresponding to the operation request based on the information segment specifically includes: parsing the data calculation analysis code segment in the information segment, and performing data calculation and/or analysis processing according to that code segment.
  • Step S50: The first response result data and/or the second response result data corresponding to all information segments in the preset type information are returned to the client as the response result data corresponding to the operation request.
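  • Steps S20 to S50 amount to a per-segment dispatch loop, which can be sketched as follows. The sketch assumes the call identification information is a tag at the start of a segment; the tags `%sql` and `%py`, the function names, and the two callbacks are hypothetical and serve only to make the control flow concrete.

```python
def identify_engine(segment, tag_to_engine):
    """Return the engine whose tag starts the segment, or None (step S20)."""
    first_line = segment.lstrip().split("\n", 1)[0]
    for tag, engine in tag_to_engine.items():
        if first_line.startswith(tag):
            return engine
    return None


def handle_request(segments, tag_to_engine, call_engine, run_locally):
    """Process every information segment and collect the response result data."""
    results = []
    for segment in segments:
        engine = identify_engine(segment, tag_to_engine)
        if engine is not None:
            results.append(call_engine(engine, segment))  # step S30
        else:
            results.append(run_locally(segment))          # step S40
    return results                                        # step S50


tags = {"%sql": "storage engine C", "%py": "calculation engine A"}
responses = handle_request(
    ["%sql\nSELECT 1", "plain text"],
    tags,
    call_engine=lambda engine, seg: f"sent to {engine}",
    run_locally=lambda seg: "handled by server",
)
print(responses)  # ['sent to storage engine C', 'handled by server']
```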
  • the present application receives the operation request sent by the client; reads the information segments in the preset type information and identifies whether each information segment contains call identification information; when call identification information is identified, sends the information segment to the corresponding data processing engine to be called for data processing and obtains the first response result data corresponding to the information segment; when call identification information is not identified, performs the operation corresponding to the operation request based on the information segment and uses the operation result as the second response result data corresponding to the information segment; and returns the response result data corresponding to the operation request to the client.
  • the big data analysis system of the present application integrates multiple data processing engines to provide users with a unified interaction portal. By logging in to the interaction portal through a client, users can call multiple data processing engines for big data analysis and processing, which simplifies user operations and improves the efficiency of big data analysis and processing.
  • the method further includes the following steps:
  • the user behavior data is recorded in real time, and the user behavior data is saved to a user behavior log.
  • the identification information of the user behavior log to be reviewed is obtained from the log review instruction, and the corresponding user behavior log to be reviewed is found according to the identification information.
  • the definition of the above abnormal behavior data may be set according to a specific application scenario.
  • the user behavior log is reviewed, so that the abnormal behavior of the user can be detected in time, and a prompt message is issued to further improve the security of the big data analysis system.
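  • Because the definition of abnormal behavior is left to the application scenario, the following is only a sketch under one assumed definition: a user with more than a configurable number of failed logins is flagged. The log format (one JSON record per entry) and the names `record_behavior` and `review_log` are hypothetical.

```python
import json
from collections import Counter


def record_behavior(log, user_id, action):
    """Append one user behavior record to the log (a list of JSON strings)."""
    log.append(json.dumps({"user": user_id, "action": action}))


def review_log(log, max_failed_logins=3):
    """Return prompt messages for users exceeding the failed-login limit."""
    failures = Counter()
    for line in log:
        entry = json.loads(line)
        if entry["action"] == "login_failed":
            failures[entry["user"]] += 1
    return [
        f"abnormal behavior: {user} had {count} failed logins"
        for user, count in failures.items()
        if count > max_failed_logins
    ]


behavior_log = []
for _ in range(4):
    record_behavior(behavior_log, "user-1", "login_failed")
record_behavior(behavior_log, "user-2", "read_file")
print(review_log(behavior_log))
```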
  • FIG. 7 is a schematic flowchart of a second embodiment of a data processing method of the present application.
  • the data processing method of this embodiment includes steps S10 to S50 of the first embodiment, and the difference between this embodiment and the first embodiment is that in this embodiment, the operation request further includes user identification information.
  • the data processing method in this embodiment further includes the following steps after step S10 and before step S20:
  • Step S60: Query the user role information corresponding to the user identification information in the operation request according to a predetermined mapping relationship between the user identification information and the user role information.
  • the user role information includes a user role identifier, a user role name, or a user role short name.
  • Step S70: Query the operation authority set corresponding to the user according to a predetermined mapping relationship between user role information and operation authority sets.
  • permissions can be set for each user role in advance. For example, each user role is assigned a corresponding operation permission set, the user role information of that role is associated with and mapped to the set, and the mapping relationship between the user role information and the corresponding operation permission set is saved (for example, as a mapping table between user role information and operation permission sets).
  • the so-called operation authority refers to the right to operate on an operation object, and each operation authority records a mapping relationship between an operation object and an operation. For example, to set the read permission of a file, the mapping between the identification information of the file and the read operation can be stored as an operation permission, and that operation permission added to the corresponding operation permission set. To set the sharing permission of a file, the mapping relationship between the identification information of the file, the file sharing operation, and the sharing scope (for example, within the user group) can be stored as an operation permission, and that operation permission added to the corresponding operation permission set.
  • as another example, to set the editing authority for the resource configuration information corresponding to some computing engines (for example, Shell, Python), the mapping relationship between the identification information of the resource configuration information and the editing operation can be stored as an operation permission, and that operation permission added to the corresponding operation permission set.
  • At least one operation authority corresponding to a user role forms an operation authority set of the user role.
  • a user group may also be set, and the user group includes at least one user member.
  • User group information corresponding to each user group is stored.
  • the user group information includes user group identification information, user identification information of user group members, and user role information of user group members.
  • a corresponding group role can be configured for the user, and the operation permission set corresponding to the group role can be set (for example, the operation permission set includes sharing permission within the group, read and write permission within the group, and so on).
  • the group role information may be stored in the user role information, and the operation authority set corresponding to the group role information is also stored in the operation authority set corresponding to the user role.
  • Step S80: Determine whether the user has the operation authority for the operation corresponding to the operation request according to the operation authority set corresponding to the user, the operation corresponding to the operation request, and the preset type information.
  • the operation authority set corresponding to the user is queried for an operation authority matching the operation corresponding to the operation request and the preset type information.
  • if a matching operation authority is found, it is determined that the user has the operation authority for the operation corresponding to the operation request.
  • if no matching operation authority is found, it is determined that the user does not have the operation authority for the operation corresponding to the operation request.
  • the following uses a file sharing permission as an example to explain how to determine whether the operation permission matches the operation and preset type information corresponding to the operation request.
  • the sharing permission of the above file is represented by the mapping relationship between the identification information of the file, the file sharing operation, and the sharing scope.
  • Step S90: When the user has the operation authority for the operation corresponding to the operation request, proceed to step S20.
  • when the user does not have that operation authority, response result data indicating an execution failure is fed back to the client.
  • This embodiment sets the corresponding operation authority for the user according to the user role. When the user has the operation authority for the operation corresponding to the operation request, the subsequent steps are performed; otherwise, the subsequent steps are refused.
  • This embodiment implements security management and control of user operations, and improves the security of the big data analysis system.
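  • The gate formed by steps S60 to S90 can be sketched as two table lookups followed by a membership test. The tables `USER_ROLES` and `ROLE_AUTHORITIES`, the tuple form `(object, operation)` for an operation authority, and the `continue_to_s20` callback are assumptions made for this sketch.

```python
USER_ROLES = {"user-1": "analyst"}                         # step S60 mapping
ROLE_AUTHORITIES = {"analyst": {("report.csv", "read")}}   # step S70 mapping


def has_authority(user_id, operation, object_id):
    """Steps S60 to S80: role lookup, authority-set lookup, then matching."""
    role = USER_ROLES.get(user_id)
    authorities = ROLE_AUTHORITIES.get(role, set())
    return (object_id, operation) in authorities


def handle(user_id, operation, object_id, continue_to_s20):
    """Step S90: continue when authorized, otherwise report an execution failure."""
    if has_authority(user_id, operation, object_id):
        return continue_to_s20()
    return {"status": "execution failure"}


print(handle("user-1", "read", "report.csv", lambda: {"status": "ok"}))
print(handle("user-1", "write", "report.csv", lambda: {"status": "ok"}))
```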
  • FIG. 8 is a schematic flowchart of a third embodiment of a data processing method of the present application.
  • the step S30 includes:
  • Step S31: When call identification information is identified in an information segment, determine the data processing engine to be called corresponding to the identified call identification information according to a predetermined mapping relationship between call identification information and data processing engines.
  • Step S32: Query the resource configuration information corresponding to the user identification information in the operation request according to a predetermined mapping relationship between the user identification information and the resource configuration information.
  • the resource configuration information of the user may be set and saved in advance according to the user role information and the user group information of the user, where the setting of the user role and the user group may refer to the second embodiment.
  • the above resource configuration information includes configuration item information corresponding to at least one configuration item.
  • Each configuration item can be configured for a configuration object. Configuration objects include the calling right (for example, the right to call a data processing engine), the number of processes executed when calling a computing engine or storage engine, the authentication information of the computing engine or storage engine, and so on.
  • the above configuration item information includes configuration item identification information (for example, the name or number of the configuration item or an abbreviation, etc.), and the value of the configuration item (the value types here include numeric, text, address, and selection types).
  • Each configuration item can be set for a configuration object (that is, a value is assigned to the configuration item). Configuration objects include callable resources (for example, callable computing engines or storage engines), the number of processes executed when calling a computing engine or storage engine, the authentication information of the computing engine or storage engine, and so on.
  • the user can edit some configuration items in the above resource configuration information based on the operation authority possessed by the user.
  • Step S33: Determine whether to call the data processing engine to be called according to the queried resource configuration information and a predetermined judgment rule.
  • the step S33 specifically includes:
  • the call authority information includes identification information of a callable data processing engine.
  • the identification information of the data processing engine to be called is looked up in the call authority information. If it is found, it is determined that the user has the right to call the data processing engine to be called; otherwise, it is determined that the user does not have that right.
  • the authentication information corresponding to the data processing engine to be called includes identity information of the user on the data processing engine to be called, for example, the user name, user password, and user key registered on the platform of the data processing engine to be called.
  • Step S34: When it is determined to call the data processing engine to be called, send the information segment corresponding to the call identification information to that engine for data processing, and receive the data processing result returned by the engine as the first response result data corresponding to the information segment.
  • Step S35: When it is determined not to call the data processing engine to be called, output an execution failure as the first response result data corresponding to the information segment.
  • the method further includes:
  • a threshold of the number of interactive processes corresponding to the data processing engine to be called is obtained.
  • the number of interactive processes between the server and the data processing engine to be called is obtained in real time or periodically, and it is monitored whether that number is greater than or equal to the threshold of the number of interactive processes. If so, the server stops establishing new interactive processes until the number of interactive processes between the server and the data processing engine to be called is less than the threshold; if not, monitoring of whether the number is greater than or equal to the threshold continues.
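  • The effect of this threshold can be sketched with a small limiter. Note the substitution: the text describes real-time or periodic monitoring, whereas this sketch uses a condition variable so that a caller simply waits until the count drops below the threshold. The class `EngineProcessLimiter` and its methods are hypothetical names.

```python
import threading


class EngineProcessLimiter:
    """Blocks new interactive processes while the count is at the threshold."""

    def __init__(self, threshold):
        self._threshold = threshold
        self._count = 0
        self._cond = threading.Condition()

    def acquire(self, timeout=None):
        """Wait until the count is below the threshold, then claim a slot."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: self._count < self._threshold, timeout
            )
            if ok:
                self._count += 1
            return ok

    def release(self):
        """Free a slot and wake one waiting caller."""
        with self._cond:
            self._count -= 1
            self._cond.notify()


limiter = EngineProcessLimiter(threshold=1)
print(limiter.acquire(timeout=0))  # a slot is free
print(limiter.acquire(timeout=0))  # threshold reached, no new process
limiter.release()
```

  • One limiter would be kept per data processing engine, with the threshold taken from the user's resource configuration information.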
  • the present application further provides a computer-readable storage medium that stores a data processing program, where the data processing program can be executed by at least one processor so that the at least one processor executes the steps of the data processing method in any of the above embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Storage Device Security (AREA)
  • Stored Programmes (AREA)

Abstract

The present invention relates to a big data analysis system, a server, a data processing method, a program, and a storage medium. The big data analysis system of the present invention comprises a client, a server, and multiple data processing engines. The server is used to identify whether information segments contain call identification information; when call identification information is identified, to send the information segments to the corresponding data processing engines to be called for data processing; and to receive the data processing result that is returned. Compared with the existing technology, the big data analysis system of the present invention integrates multiple data processing engines to provide users with a unified interaction portal, which simplifies user operations.
PCT/CN2018/107487 2018-06-28 2018-09-26 Big data analysis system, server, data processing method, program and storage medium WO2020000716A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810682823.5A CN109062965B (zh) 2018-06-28 2018-06-28 Big data analysis system, server, data processing method and storage medium
CN201810682823.5 2018-06-28

Publications (1)

Publication Number Publication Date
WO2020000716A1 true WO2020000716A1 (fr) 2020-01-02

Family

ID=64818056

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/107487 WO2020000716A1 (fr) 2018-06-28 2018-09-26 Big data analysis system, server, data processing method, program and storage medium

Country Status (2)

Country Link
CN (1) CN109062965B (fr)
WO (1) WO2020000716A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109739663B (zh) * 2018-12-29 2023-07-14 深圳前海微众银行股份有限公司 作业处理方法、装置、设备及计算机可读存储介质
CN112363831B (zh) * 2020-11-10 2021-12-10 上海华锐软件有限公司 风控处理方法、装置、计算机设备和存储介质
CN112306586A (zh) * 2020-11-20 2021-02-02 深圳前海微众银行股份有限公司 数据处理方法、装置、设备及计算机存储介质
CN112817997A (zh) * 2021-02-24 2021-05-18 广州市品高软件股份有限公司 一种分布式计算引擎使用动态用户访问s3对象存储的方法及装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
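CN103176798A (zh) * 2013-02-21 2013-06-26 用友软件股份有限公司 Data interaction system and data interaction method
CN104008436A (zh) * 2013-02-26 2014-08-27 中国移动通信集团浙江有限公司 Content management integration method and system
CN104660680A (zh) * 2015-01-26 2015-05-27 青岛市环境信息中心 Application system integrated cloud terminal platform and integration method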

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066546B (zh) * 2017-03-20 2021-03-09 国家计算机网络与信息安全管理中心 MPP engine-based fast cross-data-center query method and system
CN108038213A (zh) * 2017-12-21 2018-05-15 中国农业银行股份有限公司 Data processing method, client, server, and system

Also Published As

Publication number Publication date
CN109062965A (zh) 2018-12-21
CN109062965B (zh) 2023-04-18

Similar Documents

Publication Publication Date Title
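CN108810006B (zh) Resource access method, apparatus, device, and storage medium
CN109033123B (zh) Big data-based query method, apparatus, computer device, and storage medium
CN107948203B (zh) Container login method, application server, system, and storage medium
CN110414268B (zh) Access control method, apparatus, device, and storage medium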
US11196772B2 (en) Data access policies
WO2020000716A1 (fr) Big data analysis system, server, data processing method, program, and storage medium
US20220327122A1 (en) Performing data mining operations within a columnar database management system
WO2019205380A1 (fr) Electronic device, blockchain-based data processing method and program, and computer storage medium
US10691822B1 (en) Policy validation management
WO2021013033A1 (fr) File operation method, apparatus, device, and system, and computer-readable storage medium
US11520751B2 (en) System and method for information storage using blockchain databases combined with pointer databases
TW202025020A (zh) Blockchain-based content management system and method, apparatus, and electronic device
CN114422197A (zh) Policy management-based permission access control method and system
WO2023056727A1 (fr) Access control method and apparatus, device, and readable storage medium
Fu et al. Data correlation‐based analysis methods for automatic memory forensic
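CN116010926A (zh) Login authentication method, apparatus, computer device, and storage medium
CN112583890B (zh) Enterprise office system-based message push method, apparatus, and computer device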
US10496840B1 (en) Recommending security controls for similar data
WO2023236637A1 (fr) Data management method and device
US11750660B2 (en) Dynamically updating rules for detecting compromised devices
US20200380158A1 (en) Systems and methods for managing data expectations
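WO2023093139A1 (fr) Resource creation method and apparatus, electronic device, and storage medium
CN117675396A (zh) User account data acquisition method, system, apparatus, and computer device
CN116975893A (zh) Access request processing method and apparatus, storage medium, and computer device
CN113076331A (zh) Middle-platform data processing method, apparatus, device, storage medium, and program product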

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18924702

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 18924702

Country of ref document: EP

Kind code of ref document: A1