CN109344620B - Detection method based on hadoop security configuration - Google Patents

Publication number
CN109344620B
Authority
CN
China
Prior art keywords
hadoop
data
flume
user
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811040829.9A
Other languages
Chinese (zh)
Other versions
CN109344620A (en)
Inventor
何金栋
唐志军
林承华
赵志超
吴丹
吴丽进
谢新志
罗富财
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd
State Grid Fujian Electric Power Co Ltd
Original Assignee
Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd
State Grid Fujian Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd and State Grid Fujian Electric Power Co Ltd
Priority to CN201811040829.9A
Publication of CN109344620A
Application granted
Publication of CN109344620B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Storage Device Security (AREA)

Abstract

The invention relates to a detection method based on Hadoop security configuration. The method comprises detection of the Hadoop output end, detection of the distributed NoSQL database, and detection of the gateway server. For the Hadoop output end, a secure Flume agent channel is set up for the HDFS (Hadoop Distributed File System) and HBase output ends; a Flume principal is created in the KDC (Key Distribution Center) and a keytab file is generated for it; the Flume user is required to be a Hadoop superuser; a user group is established for the users that Flume impersonates, so that different sources can be impersonated when injecting data into Hadoop while user permissions in Hadoop are always respected during data ingestion; the settings are applied with a security configuration program, and a security check is performed on the Flume impersonated users. By simulating the client and the data transmission process, the invention checks the security configuration of the Hadoop output end, the distributed NoSQL database, and the gateway server, so that vulnerabilities introduced by system upgrades and updates are prevented in advance.

Description

Detection method based on hadoop security configuration
Technical Field
The invention belongs to the field of big data, and particularly relates to a detection method based on hadoop security configuration.
Background
Hadoop is composed of a number of components, among which Sqoop handles the transfer and conversion of structured data between RDBMS, NoSQL, and Hadoop. The initial version of Sqoop, Sqoop1, is a command-line client tool that generates MapReduce code from the metadata of a structured data source. Connection parameters are set on the command line and include the password information and other details required to submit MapReduce jobs to Hadoop. This version offers no secure way to supply the user password; everything is transmitted in the clear. The version is therefore prone to introducing security holes in a production environment. Moreover, because the user password is stored as plaintext in a properties file, the existing configuration cannot simply be reused each time the system is updated, and the security configuration needs to be checked.
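The plaintext-password problem described above can be illustrated with a small check. The following is only a minimal sketch, not part of the patented method: the key-name pattern, the function name find_plaintext_secrets, and the sample property names are all assumptions chosen for illustration.

```python
import re

# Key names that usually hold secrets; this list is an illustrative assumption.
SECRET_KEY_PATTERN = re.compile(r"(password|passwd|secret)", re.IGNORECASE)


def find_plaintext_secrets(properties_text):
    """Return (line_number, key) for every secret-looking key with a non-empty value."""
    findings = []
    for line_number, raw_line in enumerate(properties_text.splitlines(), start=1):
        line = raw_line.strip()
        # Skip blank lines and comments in Java-style properties files.
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator and SECRET_KEY_PATTERN.search(key) and value.strip():
            findings.append((line_number, key.strip()))
    return findings


# Hypothetical properties file content.
sample = """# hypothetical connection settings
db.url=jdbc:mysql://db.example.com/sales
db.username=etl
db.password=hunter2
"""
print(find_plaintext_secrets(sample))  # [(4, 'db.password')]
```

A check like this only reports where a secret appears to be stored in the clear; deciding how to protect it is left to the security configuration program.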
Disclosure of Invention
The invention aims to provide a detection method based on Hadoop security configuration, which checks the security configuration of the Hadoop output end, the distributed NoSQL database, and the gateway server by simulating the client and the data transmission process, so that vulnerabilities introduced by system upgrades and updates are prevented in advance.
To achieve this purpose, the technical scheme of the invention is as follows: a detection method based on Hadoop security configuration comprises detection of the Hadoop output end, detection of the distributed NoSQL database, and detection of the gateway server; wherein,
the detection of the Hadoop output end is carried out in sequence according to the following steps:
step 1.1: setting up a secure Flume agent channel for the HDFS and HBase output ends, creating a Flume principal in the KDC, and generating a keytab file for it;
step 1.2: establishing that the Flume user must be a Hadoop superuser, and establishing a user group for the users that Flume impersonates, so that different sources can be impersonated when injecting data into Hadoop and user permissions in Hadoop are always respected during data ingestion;
step 1.3: applying the settings with a security configuration program, and performing a security check on the Flume impersonated users;
the detection of the distributed NoSQL database is carried out in sequence according to the following steps:
step 2.1: the client contacts ZooKeeper to obtain the location of the primary key, and then contacts the corresponding RegionServer to obtain the data;
step 2.2: the RegionServer forwards the request to the correct Region, and the Region returns the data to the client;
step 2.3: checking the credentials and tickets used in the data transmission between the RegionServer and the client in step 2.2;
the detection of the gateway server is carried out in sequence according to the following steps:
step 3.1: installing HttpFS on the gateway server and running it through a preconfigured Tomcat;
step 3.2: simulating an HTTP proxy that authenticates the client and then acts as a proxy to provide access to cluster files;
step 3.3: checking the running of the HTTP proxy's Web application and the client data access process in step 3.2.
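The three groups of steps above run in a fixed order within each group. One way to organize such a run is sketched below; the StepResult type, the run_detection function, and the choice to stop a group at its first failed step are assumptions made for illustration, not details given in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class StepResult:
    step: str
    passed: bool
    detail: str


# A check returns (passed, detail); the real checks are not specified in the text.
Check = Callable[[], Tuple[bool, str]]


def run_detection(groups: Dict[str, List[Tuple[str, Check]]]) -> List[StepResult]:
    """Run each group's steps in order, stopping a group at its first failure."""
    results = []
    for group_name, steps in groups.items():
        for step_id, check in steps:
            passed, detail = check()
            results.append(StepResult(f"{group_name} {step_id}", passed, detail))
            if not passed:
                break  # assumption: later steps depend on earlier ones
    return results


# Placeholder checks standing in for the real detection logic.
groups = {
    "Hadoop output end": [
        ("1.1", lambda: (True, "keytab present")),
        ("1.2", lambda: (False, "impersonation group missing")),
        ("1.3", lambda: (True, "not reached")),
    ],
    "NoSQL database": [("2.1", lambda: (True, "ZooKeeper reachable"))],
}
for result in run_detection(groups):
    print(result.step, "PASS" if result.passed else "FAIL", result.detail)
```

Stopping at the first failure keeps the report focused on the earliest misconfiguration in each group; a run that should report every problem could drop the break.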
In an embodiment of the present invention, Flume in step 1.1 is a distributed, reliable, near-real-time data collection system in the Hadoop ecosystem. Each Flume source provides its own authentication and authorization mechanism and can be configured in advance according to the type of the source.
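For the HDFS output end, the Kerberos principal and keytab from step 1.1 end up in the Flume agent configuration. The sketch below checks that every HDFS sink names both. It assumes the hdfs.kerberosPrincipal and hdfs.kerberosKeytab sink properties of the Flume HDFS sink; the agent name, sink names, paths, and the function names are hypothetical, and the property names should be verified against the Flume version in use.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            props[key.strip()] = value.strip()
    return props


# Sink properties assumed to carry the Kerberos identity of the Flume agent.
REQUIRED = ("hdfs.kerberosPrincipal", "hdfs.kerberosKeytab")


def hdfs_sinks_missing_kerberos(text):
    """Map each HDFS sink prefix to the required properties it lacks."""
    props = parse_properties(text)
    missing = {}
    for key, value in props.items():
        parts = key.split(".")
        # Match "<agent>.sinks.<sink>.type = hdfs".
        if len(parts) == 4 and parts[1] == "sinks" and parts[3] == "type" and value == "hdfs":
            prefix = ".".join(parts[:3])
            absent = [name for name in REQUIRED if not props.get(f"{prefix}.{name}")]
            if absent:
                missing[prefix] = absent
    return missing


# Hypothetical agent configuration: k1 is complete, k2 lacks a keytab.
flume_conf = """
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.kerberosPrincipal = flume/host1.example.com@EXAMPLE.COM
a1.sinks.k1.hdfs.kerberosKeytab = /etc/flume/conf/flume.keytab
a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.kerberosPrincipal = flume/host1.example.com@EXAMPLE.COM
"""
print(hdfs_sinks_missing_kerberos(flume_conf))
```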
In an embodiment of the present invention, in step 2.1, ZooKeeper is the coordination service of the distributed NoSQL database and stores the relevant information, and a RegionServer is a node of the distributed NoSQL database.
In an embodiment of the present invention, the Flume agent channel in step 1.1 is one of a memory channel, a database channel, and a file channel. The database channel and the file channel both need to take data security into account; the database channel relies on standard database security mechanisms for authentication and authorization, with a user name and a password set in the connection.
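The distinction between the three channel types can be turned into a simple configuration check. The sketch below takes already-parsed Flume properties as a dictionary and flags file channels without an encryption key and database (JDBC) channels without a user name. The property names encryption.activeKey and db.username follow the Flume user guide as best understood here and should be treated as assumptions to verify; the agent and channel names are hypothetical.

```python
def channel_findings(props):
    """Flag file and JDBC channels that lack the settings discussed above."""
    findings = []
    for key in sorted(props):
        parts = key.split(".")
        # Match "<agent>.channels.<channel>.type".
        if len(parts) != 4 or parts[1] != "channels" or parts[3] != "type":
            continue
        name = ".".join(parts[:3])
        kind = props[key]
        if kind == "file" and not props.get(f"{name}.encryption.activeKey"):
            findings.append((name, "file channel has no encryption.activeKey"))
        elif kind == "jdbc" and not props.get(f"{name}.db.username"):
            findings.append((name, "database channel has no db.username"))
        # Memory channels hold events only in memory, so nothing is checked here.
    return findings


# Hypothetical parsed configuration with one channel of each kind.
example_props = {
    "a1.channels.c1.type": "memory",
    "a1.channels.c2.type": "file",
    "a1.channels.c3.type": "jdbc",
    "a1.channels.c3.db.username": "flume",
    "a1.channels.c4.type": "file",
    "a1.channels.c4.encryption.activeKey": "key-0",
}
print(channel_findings(example_props))
```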
In an embodiment of the present invention, the RegionServer in step 2.2 communicates with HDFS through the HBase daemon to store data.
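Whether the RegionServer side enforces authentication at all is visible in hbase-site.xml. The following sketch reads a site file and reports settings that differ from a secured baseline. The properties hbase.security.authentication and hbase.security.authorization are standard HBase settings; treating "kerberos" and "true" as the expected values, and the function names, are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Baseline assumed for a Kerberos-secured cluster.
EXPECTED = {
    "hbase.security.authentication": "kerberos",
    "hbase.security.authorization": "true",
}


def load_site_xml(xml_text):
    """Read a Hadoop-style *-site.xml document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }


def hbase_security_gaps(xml_text):
    """Return each expected property whose actual value differs (None if unset)."""
    conf = load_site_xml(xml_text)
    return {
        name: conf.get(name)
        for name, wanted in EXPECTED.items()
        if conf.get(name, "").lower() != wanted
    }


# Hypothetical site file in which authentication was left at "simple".
hbase_site = """<configuration>
  <property><name>hbase.security.authentication</name><value>simple</value></property>
  <property><name>hbase.security.authorization</name><value>true</value></property>
</configuration>"""
print(hbase_security_gaps(hbase_site))  # {'hbase.security.authentication': 'simple'}
```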
In an embodiment of the present invention, HttpFS serves as a proxy in step 3.1 and uses WebHDFS to access cluster resources. Its biggest limitation is that large-scale data cannot be transferred through the HTTP interface: HttpFS runs as a Tomcat-based Web application, so data sent to the cluster has to pass through that application. Large-scale data is therefore transferred using native RPC.
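A simulated client in steps 3.2 and 3.3 would address HttpFS through WebHDFS-style REST URLs. The sketch below builds such a URL and interprets the HTTP status of an unauthenticated probe without sending any request. The /webhdfs/v1 path prefix and the op query parameter come from the WebHDFS REST API, and 14000 is the usual HttpFS default port; the host name, the function names, and the interpretation of status codes are assumptions for illustration.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=14000, **params):
    """Build a WebHDFS-style REST URL as served by an HttpFS gateway."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def classify_anonymous_probe(status_code):
    """Interpret the HTTP status returned to a request sent without credentials."""
    if status_code == 401:
        return "authentication enforced"
    if status_code == 403:
        return "request rejected"
    if 200 <= status_code < 300:
        return "anonymous access allowed"
    return "inconclusive"


# gateway.example.com is a placeholder host.
print(webhdfs_url("gateway.example.com", "/tmp/probe dir", "LISTSTATUS"))
print(classify_anonymous_probe(401))
```

Keeping URL construction and result classification as pure functions lets the probe logic be reviewed and tested without a live gateway; the actual HTTP call can be added around them with any client library.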
Compared with the prior art, the invention has the following beneficial effects: by simulating the client and the data transmission process, the invention checks the security configuration of the Hadoop output end, the distributed NoSQL database, and the gateway server, so that vulnerabilities introduced by system upgrades and updates are prevented in advance and larger negative impact is avoided.
Detailed Description
The following specifically describes the technical means of the present invention.
The invention provides a detection method based on Hadoop security configuration, which comprises detection of the Hadoop output end, detection of the distributed NoSQL database, and detection of the gateway server; wherein,
the detection of the Hadoop output end is carried out in sequence according to the following steps:
step 1.1: setting up a secure Flume agent channel for the HDFS and HBase output ends, creating a Flume principal in the KDC, and generating a keytab file for it;
step 1.2: establishing that the Flume user must be a Hadoop superuser, and establishing a user group for the users that Flume impersonates, so that different sources can be impersonated when injecting data into Hadoop and user permissions in Hadoop are always respected during data ingestion;
step 1.3: applying the settings with a security configuration program, and performing a security check on the Flume impersonated users;
the detection of the distributed NoSQL database is carried out in sequence according to the following steps:
step 2.1: the client contacts ZooKeeper to obtain the location of the primary key, and then contacts the corresponding RegionServer to obtain the data;
step 2.2: the RegionServer forwards the request to the correct Region, and the Region returns the data to the client;
step 2.3: checking the credentials and tickets used in the data transmission between the RegionServer and the client in step 2.2;
the detection of the gateway server is carried out in sequence according to the following steps:
step 3.1: installing HttpFS on the gateway server and running it through a preconfigured Tomcat;
step 3.2: simulating an HTTP proxy that authenticates the client and then acts as a proxy to provide access to cluster files;
step 3.3: checking the running of the HTTP proxy's Web application and the client data access process in step 3.2.
Flume in step 1.1 is a distributed, reliable, near-real-time data collection system in the Hadoop ecosystem. Each Flume source provides its own authentication and authorization mechanism and can be configured in advance according to the type of the source. The Flume agent channel in step 1.1 is one of a memory channel, a database channel, and a file channel. The database channel and the file channel both need to take data security into account; the database channel relies on standard database security mechanisms for authentication and authorization, with a user name and a password set in the connection.
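Step 1.2 ties Flume's impersonation to a user group. In Hadoop this kind of restriction is commonly expressed through the hadoop.proxyuser.<user>.groups and hadoop.proxyuser.<user>.hosts properties in core-site.xml. The sketch below flags missing or wildcard values for a given proxy user; the user name "flume", the group name, and the rule that a wildcard counts as a finding are assumptions of this example rather than requirements stated in the text.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Read a Hadoop-style *-site.xml document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }


def proxyuser_findings(xml_text, user="flume"):
    """Report unset or wildcard impersonation scopes for the given proxy user."""
    conf = read_site_properties(xml_text)
    findings = []
    for scope in ("groups", "hosts"):
        key = f"hadoop.proxyuser.{user}.{scope}"
        value = conf.get(key)
        if value is None:
            findings.append(f"{key} is not set")
        elif value == "*":
            findings.append(f"{key} is a wildcard")
    return findings


# Hypothetical core-site.xml fragment: groups are restricted, hosts are not.
core_site = """<configuration>
  <property><name>hadoop.proxyuser.flume.groups</name><value>flume-ingest</value></property>
  <property><name>hadoop.proxyuser.flume.hosts</name><value>*</value></property>
</configuration>"""
print(proxyuser_findings(core_site))  # ['hadoop.proxyuser.flume.hosts is a wildcard']
```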
In step 2.1, ZooKeeper is the coordination service of the distributed NoSQL database and stores the relevant information, and a RegionServer is a node of the distributed NoSQL database.
In step 2.2, the RegionServer communicates with HDFS through the HBase daemon to store data.
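Step 2.3 checks the credentials and tickets used between the client and the RegionServer. The text does not say how a ticket is represented or which properties are checked, so the sketch below uses a hypothetical Ticket record and three illustrative rules: the ticket targets the expected service, is already valid, and is not about to expire.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Ticket:
    # Hypothetical record of a Kerberos service ticket seen during the transfer.
    client: str
    service: str
    valid_from: datetime
    expires: datetime


def ticket_problems(ticket, expected_service_prefix, now, min_remaining=timedelta(minutes=5)):
    """Return a list of human-readable problems found with the ticket."""
    problems = []
    if not ticket.service.startswith(expected_service_prefix):
        problems.append("ticket is for a different service")
    if now < ticket.valid_from:
        problems.append("ticket is not yet valid")
    if ticket.expires - now < min_remaining:
        problems.append("ticket is expired or about to expire")
    return problems


# Example values; principals and times are made up.
now = datetime(2018, 9, 7, 12, 0, 0)
ticket = Ticket(
    client="client@EXAMPLE.COM",
    service="hbase/rs1.example.com@EXAMPLE.COM",
    valid_from=now - timedelta(hours=1),
    expires=now + timedelta(minutes=2),
)
print(ticket_problems(ticket, "hbase/", now))  # ['ticket is expired or about to expire']
```

Passing the current time as an argument, rather than reading the clock inside the function, keeps the check deterministic and easy to replay against recorded transfers.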
In step 3.1, HttpFS serves as a proxy and uses WebHDFS to access cluster resources. Its biggest limitation is that large-scale data cannot be transferred through the HTTP interface: HttpFS runs as a Tomcat-based Web application, so data sent to the cluster has to pass through that application. Large-scale data is therefore transferred using native RPC.
The following are specific implementation examples of the present invention.
A detection method based on Hadoop security configuration comprises detection of the Hadoop output end, detection of the distributed NoSQL database, and detection of the gateway server, wherein the detection of the Hadoop output end is carried out in sequence according to the following steps:
step 1.1: setting up a secure Flume agent channel for the HDFS and HBase output ends, creating a Flume principal in the KDC, and generating a keytab file for it;
step 1.2: establishing that the Flume user must be a Hadoop superuser, and establishing a user group for the users that Flume impersonates, so that different sources can be impersonated when injecting data into Hadoop and user permissions in Hadoop are always respected during data ingestion;
step 1.3: applying the settings with a security configuration program, and performing a security check on the Flume impersonated users;
the detection of the distributed NoSQL database is carried out in sequence according to the following steps:
step 2.1: the client contacts ZooKeeper to obtain the location of the primary key, and then contacts the corresponding RegionServer to obtain the data;
step 2.2: the RegionServer forwards the request to the correct Region, and the Region returns the data to the client;
step 2.3: checking the credentials and tickets used in the data transmission between the RegionServer and the client in step 2.2;
the detection of the gateway server is carried out in sequence according to the following steps:
step 3.1: installing HttpFS on the gateway server and running it through a preconfigured Tomcat;
step 3.2: simulating an HTTP proxy that authenticates the client and then acts as a proxy to provide access to cluster files;
step 3.3: checking the running of the HTTP proxy's Web application and the client data access process in step 3.2.
In this embodiment, in step four, the scanning engine crawls the directories and files of the scanned object, crawling and probing at the same time.
In this embodiment, Flume in step 1.1 is a distributed, reliable, near-real-time data collection system in the Hadoop ecosystem. Each Flume source provides its own authentication and authorization mechanism and can be configured in advance according to the type of the source.
In this embodiment, in step 2.1, ZooKeeper is the coordination service of the distributed NoSQL database and stores the relevant information, and a RegionServer is a node of the distributed NoSQL database.
In this embodiment, the Flume agent channel in step 1.1 is one of a memory channel, a database channel, and a file channel. The database channel and the file channel both need to take data security into account; the database channel relies on standard database security mechanisms for authentication and authorization, with a user name and a password set in the connection.
In this embodiment, the RegionServer in step 2.2 communicates with HDFS through the HBase daemon to store data.
In this embodiment, HttpFS serves as a proxy in step 3.1 and uses WebHDFS to access cluster resources. Its biggest limitation is that large-scale data cannot be transferred through the HTTP interface: HttpFS runs as a Tomcat-based Web application, so data sent to the cluster has to pass through that application. Large-scale data is therefore transferred using native RPC.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made without departing from the spirit and scope of the invention, and that such modifications and substitutions are intended to be covered by the appended claims.

Claims (6)

1. A detection method based on Hadoop security configuration, characterized by comprising detection of the Hadoop output end, detection of the distributed NoSQL database, and detection of the gateway server; wherein,
the detection of the Hadoop output end is carried out in sequence according to the following steps:
step 1.1: setting up a secure Flume agent channel for the HDFS and HBase output ends, creating a Flume principal in the KDC, and generating a keytab file for it;
step 1.2: establishing the Flume user as a Hadoop superuser, and establishing a user group for the users that Flume impersonates, so that different sources can be impersonated when injecting data into Hadoop and user permissions in Hadoop are always respected during data ingestion;
step 1.3: applying the settings with a security configuration program, and performing a security check on the Flume impersonated users;
the detection of the distributed NoSQL database is carried out in sequence according to the following steps:
step 2.1: the client contacts ZooKeeper to obtain the location of the primary key, and then contacts the corresponding RegionServer to obtain the data;
step 2.2: the RegionServer forwards the request to the correct Region, and the Region returns the data to the client;
step 2.3: checking the credentials and tickets used in the data transmission between the RegionServer and the client in step 2.2;
the detection of the gateway server is carried out in sequence according to the following steps:
step 3.1: installing HttpFS on the gateway server and running it through a preconfigured Tomcat;
step 3.2: simulating an HTTP proxy that authenticates the client and then acts as a proxy to provide access to cluster files;
step 3.3: checking the running condition of the Web application during the data access process between the HTTP proxy and the client in step 3.2.
2. The detection method based on Hadoop security configuration according to claim 1, wherein Flume in step 1.1 is a distributed, reliable, near-real-time data collection system in the Hadoop ecosystem, and each Flume source provides its own authentication and authorization mechanism and can be configured in advance according to the type of the source.
3. The detection method based on Hadoop security configuration according to claim 1, wherein ZooKeeper in step 2.1 is the coordination service of the distributed NoSQL database and stores the relevant information, and a RegionServer is a node of the distributed NoSQL database.
4. The detection method based on Hadoop security configuration according to claim 1, wherein the Flume agent channel in step 1.1 is one of a memory channel, a database channel, and a file channel; the database channel and the file channel both need to take data security into account; and the database channel relies on standard database security mechanisms for authentication and authorization, with a user name and a password set in the connection.
5. The detection method based on Hadoop security configuration according to claim 1, wherein the RegionServer in step 2.2 communicates with HDFS through the HBase daemon to store data.
6. The detection method based on Hadoop security configuration according to claim 1, wherein HttpFS serves as a proxy in step 3.1 and uses WebHDFS to access cluster resources; its biggest limitation is that large-scale data cannot be transferred through the HTTP interface; HttpFS runs as a Tomcat-based Web application, so data sent to the cluster has to pass through that application; and large-scale data is transferred using native RPC.
CN201811040829.9A 2018-09-07 2018-09-07 Detection method based on hadoop security configuration Active CN109344620B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811040829.9A CN109344620B (en) 2018-09-07 2018-09-07 Detection method based on hadoop security configuration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811040829.9A CN109344620B (en) 2018-09-07 2018-09-07 Detection method based on hadoop security configuration

Publications (2)

Publication Number Publication Date
CN109344620A (en) 2019-02-15
CN109344620B (en) 2021-08-31

Family

ID=65304950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811040829.9A Active CN109344620B (en) 2018-09-07 2018-09-07 Detection method based on hadoop security configuration

Country Status (1)

Country Link
CN (1) CN109344620B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360882A (en) * 2021-05-27 2021-09-07 北京百度网讯科技有限公司 Cluster access method, device, electronic equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9130920B2 (en) * 2013-01-07 2015-09-08 Zettaset, Inc. Monitoring of authorization-exceeding activity in distributed networks

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1820262A (en) * 2003-06-09 2006-08-16 范拉诺公司 Event monitoring and management
CN101605066A (en) * 2009-04-22 2009-12-16 网经科技(苏州)有限公司 Telecommunication network behavior method for real-time monitoring based on multilayer data interception
CN101582883A (en) * 2009-06-26 2009-11-18 西安电子科技大学 System and method for managing security of general network
CN101931627A (en) * 2010-08-26 2010-12-29 福建星网锐捷网络有限公司 Security detection method, security detection device and network equipment
CN104798355A (en) * 2012-09-18 2015-07-22 思杰系统有限公司 Mobile device management and security
CN104981783A (en) * 2013-03-07 2015-10-14 思杰系统有限公司 Dynamic configuration in cloud computing environments
CN106209814A (en) * 2016-07-04 2016-12-07 安徽天达网络科技有限公司 A kind of distributed network intrusion prevention system
CN106329729A (en) * 2016-09-13 2017-01-11 江苏方天电力技术有限公司 Intelligent power distribution terminal based on distributed type virtual plug-in
CN107025409A (en) * 2017-06-27 2017-08-08 中经汇通电子商务有限公司 A kind of data safety storaging platform
CN107733863A (en) * 2017-09-07 2018-02-23 济南双瑞软件有限公司 Daily record adjustment method and device under a kind of distributed hadoop environment
CN108494810A (en) * 2018-06-11 2018-09-04 中国人民解放军战略支援部队信息工程大学 Network security situation prediction method, apparatus and system towards attack

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Overview of Network-Based Information Security Technology; Cai Zhicheng; Network Security Technology & Application; 2006-01-31; full text *

Also Published As

Publication number Publication date
CN109344620A (en) 2019-02-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: He Jindong, Luo Fucai, Guo Jingdong, Zheng Zhou, Tang Zhijun, Lin Chenghua, Zhao Zhichao, Wu Dan, Wu Lijin, Xie Xinzhi

Inventor before: He Jindong, Tang Zhijun, Lin Chenghua, Zhao Zhichao, Wu Dan, Wu Lijin, Xie Xinzhi, Luo Fucai