Hadoop Ranger is a permission management solution for big data scenarios. Users who adopt a compute/storage separation architecture can host their data in Tencent Cloud COS. However, COS uses the CAM permission system, so its user roles and permission policies can differ from those of Hadoop Ranger. This document describes a solution that integrates COS with Ranger.
In the Hadoop permission system, authentication is provided by Kerberos and authorization is provided by Ranger. On this basis, the COS Ranger permission solution introduces the additional components described below.
Note:
The components above are open-source and stable. You can install them as needed.
COS-Ranger-Plugin extends the service types of the Ranger Admin console. Users can configure the COS-related permissions in the Ranger console.
You can go to GitHub > ranger-plugin to obtain the source code.
Version: v1.1 or above
Create a COS directory under the Ranger service definition directory ranger/ews/webapp/WEB-INF/classes/ranger-plugins.
Place cos-chdfs-ranger-plugin-xxx.jar in the COS directory. Note that you should have at least read permission on the JAR package.
## Create the service. The Ranger admin account and password, as well as the Ranger service address, should be specified.
## For the Tencent Cloud EMR cluster, the root account is the admin, and the password is the root account’s password that is set when the EMR cluster is created. You need to replace the Ranger service address with the master node IP of the EMR.
adminUser=root
adminPasswd=xxxxxx
rangerServerAddr=10.0.0.1:6080
curl -v -u${adminUser}:${adminPasswd} -X POST -H "Accept:application/json" -H "Content-Type:application/json" -d @./cos-ranger.json http://${rangerServerAddr}/service/plugins/definitions
## To delete a defined service, for example, the service created above, pass the service ID that is returned when you created the service.
serviceId=102
curl -v -u${adminUser}:${adminPasswd} -X DELETE -H "Accept:application/json" -H "Content-Type:application/json" http://${rangerServerAddr}/service/plugins/definitions/${serviceId}
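When scripting the commands above, it can be convenient to capture the service ID from the creation response instead of copying it by hand. The following is a minimal sketch, assuming the response body is JSON whose first numeric "id" field is the service ID. The helper name extract_service_id is hypothetical and is not part of Ranger.

```shell
# Hypothetical helper: read a JSON response on stdin and print the first
# numeric "id" field. Assumes that this field is the service ID.
extract_service_id() {
  grep -o '"id"[[:space:]]*:[[:space:]]*[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Example usage (not run here), reusing the variables defined above:
# serviceId=$(curl -s -u"${adminUser}:${adminPasswd}" -X POST \
#   -H "Accept:application/json" -H "Content-Type:application/json" \
#   -d @./cos-ranger.json "http://${rangerServerAddr}/service/plugins/definitions" \
#   | extract_service_id)
```

If the response layout differs in your Ranger version, check the raw output of the curl command before relying on the extracted value.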
The service name can be customized, for example cos or cos_test.
policy.grantrevoke.auth.users needs to be set to the name of the user that launches COSRangerService (that is, the user that is allowed to pull permission policies). You are advised to set it to hadoop, which can then be used as the username to launch COSRangerService in subsequent operations.
The bucket name, for example examplebucket-1250000000, can be queried in the COS console.
COSRangerService is the core of the permission system. It integrates with the Ranger client and receives the authentication requests, token generation and lease renewal requests, and temp key generation requests of the Ranger client. It is also where the sensitive information (Tencent Cloud keys) is stored. It is usually deployed on a jump server, where only the cluster admin can perform operations and view the configuration.
COSRangerService supports one-master, multiple-slave HA deployment. The DelegationToken can be persisted to HDFS, and the master node is elected by acquiring a ZooKeeper lock. The master node then writes its address to ZooKeeper so that COSRangerClient can route requests to it.
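To see which node is currently the master, you can read the address that the master wrote to ZooKeeper. The following is a minimal sketch, assuming that the zkCli.sh client shipped with ZooKeeper is available and that the node path is /ranger_qcloud_object_storage_leader_ip (the value shown for qcloud.object.storage.zk.leader.ip.path in the client configuration). The function name and the ZKCLI override variable are hypothetical.

```shell
# Hypothetical helper: print the leader address stored in ZooKeeper.
# $1 = ZooKeeper address (host:port)
# $2 = node path (optional; defaults to the path assumed above)
# ZKCLI can point to zkCli.sh if it is not on the PATH.
read_leader_address() {
  "${ZKCLI:-zkCli.sh}" -server "$1" get "${2:-/ranger_qcloud_object_storage_leader_ip}"
}

# Example usage (not run here; replace the address with your ZooKeeper address):
# read_leader_address 10.0.0.8:2121
```

zkCli.sh typically prints connection log lines along with the node data, so the address may need to be picked out of the output.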
You can go to GitHub > cos-ranger-server to obtain the source code.
Version: v1.1 or above
Make the required modifications in the cos-ranger.xml file. For more information about the configuration items, see the comments in the file.
Make the required modifications in the ranger-cos-security.xml file. For more information about the configuration items, see the comments in the file.
In the start_rpc_server.sh file, set hadoop_conf_path and java.library.path, which correspond to the directory of the Hadoop configuration files (for example, core-site.xml and hdfs-site.xml) and the directory of the Hadoop native libraries, respectively.
Start the service:
chmod +x start_rpc_server.sh
nohup ./start_rpc_server.sh &> nohup.txt &
COSRangerService exposes its status over HTTP (the port is set by qcloud.object.storage.status.port; default value: 9998). You can run the following command to obtain status information, such as whether the leader is included, and the authentication statistics:
# Replace `10.xx.xx.xxx` with the IP address of the device deployed with the ranger service.
# Replace `9998` in the command with the value of `qcloud.object.storage.status.port` in the configuration file.
curl -v http://10.xx.xx.xxx:9998/status
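Because the service is started in the background, the status endpoint may not answer immediately. The following is a minimal sketch of a polling loop around the same /status request. The function name wait_for_status, the retry count, and the one-second delay are illustrative assumptions, not part of COSRangerService.

```shell
# Hypothetical helper: poll the status endpoint until it answers or the
# attempts run out.
# $1 = host, $2 = port (defaults to 9998), $3 = max attempts (defaults to 10)
wait_for_status() {
  host=$1
  port=${2:-9998}
  attempts=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failures, --max-time: per-request timeout
    if curl -sf --max-time 5 "http://${host}:${port}/status" >/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example usage (not run here):
# wait_for_status 10.xx.xx.xxx 9998 30 || echo "COSRangerService did not respond; check nohup.txt"
```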
COSRangerClient is dynamically loaded by the Hadoop COSN plugin. It is a proxy that encapsulates all COSRangerService access requests, such as obtaining temp keys, obtaining tokens, and performing authentication.
You can go to GitHub > cos-ranger-client to obtain the source code.
Version: v1.1 or above
Copy the cos-ranger-client JAR package to the same directory as the COSN JAR package. The JAR package version should match the major version of your Hadoop installation.
Add the following configuration to core-site.xml:
<configuration>
<!--*****Required Configuration********-->
<!-- ZooKeeper address, from which the client can obtain the ranger-service address -->
<property>
<name>qcloud.object.storage.zk.address</name>
<value>10.0.0.8:2121</value>
</property>
<!--***Optional Configuration****-->
<!-- Set the Kerberos credential used on the COSRangerService side. It should be the same as that configured on the COSRangerService side. If verification is not involved, you can skip this configuration. -->
<property>
<name>qcloud.object.storage.kerberos.principal</name>
<value>hadoop/_HOST@EMR-XXXX</value>
</property>
<!--***Optional Configuration****-->
<!-- IP address path of the Ranger server recorded in ZooKeeper. The default value is used here. The value must be the same as that configured in COSRangerService. -->
<property>
<name>qcloud.object.storage.zk.leader.ip.path</name>
<value>/ranger_qcloud_object_storage_leader_ip</value>
</property>
</configuration>
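After editing core-site.xml, a quick check that the required property is actually set can catch typos before COSN is started. The following is a minimal sketch, assuming each `<name>` and `<value>` element sits on a single line as in the configuration above. The helper name get_property is hypothetical, and the sketch is not a full XML parser.

```shell
# Hypothetical helper: print the value of a property from a Hadoop XML
# configuration file.
# $1 = path to the configuration file, $2 = property name
get_property() {
  awk -v name="$2" '
    /<name>/ {
      n = $0
      gsub(/.*<name>|<\/name>.*/, "", n)   # keep only the text inside <name>
      found = (n == name)
    }
    found && /<value>/ {
      v = $0
      gsub(/.*<value>|<\/value>.*/, "", v) # keep only the text inside <value>
      print v
      exit
    }
  ' "$1"
}

# Example usage (not run here; adjust the path to your core-site.xml):
# get_property /path/to/core-site.xml qcloud.object.storage.zk.address
```

An empty result suggests that the property is missing or misspelled.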
Version: v5.9.0 or above
For detailed directions on the deployment of COSN, please see Hadoop. Please note the following:
fs.cosn.userinfo.secretId and fs.cosn.userinfo.secretKey do not need to be configured. COSN will obtain the temp key via COSRangerService.
fs.cosn.credentials.provider needs to be set to org.apache.hadoop.fs.auth.RangerCredentialsProvider so that authentication and authorization can be performed via Ranger. The following is an example:
<property>
<name>fs.cosn.credentials.provider</name>
<value>org.apache.hadoop.fs.auth.RangerCredentialsProvider</value>
</property>
# Replace the bucket, path, and other information with that of the root account.
hadoop fs -ls cosn://examplebucket-1250000000/doc
hadoop fs -put ./xxx.txt cosn://examplebucket-1250000000/doc/
hadoop fs -get cosn://examplebucket-1250000000/doc/exampleobject.txt
hadoop fs -rm cosn://examplebucket-1250000000/doc/exampleobject.txt
Kerberos meets the authentication needs. If the cluster and its users are trusted, and the only purpose of permission checks is to prevent misoperations by unauthorized users, you can skip installing Kerberos and use Ranger alone for authorization. Kerberos also incurs a performance cost, so you can weigh your security needs against your performance needs. If authentication is needed, enable Kerberos, and then configure COSRangerService and COSRangerClient accordingly.
If no policy is matched, the operation will be denied by default.
Yes. A sub-account with the relevant permissions on the bucket being operated on can generate a temp key for the COSN plugin and perform the relevant operations. Normally, you can grant all permissions on the bucket to the configured key.
The temp key is cached on the COSN side and is periodically refreshed asynchronously.