After metadata acceleration is enabled for a bucket, COS can be accessed over the HDFS protocol, and a mount target is generated for the bucket. You can download the HDFS client from GitHub and enter the mount target information in the client to mount COS. This document describes how to mount a bucket with metadata acceleration enabled in a computing cluster.
Note:
- Hadoop-COS supports access to metadata acceleration buckets in the format of cosn://bucketname-appid/ starting from v8.1.1.
- The metadata acceleration feature can only be enabled during bucket creation and cannot be disabled once enabled. Therefore, carefully consider whether to enable it based on your business conditions. Also note that legacy Hadoop-COS packages cannot access metadata acceleration buckets.
hadoop fs ls to check whether the versions meet the requirements in the directory configured by fs.cosn.trsf.fs.ofs.tmp.cache.dir.
Download the Hadoop client tool installation package from GitHub.
Put the installation package in the classpath of each node, such as $HADOOP_HOME/share/hadoop/common/lib/, and make sure that jobs can load it normally when they start.
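As a rough sketch of this step on a single node, the helper below copies every JAR from a local download directory into a directory on the Hadoop classpath. The function name install_cos_jars and the ./cos-jars directory are hypothetical, and the destination simply follows the example path above; adjust both for your distribution.

```shell
# Hypothetical helper: copy downloaded JARs into a directory on the Hadoop classpath.
# Usage: install_cos_jars <download_dir> <lib_dir>
install_cos_jars() {
  src_dir="$1"
  lib_dir="$2"
  # Fail early if either directory is missing.
  [ -d "$src_dir" ] || { echo "missing source directory: $src_dir" >&2; return 1; }
  [ -d "$lib_dir" ] || { echo "missing library directory: $lib_dir" >&2; return 1; }
  count=0
  for jar in "$src_dir"/*.jar; do
    # When no JAR matches, the glob stays literal; skip it.
    [ -f "$jar" ] || continue
    cp "$jar" "$lib_dir"/ || return 1
    count=$((count + 1))
  done
  echo "copied $count jar(s) to $lib_dir"
  # Treat an empty download directory as an error.
  [ "$count" -gt 0 ]
}

# Example (run on each node; ./cos-jars is a placeholder):
# install_cos_jars ./cos-jars "$HADOOP_HOME/share/hadoop/common/lib"
```

The function returns a non-zero status when nothing was copied, so a wrapper script can stop before continuing with the configuration steps.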
Note: The EMR environment comes with the JAR dependency packages, so you don't need to install them manually. You can directly access a metadata acceleration bucket through POSIX semantics. If you need to use the S3 protocol for access, change the fs.cosn.posix_bucket.fs.impl configuration item as detailed below.
Edit the core-site.xml file by adding the following basic configuration items:
<!-- API key information of the account (SecretId and SecretKey), which can be viewed in the [CAM console](https://console.intl.cloud.tencent.com/capi). -->
<property>
<name>fs.cosn.userinfo.secretId</name>
<value>AKIDxxxxxxxxxxxxxxxxxxxxx</value>
</property>
<property>
<name>fs.cosn.userinfo.secretKey</name>
<value>xxxxxxxxxxxxxxxxxxxxxxxx</value>
</property>
<!-- COSN implementation class for AbstractFileSystem -->
<property>
<name>fs.AbstractFileSystem.cosn.impl</name>
<value>org.apache.hadoop.fs.CosN</value>
</property>
<!-- COSN implementation class for FileSystem -->
<property>
<name>fs.cosn.impl</name>
<value>org.apache.hadoop.fs.CosFileSystem</value>
</property>
<!-- Bucket region in the format of `ap-guangzhou` -->
<property>
<name>fs.cosn.bucket.region</name>
<value>ap-guangzhou</value>
</property>
<!-- Local temporary directory, which is used to store temporary files generated during execution. -->
<property>
<name>fs.cosn.tmp.dir</name>
<value>/tmp/hadoop_cos</value>
</property>
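Before moving on, it can help to confirm that the file really contains the items you intended to add. The following is a minimal sketch that only checks for property names with grep; it does not validate the XML or the values. The function name check_site_keys and the $HADOOP_HOME/etc/hadoop path in the example are assumptions, not part of Hadoop-COS.

```shell
# Hypothetical helper: report which property names are missing from a Hadoop site file.
# Usage: check_site_keys <core-site.xml> <key>...
check_site_keys() {
  site_file="$1"
  shift
  missing=0
  for key in "$@"; do
    # -F matches the name literally, so dots are not treated as regex wildcards.
    if ! grep -q -F "<name>$key</name>" "$site_file"; then
      echo "missing: $key"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when every requested key was found.
  [ "$missing" -eq 0 ]
}

# Example (the path is an assumption about your installation layout):
# check_site_keys "$HADOOP_HOME/etc/hadoop/core-site.xml" \
#   fs.cosn.impl fs.AbstractFileSystem.cosn.impl fs.cosn.bucket.region fs.cosn.tmp.dir
```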
Sync core-site.xml to all Hadoop nodes.
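One possible way to do the sync, assuming password-less SSH to every node and a plain text file with one hostname per line, is sketched below. The function name sync_core_site, the hosts file, and the remote configuration directory are all hypothetical; clusters managed by a configuration tool would normally use that tool instead.

```shell
# Hypothetical helper: copy core-site.xml to every host listed in a file.
# Set DRY_RUN=1 to print the scp commands instead of running them.
# Usage: sync_core_site <core-site.xml> <hosts_file> <remote_conf_dir>
sync_core_site() {
  site_file="$1"
  hosts_file="$2"
  remote_dir="$3"
  # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines and comments in the hosts file.
    case "$host" in
      ''|'#'*) continue ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp $site_file $host:$remote_dir/"
    else
      # Redirect stdin so scp cannot consume the rest of the hosts file.
      scp "$site_file" "$host:$remote_dir/" < /dev/null || return 1
    fi
  done < "$hosts_file"
}

# Example (nodes.txt and the remote path are placeholders):
# DRY_RUN=1 sync_core_site core-site.xml nodes.txt /opt/hadoop/etc/hadoop
```

Running it once with DRY_RUN=1 shows exactly which files would be overwritten before anything is copied.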
Note: For steps 3 and 4 (editing and syncing core-site.xml) in an EMR cluster, you can simply modify the HDFS configuration in the component management section in the EMR console.
Run the hadoop fs -ls cosn://${bucketname-appid}/ command, where bucketname-appid is the mount address, i.e., the bucket name. If the file list is displayed normally, the COS bucket has been successfully mounted.
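The check above can be wrapped in a small script so that it returns a clear success or failure status. This is only a sketch: the function names cosn_uri and verify_mount and the bucket name examplebucket-1250000000 are placeholders, and the hadoop command must already be on the PATH with core-site.xml configured.

```shell
# Build the cosn:// URI for a bucket; the argument is the full name in bucketname-appid form.
cosn_uri() {
  printf 'cosn://%s/\n' "$1"
}

# List the bucket root and report whether the mount works.
# Requires the hadoop command on PATH and a configured core-site.xml.
verify_mount() {
  uri=$(cosn_uri "$1")
  if hadoop fs -ls "$uri"; then
    echo "mounted: $uri"
  else
    echo "cannot list $uri" >&2
    return 1
  fi
}

# Example with a placeholder bucket name:
# verify_mount examplebucket-1250000000
```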
You can also use other Hadoop configuration items or run MapReduce (mr) jobs to process data in COS metadata acceleration buckets. For an mr job, you can pass -Dfs.defaultFS=ofs://${bucketname-appid}/ to change the default input/output file system of the job to the specified bucket.
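To illustrate where the -D option goes on the command line, here is a hedged sketch of a wrapper around hadoop jar. The function name run_on_bucket, the JAR file, and the program name are hypothetical. Note that -D generic options only take effect when the job's driver parses them through Hadoop's generic option handling (for example, by running through ToolRunner).

```shell
# Hypothetical wrapper: run a MapReduce job with the bucket as its default file system.
# Set DRY_RUN=1 to print the command instead of running it.
# Usage: run_on_bucket <bucketname-appid> <job.jar> <program> [args...]
run_on_bucket() {
  bucket="$1"
  jar="$2"
  program="$3"
  shift 3
  # The ofs:// scheme follows the -Dfs.defaultFS example in the text above.
  default_fs="ofs://$bucket/"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "hadoop jar $jar $program -Dfs.defaultFS=$default_fs $*"
  else
    hadoop jar "$jar" "$program" "-Dfs.defaultFS=$default_fs" "$@"
  fi
}

# Example (job.jar, wordcount, and the paths are placeholders):
# DRY_RUN=1 run_on_bucket examplebucket-1250000000 job.jar wordcount /input /output
```

With the default file system set this way, relative paths such as /input and /output resolve inside the bucket rather than in the cluster's HDFS.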
Configuration Item | Content | Description |
---|---|---|
fs.cosn.userinfo.secretId/secretKey | A value in the format of AKIDxxxxxxxxxxxxxxxxxxxx | Enter the API key information of your account, which can be viewed in the CAM console. |
fs.cosn.impl | org.apache.hadoop.fs.CosFileSystem | COSN implementation class for FileSystem, which is fixed. |
fs.AbstractFileSystem.cosn.impl | org.apache.hadoop.fs.CosN | COSN implementation class for AbstractFileSystem, which is fixed. |
fs.cosn.bucket.region | A value in the format of ap-beijing | Enter the region information of the bucket to be accessed, such as ap-beijing and ap-guangzhou. For enumerated values, see Regions and Access Endpoints. This parameter is compatible with the legacy parameter fs.cosn.userinfo.region. |
fs.cosn.tmp.dir | /tmp/hadoop_cos by default | Set an existing local directory, where temporary files generated during execution will be placed. Be sure to configure sufficient space and permissions for this directory on each node. |
Configuration Item | Content | Description |
---|---|---|
fs.cosn.posix_bucket.fs.impl | com.qcloud.chdfs.fs.CHDFSHadoopFileSystemAdapter | Set to com.qcloud.chdfs.fs.CHDFSHadoopFileSystemAdapter for the POSIX access mode (default mode) or org.apache.hadoop.fs.CosNFileSystem for the S3 access mode. |
fs.cosn.trsf.fs.AbstractFileSystem.ofs.impl | com.qcloud.chdfs.fs.CHDFSDelegateFSAdapter | Implementation class for metadata acceleration bucket access |
fs.cosn.trsf.fs.ofs.impl | com.qcloud.chdfs.fs.CHDFSHadoopFileSystemAdapter | Implementation class for metadata acceleration bucket access |
fs.cosn.trsf.fs.ofs.tmp.cache.dir | A value in the format of /data/emr/hdfs/tmp/posix-cosn/ | Set an existing local directory such as /data/emr/hdfs/tmp/posix-cosn/, where temporary files generated during execution will be placed. Be sure to configure sufficient space and permissions for this directory on each node. |
fs.cosn.trsf.fs.ofs.user.appid | A value in the format of 12500000000 | Your appid, which is required. |
fs.cosn.trsf.fs.ofs.bucket.region | A value in the format of ap-beijing | Your bucket region, which is required. |
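To show how the POSIX access mode items in the table above would look as core-site.xml entries, the sketch below prints them as property elements that can be pasted inside the configuration element. The function names emit_property and emit_posix_mode_properties are hypothetical, and the appid, region, and cache directory in the example are the placeholder values from the table, not real settings.

```shell
# Print one <property> element in core-site.xml form.
emit_property() {
  printf '<property>\n<name>%s</name>\n<value>%s</value>\n</property>\n' "$1" "$2"
}

# Print the POSIX access mode items listed in the table above.
# Usage: emit_posix_mode_properties <appid> <region> <cache_dir>
emit_posix_mode_properties() {
  emit_property fs.cosn.posix_bucket.fs.impl com.qcloud.chdfs.fs.CHDFSHadoopFileSystemAdapter
  emit_property fs.cosn.trsf.fs.AbstractFileSystem.ofs.impl com.qcloud.chdfs.fs.CHDFSDelegateFSAdapter
  emit_property fs.cosn.trsf.fs.ofs.impl com.qcloud.chdfs.fs.CHDFSHadoopFileSystemAdapter
  emit_property fs.cosn.trsf.fs.ofs.tmp.cache.dir "$3"
  emit_property fs.cosn.trsf.fs.ofs.user.appid "$1"
  emit_property fs.cosn.trsf.fs.ofs.bucket.region "$2"
}

# Example with the placeholder values from the table:
# emit_posix_mode_properties 12500000000 ap-beijing /data/emr/hdfs/tmp/posix-cosn/
```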
Configuration Item | Content | Description |
---|---|---|
fs.cosn.posix_bucket.fs.impl | org.apache.hadoop.fs.CosNFileSystem | Set to com.qcloud.chdfs.fs.CHDFSHadoopFileSystemAdapter for the POSIX access mode (default mode) or org.apache.hadoop.fs.CosNFileSystem for the S3 access mode. |
fs.cosn.trsf. prefix to such configuration items.