COS Migration

Last updated: 2024-04-30 16:42:14

    Feature Overview

    COS Migration is an all-in-one tool that integrates the COS data migration feature. You can migrate local data to COS through simple configurations and steps. It has the following features:
    Checkpoint restart: Uploads can be resumed from checkpoints. If the upload of a large file exits halfway or a service failure occurs, run the tool again to resume the upload.
    Multipart upload: An object can be uploaded to COS by parts.
    Parallel upload: Multiple objects can be uploaded at the same time.
    Note:
    COS Migration only supports UTF-8 encoding.
    If a file with the same name already exists in COS, the tool overwrites it by default. To keep the existing file, configure the tool to skip files with the same name (skipSamePath=true).
    For scenarios other than local data migration, prefer the migration service platform.
    COS Migration is designed for one-time migration and is not suitable for continuous sync. To avoid migrating the same files repeatedly, the tool saves a record of every successful migration. If, for example, files are added locally every day and continuously synced to COS, these records accumulate and the time needed to scan them keeps increasing. For this scenario, we recommend COSBrowser, as described in User Guide for Desktop Version.

    Operating Environments

    Operating system

    Windows and Linux.

    Software dependency

    JDK 1.8 X64 or above. For more information, see Java Installation and Configuration.
    IFUNC needs to be supported on Linux and the binutils version should be later than 2.20.
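
One way to confirm the JDK requirement before running the tool is to check the version string that `java -version` reports. The helper below is a hypothetical sketch and not part of COS Migration. It assumes version strings of the legacy form 1.8.0_292 or the newer form 17.0.1, and it does not check the X64 or binutils requirements.

```python
import re


def java_meets_minimum(version: str, minimum=(1, 8)) -> bool:
    """Return True if a Java version string satisfies the minimum version.

    Handles legacy strings such as "1.8.0_292" and newer ones such as "17.0.1".
    """
    numbers = [int(n) for n in re.findall(r"\d+", version)[:2]]
    if not numbers:
        raise ValueError(f"unrecognized Java version: {version!r}")
    if len(numbers) == 1:
        numbers.append(0)
    major, minor = numbers
    # Java 9 and later dropped the "1." prefix, so "11.0.2" is compared as 1.11.
    if major > 1:
        major, minor = 1, major
    return (major, minor) >= minimum
```

For example, `java_meets_minimum("1.7.0_80")` returns False, while `java_meets_minimum("1.8.0_292")` and `java_meets_minimum("17.0.1")` return True.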

    How to Use

    1. Get the tool

    Download COS Migration here.

    2. Decompress the package

    Windows

    Decompress the package and save it to a directory, for example:
    C:\Users\Administrator\Downloads\cos_migrate

    Linux

    Decompress the package and save it to a directory, for example:
    unzip cos_migrate_tool_v5-master.zip && cd cos_migrate_tool_v5-master

    Migration tool structure

    The structure of the properly decompressed COS Migration tool is as follows:
    COS_Migrate_tool
    |——conf #Directory of the configuration file
    | |——config.ini #Migration configuration file
    |——db #Store the record of successful migrations
    |——dep #JAR package compiled from the main logic of the program
    |——log #Log generated during tool execution
    |——opbin #Script for compiling
    |——result #A directory used to save successful migration records. The record file is named "date.out". The format is "absolute path\tfile size\tlast modified time".
    |——src #Source code of the tool
    |——tmp #Temporary file storage directory
    |——.gitignore #Files and folders ignored by the Git version controller.
    |——pom.xml #Project configuration file
    |——README #Readme document
    |——start_migrate.sh #Migration startup script for Linux
    |——start_migrate.bat #Migration startup script for Windows
    Note:
    The db directory mainly records the IDs of files successfully migrated by the tool. Each migration job first checks the records in the db directory. If the ID of the current file has already been recorded, the file is skipped; otherwise, it is migrated.
    The log directory keeps all the logs generated during tool migration. If an error occurs during migration, first check error.log in this directory.
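
The record files in the result directory are tab-separated, so they are easy to post-process. The following is a minimal sketch, assuming each line has exactly the three fields listed in the structure above and that the file size is an integer byte count. `MigratedRecord`, `parse_result_line`, and `total_bytes` are illustrative names, not part of the tool, and the last modified time is kept as text because its exact format is not stated here.

```python
from typing import Iterable, NamedTuple


class MigratedRecord(NamedTuple):
    path: str
    size: int            # assumed to be a byte count
    last_modified: str   # kept as text; the exact format is not documented here


def parse_result_line(line: str) -> MigratedRecord:
    """Parse one 'absolute path<TAB>file size<TAB>last modified time' line."""
    # Split from the right so that a tab inside the path does not break parsing.
    path, size, last_modified = line.rstrip("\r\n").rsplit("\t", 2)
    return MigratedRecord(path, int(size), last_modified)


def total_bytes(lines: Iterable[str]) -> int:
    """Sum the sizes of all non-empty record lines."""
    return sum(parse_result_line(line).size for line in lines if line.strip())
```

Passing an open record file to `total_bytes` gives the number of bytes recorded as migrated on that date.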

    3. Modify the config.ini file

    Before running the migration startup script, modify the config.ini file (path: ./conf/config.ini) first. This file contains the following parts:

    3.1 Configure the migration type

    type indicates the migration type and must be set to migrateLocal.
    [migrateType]
    type=migrateLocal

    3.2 Configure the migration job

    You can configure a migration job based on your actual needs, including information configuration for the destination COS and job-related configurations.
    # Common configuration section of the migration tool: account information and settings for the destination COS bucket.
    [common]
    secretId=COS_SECRETID
    secretKey=COS_SECRETKEY
    bucketName=examplebucket-1250000000
    region=ap-guangzhou
    storageClass=Standard
    cosPath=/
    https=off
    tmpFolder=./tmp
    smallFileThreshold=5242880
    smallFileExecutorNum=64
    bigFileExecutorNum=8
    entireFileMd5Attached=on
    executeTimeWindow=00:00,24:00
    outputFinishedFileFolder=./result
    resume=false
    skipSamePath=false
    requestTryCount=5
    | Name | Description | Default Value |
    |---|---|---|
    | secretId | SecretId of your key. Replace COS_SECRETID with your real SecretId, which can be obtained on the TencentCloud API key page in the CAM console. | - |
    | secretKey | SecretKey of your key. Replace COS_SECRETKEY with your real SecretKey, which can be obtained on the TencentCloud API key page in the CAM console. | - |
    | bucketName | Name of the destination bucket in the format of <BucketName-APPID>. The bucket name must include the APPID, such as examplebucket-1250000000. | - |
    | region | Region of the destination bucket. For the region abbreviations in COS, see Regions and Access Domain Names. | - |
    | storageClass | Storage class for the migrated data. Valid values: Standard, Standard_IA, Archive. For more information, see Storage Class Overview. | Standard |
    | cosPath | COS path to migrate to. / migrates to the root path of the bucket; /folder/doc/ migrates to /folder/doc/ in the bucket. If /folder/doc/ does not exist, it is created automatically. | / |
    | https | Whether to transfer via HTTPS. on: Yes, off: No. HTTPS transfer takes more time and suits scenarios that demand high security. | off |
    | tmpFolder | Directory for temporary files created when data is migrated from another cloud storage service to COS. Temporary files are deleted after the migration is completed. The path must be absolute. The separator on Linux is /, such as /a/b/c. The separator on Windows is \\, such as E:\\a\\b\\c. The default value is the tmp directory in the path of the tool. | ./tmp |
    | smallFileThreshold | Threshold for small files, in bytes. If the file size is greater than or equal to this threshold, multipart upload is used; otherwise, simple upload is used. The default value is 5 MB (5242880 bytes). | 5242880 |
    | smallFileExecutorNum | Concurrency for uploading small files (smaller than smallFileThreshold) via simple upload. Decrease the concurrency if files are uploaded to COS over a public network with low bandwidth. | 64 |
    | bigFileExecutorNum | Concurrency for uploading large files (greater than or equal to smallFileThreshold) via multipart upload. Decrease the concurrency if files are uploaded to COS over a public network with low bandwidth. | 8 |
    | entireFileMd5Attached | Whether the migration tool calculates the MD5 of the entire file and stores it in the custom header "x-cos-meta-md5" of the file for subsequent verification. This is needed because the ETag of a large file uploaded to COS via multipart upload is not the MD5 of the entire file. | on |
    | executeTimeWindow | Execution time window, at a granularity of minutes. This parameter defines the daily time range in which the migration tool runs. For example, "03:30,21:00" means that tasks are executed between 03:30 and 21:00. Outside these hours, migration tasks sleep: migration is paused but the progress is retained, and migration resumes automatically at the next time window. The end time must be later than the start time. | 00:00,24:00 |
    | outputFinishedFileFolder | Directory that stores the results of successful migration tasks. Result files are named by date, for example, ./result/2021-05-27.out, where ./result is the directory that is created. Each line in a result file is in the format of "Absolute path"\t"File size"\t"Last modified time". If outputFinishedFileFolder is left empty, no results are output. | ./result |
    | resume | Whether to continue from the result of the last run when traversing the list of files from the source. By default, the tool starts from scratch. | false |
    | skipSamePath | Whether to skip the current file if a file with the same name already exists in COS. By default, the tool does not skip the current file; it overwrites the existing file. | false |
    | requestTryCount | Total number of attempts for each file upload. | 5 |
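
Two of the rules in the table can be written out as small functions: the choice between simple and multipart upload driven by smallFileThreshold, and the executeTimeWindow check. This is only a sketch of the documented behavior, not the tool's implementation. The function names are hypothetical, and treating the end of the window as exclusive is an assumption.

```python
def upload_mode(file_size: int, small_file_threshold: int = 5242880) -> str:
    """Files at or above the threshold use multipart upload, others simple upload."""
    return "multipart" if file_size >= small_file_threshold else "simple"


def _to_minutes(hhmm: str) -> int:
    """Convert "HH:MM" to minutes since midnight ("24:00" becomes 1440)."""
    hours, minutes = hhmm.strip().split(":")
    return int(hours) * 60 + int(minutes)


def in_time_window(now_hhmm: str, window: str = "00:00,24:00") -> bool:
    """Return True if now_hhmm falls inside a "start,end" execution window."""
    start_text, end_text = window.split(",")
    start, end = _to_minutes(start_text), _to_minutes(end_text)
    if end <= start:
        raise ValueError("the end time must be later than the start time")
    # Assumption: the start is inclusive and the end is exclusive.
    return start <= _to_minutes(now_hhmm) < end
```

With the window "03:30,21:00", `in_time_window("12:00", "03:30,21:00")` is True and `in_time_window("02:00", "03:30,21:00")` is False. Minutes are used instead of `datetime.time` because "24:00" is not a valid `datetime.time` value.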

    3.3 Configure the data source

    3.3.1 Configure a local data source migrateLocal
    If you migrate from a local system to COS, configure this section. The specific configuration items and descriptions are as follows:
    # Configuration section for migration from a local system to COS
    [migrateLocal]
    localPath=E:\\code\\java\\workspace\\cos_migrate_tool\\test_data
    excludes=
    ignoreModifiedTimeLessThanSeconds=
    | Configuration Item | Description |
    |---|---|
    | localPath | Absolute path of the local directory. Linux uses a slash (/) as the delimiter, for example, /a/b/c. Windows uses two backslashes (\\) as the delimiter, for example, E:\\a\\b\\c. Note: Enter a directory path only, not a file path; otherwise, an error will occur while parsing the target object name. In the case of cosPath=/, the request will be incorrectly parsed as a bucket creation request. |
    | excludes | Absolute paths of the directories or files under localPath that are not to be migrated. Separate multiple absolute paths with semicolons. If this is left blank, all files in localPath are migrated. |
    | ignoreModifiedTimeLessThanSeconds | Excludes files that were updated less than the specified number of seconds before the current time. This item is left blank by default, meaning that files are not filtered by their last modified time. It suits scenarios where you run the migration tool while files are being updated and do not want those files migrated to COS. For example, if it is set to 300, only files updated at least 5 minutes ago are uploaded. |
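
The two filters can be pictured as a single predicate applied to each local file. The sketch below uses a hypothetical `should_migrate` helper and assumes that an excluded directory also excludes everything beneath it, with times given in seconds since the epoch. The tool's real matching rules may differ.

```python
def should_migrate(path, mtime, now, excludes="", ignore_newer_than_seconds=None):
    """Return True if a local file passes the excludes and modified-time filters.

    path: absolute path of the file
    mtime, now: timestamps in seconds
    excludes: semicolon-separated absolute paths, as in the excludes item
    ignore_newer_than_seconds: value of ignoreModifiedTimeLessThanSeconds, or None
    """
    for excluded in (item.strip() for item in excludes.split(";")):
        if not excluded:
            continue
        prefix = excluded.rstrip("/\\")
        # Match the excluded file itself, or anything under an excluded directory.
        if path == prefix or path.startswith(prefix + "/") or path.startswith(prefix + "\\"):
            return False
    if ignore_newer_than_seconds is not None and now - mtime < ignore_newer_than_seconds:
        return False  # updated too recently, possibly still being written
    return True
```

With `ignore_newer_than_seconds=300`, a file modified 100 seconds ago is filtered out, while one modified exactly 300 seconds ago is migrated. This matches the "at least 5 minutes ago" wording above.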

    4. Run the migration tool

    Windows

    Double-click start_migrate.bat to run the tool.

    Linux

    1. Read the configuration from the config.ini file by running the following command:
    sh start_migrate.sh
    2. To override some configuration items from the command line, run the following command:
    sh start_migrate.sh -Dcommon.cosPath=/savepoint0403_10/
    Note:
    The tool supports reading configuration items in two ways: command line or configuration file.
    The command line takes priority over the configuration file, i.e., for the same configuration item, parameters in command lines take priority.
    Reading configuration items from the command line allows you to run different migration jobs at the same time, provided that the key configuration items (such as the bucket name, COS path, and source path to be migrated) of the jobs are not exactly the same. Concurrent migration is possible because different migration jobs write to different db directories. For more information, see the db directory in the tool structure above.
    Configuration items are in the format of -D{sectionName}.{sectionKey}={sectionValue}, where sectionName is the section name in the configuration file, sectionKey is the name of the configuration item in the section, and sectionValue is the value of the configuration item. For example, the COS path to migrate to is specified as -Dcommon.cosPath=/bbb/ddd.
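
When several jobs are launched from a script, the -D arguments can be generated rather than typed by hand. This is a small sketch with a hypothetical `build_override_args` helper. It only formats the arguments in the pattern described above and does not check that the section or key names exist in config.ini.

```python
def build_override_args(overrides):
    """Format {section: {key: value}} as -D{sectionName}.{sectionKey}={sectionValue}."""
    args = []
    for section, items in overrides.items():
        for key, value in items.items():
            args.append(f"-D{section}.{key}={value}")
    return args


# Reproduces the Linux command shown above as an argument list.
command = ["sh", "start_migrate.sh"] + build_override_args(
    {"common": {"cosPath": "/savepoint0403_10/"}}
)
```

The resulting list could be passed to `subprocess.run` from the tool directory. Running two such commands with different cosPath or localPath values corresponds to the concurrent jobs described in the note.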

    Migration mechanism and process

    Migration mechanism

    The COS migration tool is stateful. Successfully migrated files are recorded in the db directory, stored as key-value (KV) pairs in LevelDB files. Before migrating each file, the tool checks whether the file's path is recorded in the db directory. If the path is recorded and its "mtime" matches the recorded value, the file is skipped; otherwise, it is migrated. In other words, the tool decides whether to migrate a file by looking for a successful migration record in the db directory, not by searching in COS. If files are deleted or modified in COS by bypassing the migration tool (for example, through COSCMD or the console), the tool will not detect the change on its next run and will not migrate those files again.
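
The skip decision described above can be sketched with a plain dictionary standing in for the LevelDB store. This is an illustration only: the key and value layout the tool actually uses in the db directory is not documented here, and `plan_migration` is a hypothetical name.

```python
def plan_migration(local_files, db_records):
    """Split local files into those to migrate and those to skip.

    local_files: {path: mtime} for the files found under the source directory
    db_records:  {path: mtime} for files recorded as successfully migrated
    """
    to_migrate, to_skip = [], []
    for path, mtime in local_files.items():
        if path in db_records and db_records[path] == mtime:
            to_skip.append(path)      # recorded with the same mtime
        else:
            to_migrate.append(path)   # new file, or modified since the last run
    return to_migrate, to_skip


local = {"/data/a.txt": 100, "/data/b.txt": 250, "/data/c.txt": 300}
records = {"/data/a.txt": 100, "/data/b.txt": 200}
to_migrate, to_skip = plan_migration(local, records)
```

Here /data/a.txt is skipped, while /data/b.txt (changed mtime) and /data/c.txt (no record) are migrated. The function never consults COS, so a file deleted from the bucket by other means would still be skipped as long as its record remains.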

    Migration process

    1. The configuration file is read, the corresponding configuration section is read according to the migration type, and parameters are checked.
    2. The files to be migrated are scanned, and their identifiers are compared against the records in the db directory to determine whether each file needs to be uploaded.
    3. During migration, the execution result of each file is printed, where "inprogress" means migrating, "skip" means skipped, "fail" means failure, "ok" means success, and "condition_not_match" indicates files skipped because they do not meet the migration conditions (such as lastmodified and excludes). Details of failures can be found in error.log in the log directory.
    4. Statistics are printed after the migration is completed, including the total number of migrated, failed, and skipped files and the time consumed. For failures, check the error log or rerun the migration job; the migration tool skips successfully migrated files and retries the failed ones. The execution result of a migration job is shown below:
    [Figure: execution result of a migration job]

    FAQs

    If an exception such as a migration failure or an execution error occurs when you use COS Migration, troubleshoot as instructed in COS Migration.
