Example: Deep Learning

Last updated: 2020-12-09 09:13:11

    Quick Start

    This document describes how to train a multilayer perceptron (MLP) with the backpropagation (BP) algorithm, using the Scikit-learn machine learning library, to predict the probability of a win or a loss between two football teams. The model is built from historical international football matches, team rankings, players' physical and skill metrics, and the FIFA 2018 group match results. The detailed steps are as follows.
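
To make the modeling idea concrete, here is a minimal sketch of an MLP classifier in Scikit-learn. It is not the code shipped in the application package: the two feature columns, the synthetic labels, and the layer sizes are all assumptions chosen for illustration. It only shows how MLPClassifier (which is trained with backpropagation) can return win/lose probabilities for a fixture.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical features per match: [ranking-points difference, skill-score difference].
X = rng.normal(size=(400, 2))
# Synthetic label: 1 if the first team wins, 0 otherwise (illustration only).
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)

# Scale the inputs, then fit a small two-hidden-layer perceptron.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0),
)
model.fit(X, y)

# Class probabilities [P(lose), P(win)] for one hypothetical fixture.
proba = model.predict_proba(np.array([[1.2, 0.3]]))[0]
print("P(lose)=%.2f  P(win)=%.2f" % (proba[0], proba[1]))
```

A real model would replace the synthetic arrays with features derived from the match history and ranking data described above.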

    Step 1. Make a custom image

    1. For creation steps, please see Creating Custom Images.
    2. Install the dependencies. The following commands use CentOS 7.2 64-bit as an example:
      yum -y install gcc
      yum -y install python-devel
      yum -y install tkinter
      yum -y install python-pip
      pip install --upgrade pip
      pip install pandas
      pip install numpy
      pip install matplotlib
      pip install seaborn
      pip install scikit-learn
      pip install --upgrade python-dateutil
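
Before saving the custom image, it can help to confirm that the Python packages are importable. The following is a minimal sketch, not part of the official procedure; the helper name missing_modules is hypothetical, and the list uses import names (sklearn for scikit-learn, dateutil for python-dateutil).

```python
import importlib

# Import names of the packages installed with pip above.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn", "dateutil"]

def missing_modules(names):
    """Return the module names that cannot be imported in this environment."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

missing = missing_modules(REQUIRED)
print("missing: %s" % (", ".join(missing) if missing else "none"))
```

If the script reports any missing module, rerun the corresponding pip command before creating the image.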

    Step 2. Download the application package

    Click here to download the application package and upload it to COS. When you configure a task, specify the COS address of the package. Before the job starts, BatchCompute downloads the package to the CVM instance, decompresses it automatically, and runs the specified command.

    Step 3. Create a "fifa-predict" task template

    1. Log in to the BatchCompute Console and select Task Template on the left sidebar.
    2. Select the target region at the top on the "Task Template" page and click Create.
    3. On the "New task template" page, create a template with the following settings:
      • Name: fifa-predict.
      • Description: data training and prediction.
      • Compute environment type: select as needed. Auto compute environment is selected in this example.
      • Resource configuration: S2.SMALL1 (1 core 1 GB memory). Public network bandwidth is pay-as-you-go.
      • Image: custom image ID. Please select the one created in step 1.
      • Resource quantity: number of concurrent instances, such as 3, which means that 3 neural network models are trained concurrently.
      • Timeout threshold and number of retry attempts: keep the default values.
    4. Click Next to configure the program information as shown below:
      • Execution method: PACKAGE.
      • Package address: using COS as an example: cos://barrygz-1251783334.cosgz.myqcloud.com/fifa/fifa.2018.tar.gz.
      • Stdout log: for more information on the format, please see Entering COS & CFS Paths.
      • Stderr log: same as Stdout log.
      • Command line: python predict.py "Japan" "Senegal".
        Team list: 'Russia', 'Saudi Arabia', 'Egypt', 'Uruguay', 'Portugal', 'Spain', 'Morocco', 'Iran', 'France', 'Australia', 'Peru', 'Denmark', 'Argentina', 'Iceland', 'Croatia', 'Nigeria', 'Brazil', 'Switzerland', 'Costa Rica', 'Serbia', 'Germany', 'Mexico', 'Sweden', 'Korea Republic', 'Belgium', 'Panama', 'Tunisia', 'England', 'Poland', 'Senegal', 'Colombia', 'Japan'.
    5. Skip the storage mapping configuration step and click Next.
    6. Preview the task's JSON file and click Save after confirming that everything is correct.
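
The command line above passes two team names to predict.py. The script itself ships in the application package and is not shown here, so the following is only a sketch of how such arguments could be validated against the team list; the function name parse_teams and the error messages are hypothetical.

```python
TEAMS = {
    "Russia", "Saudi Arabia", "Egypt", "Uruguay", "Portugal", "Spain",
    "Morocco", "Iran", "France", "Australia", "Peru", "Denmark",
    "Argentina", "Iceland", "Croatia", "Nigeria", "Brazil", "Switzerland",
    "Costa Rica", "Serbia", "Germany", "Mexico", "Sweden", "Korea Republic",
    "Belgium", "Panama", "Tunisia", "England", "Poland", "Senegal",
    "Colombia", "Japan",
}

def parse_teams(argv):
    """Return (team_a, team_b) from an argv-style list, or raise ValueError."""
    if len(argv) != 3:
        raise ValueError('usage: python predict.py "<team A>" "<team B>"')
    team_a, team_b = argv[1], argv[2]
    for team in (team_a, team_b):
        if team not in TEAMS:
            raise ValueError("unknown team: %s" % team)
    if team_a == team_b:
        raise ValueError("the two teams must be different")
    return team_a, team_b

# Same arguments as the command line in the template.
print(parse_teams(["predict.py", "Japan", "Senegal"]))
```

Team names that contain a space, such as "Costa Rica", must stay quoted on the command line so that each arrives as a single argument.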

    Step 4. Create a "fifa-merge" task template

    1. Log in to the BatchCompute Console and select Task Template on the left sidebar.
    2. Select the target region at the top on the "Task Template" page and click Create.
    3. On the "New task template" page, create a template with the following settings:
      • Name: fifa-merge.
      • Description: aggregation of prediction data.
      • Compute environment type: select as needed. Auto compute environment is selected in this example.
      • Resource configuration: S2.SMALL1 (1 core 1 GB memory). Public network bandwidth is pay-as-you-go.
      • Image: custom image ID. Please select the one created in step 1.
      • Resource quantity: 1.
      • Timeout threshold and number of retry attempts: keep the default values.
    4. Click Next to configure the program information as shown below:
      • Execution method: PACKAGE.
      • Package address: using COS as an example: cos://barrygz-1251783334.cosgz.myqcloud.com/fifa/fifa.2018.tar.gz.
      • Stdout log: for more information on the format, please see Entering COS & CFS Paths.
      • Stderr log: same as Stdout log.
      • Command line: python merge.py /data.
    5. Click Next to configure the storage mapping as shown below:
      • Input path mapping > COS/CFS path: enter the Stdout log path of the "fifa-predict" template.
      • Input path mapping > Local path: /data.
    6. Preview the task's JSON file and click Save after confirming that everything is correct.
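
The storage mapping above makes the "fifa-predict" Stdout logs appear under /data, which is the directory passed to merge.py. The real merge.py is in the application package; the sketch below only illustrates the aggregation idea. It assumes, purely for illustration, that each prediction instance prints one line such as "win=0.41 draw=0.27 lose=0.32"; the actual log format may differ.

```python
import glob
import os
import re
import tempfile

# ASSUMPTION: hypothetical log line format, e.g. "win=0.41 draw=0.27 lose=0.32".
LINE = re.compile(r"win=([\d.]+)\s+draw=([\d.]+)\s+lose=([\d.]+)")

def merge_predictions(log_dir):
    """Average the win/draw/lose values found in every file directly under log_dir."""
    rows = []
    for path in sorted(glob.glob(os.path.join(log_dir, "*"))):
        if not os.path.isfile(path):
            continue
        with open(path) as handle:
            for line in handle:
                match = LINE.search(line)
                if match:
                    rows.append([float(value) for value in match.groups()])
    if not rows:
        raise ValueError("no prediction lines found in %s" % log_dir)
    return [sum(column) / len(rows) for column in zip(*rows)]

# Usage: two fake logs stand in for the mapped /data directory.
demo_dir = tempfile.mkdtemp()
samples = ["win=0.40 draw=0.30 lose=0.30\n", "win=0.50 draw=0.20 lose=0.30\n"]
for index, text in enumerate(samples):
    with open(os.path.join(demo_dir, "stdout.%d.log" % index), "w") as handle:
        handle.write(text)
merged = merge_predictions(demo_dir)
print(merged)
```

In the job itself the function would be called with the local path from the mapping, that is, /data.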

    Step 5. Submit a job

    1. Click Job on the left sidebar to enter the "Job" list page.
    2. On the "Job" list page, select the target region at the top and click Create.
    3. Enter the "New job" page and configure job information as shown below:
      • Job name: fifa.
      • Priority: default value.
      • Description: fifa 2018 model.
    4. On the task flow page, select the fifa-predict and fifa-merge tasks on the left and drag them to the canvas on the right. Then click the fifa-predict task anchor and drag it to the fifa-merge task, so that fifa-merge runs after fifa-predict.
    5. Enable Task information on the right on the task flow page, confirm that the configuration is correct, and click Complete.
    6. Query the job running information. For more information, please see Information Query.
    7. Query the prediction result. For more information, please see Viewing Object Information.
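
The job assembled in the console is essentially two tasks plus one dependency. The sketch below builds a comparable JSON document in Python. The field names (JobName, Tasks, TaskInstanceNum, Application, ComputeEnv, RedirectInfo, InputMappings, Dependences) follow the general shape of the BatchCompute job configuration as an assumption and should be checked against the JSON preview in the console; the log path and image ID are placeholders, not real values.

```python
import json

PACKAGE = "cos://barrygz-1251783334.cosgz.myqcloud.com/fifa/fifa.2018.tar.gz"
LOG_PATH = "cos://<your-bucket>/fifa/logs/"  # placeholder: your Stdout/Stderr log path
IMAGE_ID = "img-xxxxxxxx"                    # placeholder: custom image ID from step 1

def make_task(name, command, instances, input_mappings=None):
    """Build one task entry; field names are assumptions, verify in the console."""
    task = {
        "TaskName": name,
        "TaskInstanceNum": instances,
        "Application": {
            "DeliveryForm": "PACKAGE",
            "PackagePath": PACKAGE,
            "Command": command,
        },
        "ComputeEnv": {
            "EnvType": "MANAGED",
            "EnvData": {"InstanceType": "S2.SMALL1", "ImageId": IMAGE_ID},
        },
        "RedirectInfo": {
            "StdoutRedirectPath": LOG_PATH,
            "StderrRedirectPath": LOG_PATH,
        },
    }
    if input_mappings:
        task["InputMappings"] = input_mappings
    return task

job = {
    "JobName": "fifa",
    "JobDescription": "fifa 2018 model",
    "Tasks": [
        make_task("fifa-predict", 'python predict.py "Japan" "Senegal"', 3),
        make_task(
            "fifa-merge",
            "python merge.py /data",
            1,
            [{"SourcePath": LOG_PATH, "DestinationPath": "/data"}],
        ),
    ],
    # fifa-merge starts only after fifa-predict has finished.
    "Dependences": [{"StartTask": "fifa-predict", "EndTask": "fifa-merge"}],
}
print(json.dumps(job, indent=2))
```

The Dependences entry corresponds to the line drawn from the fifa-predict anchor to the fifa-merge task on the canvas.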

    Subsequent Operations

    This document illustrates a simple machine learning job to demonstrate basic BatchCompute capabilities. You can continue to test the advanced capabilities of BatchCompute as instructed in the Console User Guide.

    • Various CVM configurations: BatchCompute provides a variety of CVM configuration options. You can customize your own CVM configuration based on your business scenario.
    • Remote storage mapping: BatchCompute optimizes storage access so that remote storage services can be used through ordinary local file system operations.
    • Concurrent multi-model training: With BatchCompute, you can specify the number of concurrent instances and use environment variables to tell the instances apart, so that each instance reads different training data and trains its own model.
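
As an illustration of the last point, the sketch below picks a different slice of training files for each concurrent instance. The environment variable names BATCH_TASK_INSTANCE_INDEX and BATCH_TASK_INSTANCE_NUM are assumptions to verify against the BatchCompute documentation, and the file names are hypothetical; defaults are supplied so the sketch also runs outside BatchCompute.

```python
import os

def shard_for_instance(files, index, total):
    """Return the files assigned to instance `index` of `total`, round-robin."""
    if total < 1 or not 0 <= index < total:
        raise ValueError("index %d is out of range for %d instances" % (index, total))
    return files[index::total]

# ASSUMPTION: variable names; the defaults simulate instance 0 of 3.
index = int(os.environ.get("BATCH_TASK_INSTANCE_INDEX", "0"))
total = int(os.environ.get("BATCH_TASK_INSTANCE_NUM", "3"))

# Hypothetical training files; each instance trains on its own subset.
training_files = ["train_%d.csv" % i for i in range(6)]
print(shard_for_instance(training_files, index, total))
```

Round-robin slicing keeps the subsets disjoint and, taken together, covers every file exactly once.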