Amazon S3 - Giving MadKudu access to an S3 bucket

In this tutorial, we explain how to set up an S3 bucket and give MadKudu secure access to it.

S3 buckets can be used to exchange data with MadKudu. For example, you can send MadKudu data from your data warehouse (e.g. Snowflake, Redshift, BigQuery) or from other integrations. You can also receive MadKudu data (e.g. scores, predictions, segmentations) in S3 and then export it wherever you need it (e.g. a data warehouse or other integrations).

If you already export Kissmetrics, Segment, or your own data to an S3 bucket, we can access the data directly from that bucket.

If the bucket you want to share already exists, please skip the following section called Create an S3 Bucket.

Create an S3 Bucket

  1. Go to the AWS Management Console.

  2. Go to the S3 service from Services > Storage > S3.

  3. Click Create bucket (Amazon S3 > Create bucket).

    • Fill in the form and make note of your bucket name and region.

    • You can use any valid bucket name, e.g. my-madkudu-shared-bucket
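S3 rejects bucket names that break its naming rules, so it can help to check a candidate name before filling in the form. The sketch below covers only a subset of the rules for general-purpose buckets (length, allowed characters, no adjacent periods, no IP-address-like names). The function name looks_like_valid_bucket_name is hypothetical, and the AWS documentation remains the reference for the full list.

```python
import re

# Subset of the S3 general-purpose bucket naming rules (assumption: not exhaustive).
# 3 to 63 characters, lowercase letters, digits, dots and hyphens only,
# starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
_IP_ADDRESS = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes a basic subset of the S3 naming rules."""
    if not _BUCKET_NAME.match(name):
        return False  # wrong length, bad characters, or bad first/last character
    if ".." in name:
        return False  # adjacent periods are not allowed
    if _IP_ADDRESS.match(name):
        return False  # names formatted like an IP address are not allowed
    return True


if __name__ == "__main__":
    for candidate in ("my-madkudu-shared-bucket", "My_Bucket", "192.168.1.1"):
        print(candidate, looks_like_valid_bucket_name(candidate))
```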

Setting up the correct access to your S3 bucket

For MadKudu to access your S3 bucket, our preferred option is for you to grant access to your S3 bucket to a MadKudu IAM role.

To do this, you'll need MadKudu's AWS account ID and external ID to create an IAM role and an IAM policy (see AWS documentation).

 

Step 1. Get MadKudu AWS account ID and external ID

Visit app.madkudu.com > Integrations > Amazon S3 > Configuration to find MadKudu's account ID and external ID, or use:

  • Account ID: 203796963081

  • External ID: your MadKudu API key (see below where to find it in app.madkudu.com)

 

Step 2. Create an IAM Role for MadKudu

For MadKudu to pull data from your bucket, you'll need to grant read permissions (ListBucket, GetObject).

For MadKudu to push data back into your bucket, you'll need to grant write (and delete) permissions on top of that (PutObject, DeleteObject).

  1. Go to your AWS Management Console.

  2. Go to IAM (Identity and Access Management) from Services > Security, Identity & Compliance > IAM.

  3. Go to Roles and click Create role.

  4. Choose AWS account as the Trusted entity type. For Account ID, enter the MadKudu AWS account ID from Step 1, which is the account you are granting access to your resources.

     

  5. As you are granting permissions to users from an account that you do not control, and the users will assume this role programmatically, select Require external ID. (see more info)

  6. Enter the External ID you got in Step 1. 




  7. Confirm that Require MFA is not selected.

  8. Click Next.

  9. On the "Add Permissions" menu, click Next.

  10. On the "Name, Review and Create role" menu:

    • Use any role name you want, but we suggest including "madkudu" in it, for example "integration-madkudu-s3-read".

    • The trusted entities policy should already be generated from what you entered in the previous steps, and should follow this model:

      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "",
                  "Effect": "Allow",
                  "Principal": {
                      "AWS": "arn:aws:iam::203796963081:root"
                  },
                  "Action": "sts:AssumeRole",
                  "Condition": {
                      "StringEquals": {
                          "sts:ExternalId": "madkudu-integration-api-key"
                      }
                  }
              }
          ]
      }
  11. Click Create role.

  12. Select the MadKudu role you just created.

  13. Find the Role ARN and make note of it.

  14. Change the maximum session duration to 12 hours (this ensures that MadKudu can extract data from your bucket without the credentials expiring midway).

  15. Go to the Amazon S3 page in the MadKudu app (app.madkudu.com > Integrations > Amazon S3 > Configuration), then enter and save:

    1. the Role ARN

    2. the region of your bucket

    3. the bucket directory (with folder paths if MadKudu can only access specific folders): this specifies the portion of the bucket from which you'd like MadKudu to pull data. Any files under the specified folder and all of its nested subfolders will be examined for files we can upload. If no prefix is supplied, we'll look through the entire bucket for files to sync.
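If you prefer to review or version the role's trust policy outside the console, the sketch below builds the same kind of document from the two values collected in Step 1. It assumes the standard IAM form of an account principal ARN (arn:aws:iam::<account-id>:root). The function name build_trust_policy is hypothetical, and the external ID shown is a placeholder for your own MadKudu API key.

```python
import json


def build_trust_policy(madkudu_account_id: str, external_id: str) -> dict:
    """Build a trust policy for a cross-account AssumeRole guarded by an external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Account root principal: identities in that account which are
                # themselves allowed to call sts:AssumeRole.
                "Principal": {"AWS": f"arn:aws:iam::{madkudu_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The role can only be assumed when this external ID is supplied.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder external ID (assumption): replace with your MadKudu API key.
    policy = build_trust_policy("203796963081", "madkudu-integration-api-key")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be compared with the trusted entities shown on the role's Trust relationships tab.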

Step 3. Create an IAM Inline Policy for MadKudu 

  • Go to the role you just created, and choose the Permissions tab.

  • Click Add Permissions > Create inline policy.

  • Open the JSON tab of the policy editor and paste in the following policy, replacing YOUR_BUCKET_NAME under Resource with the name of your S3 bucket. It accounts for the following permissions:

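    • List: ListBucket

    • Read: HeadObject

    • Read: GetObject -- this is needed for a pull from S3

    • (optional) Write: PutObject -- this is needed for a push to S3

    • (optional) Write: DeleteObject -- this is needed for a push to S3 (to delete test files)

    The policy below uses "s3:Get*" and "s3:List*", which cover listing and reading only. For MadKudu to push data to your bucket, also add "s3:PutObject" and "s3:DeleteObject" to Action.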

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:Get*", "s3:List*"],
      "Resource": [
        "arn:aws:s3:::YOUR_BUCKET_NAME",
        "arn:aws:s3:::YOUR_BUCKET_NAME/*"
      ]
    }
  ]
}
  • Click Review Policy

  • Name the policy "MadKudu-S3-Access".

  • Click Create Policy.
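As an illustration of how the read and optional write permissions fit together, the sketch below generates the inline policy for a given bucket. It is only one possible way to assemble the document: the function name build_s3_policy and the allow_push flag are hypothetical, and the write actions are the PutObject and DeleteObject permissions mentioned in Step 2.

```python
import json

READ_ACTIONS = ["s3:Get*", "s3:List*"]               # needed for a pull from S3
WRITE_ACTIONS = ["s3:PutObject", "s3:DeleteObject"]  # optional, needed for a push to S3


def build_s3_policy(bucket_name: str, allow_push: bool = False) -> dict:
    """Build the inline policy; add write actions only when allow_push is True."""
    actions = READ_ACTIONS + (WRITE_ACTIONS if allow_push else [])
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",    # the bucket itself (listing)
                    f"arn:aws:s3:::{bucket_name}/*",  # the objects inside it
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Example bucket name taken from the bucket-creation section above.
    print(json.dumps(build_s3_policy("my-madkudu-shared-bucket", allow_push=True), indent=2))
```

With allow_push left at False, the output matches the read-only policy shown in this step.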

 

Encryption

We recommend that you encrypt your data in the S3 bucket for increased protection. If your data in S3 is encrypted server-side with an AWS KMS key, you will need to add a policy statement that lets us use the encryption key.

Step 1. Find your AWS KMS ARN

1.    Open the AWS KMS console and find the ARN of the AWS KMS key associated with the bucket.

Step 2. Update the role inline policy

2.    Open the IAM console and update the IAM inline policy (created in Step 3 above) that grants the permissions to read from the bucket, so that it also works with the AWS KMS key associated with the bucket.

Add the following statement to the policy. For the Resource value, enter the AWS KMS key's ARN.

    {
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:kms:example-region-1:123456789098:key/111aa2bb-333c-4d44-5555-a111bb2c33dd"
    }

 

The entire inline policy should look like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:Get*", "s3:List*"],
      "Resource": [
        "arn:aws:s3:::YOUR_BUCKET_NAME",
        "arn:aws:s3:::YOUR_BUCKET_NAME/*"
      ]
    },
    {
      "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
      "Effect": "Allow",
      "Resource": "arn:aws:kms:example-region-1:123456789098:key/111aa2bb-333c-4d44-5555-a111bb2c33dd"
    }
  ]
}
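The same update can be sketched programmatically: take an existing inline policy and append the KMS statement to it. This is a minimal illustration rather than a required step. The function name add_kms_statement is hypothetical, and the ARN check assumes the standard aws partition (keys in other partitions would need a different prefix).

```python
import copy
import json


def add_kms_statement(policy: dict, kms_key_arn: str) -> dict:
    """Return a copy of `policy` with a statement allowing use of the KMS key."""
    # Basic sanity check on the ARN shape (assumption: standard "aws" partition).
    if not kms_key_arn.startswith("arn:aws:kms:") or ":key/" not in kms_key_arn:
        raise ValueError(f"not a KMS key ARN: {kms_key_arn!r}")
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    updated["Statement"].append(
        {
            "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
            "Effect": "Allow",
            "Resource": kms_key_arn,
        }
    )
    return updated


if __name__ == "__main__":
    s3_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:Get*", "s3:List*"],
                "Resource": [
                    "arn:aws:s3:::YOUR_BUCKET_NAME",
                    "arn:aws:s3:::YOUR_BUCKET_NAME/*",
                ],
            }
        ],
    }
    # Example ARN copied from the policy above; replace it with your own key's ARN.
    key_arn = "arn:aws:kms:example-region-1:123456789098:key/111aa2bb-333c-4d44-5555-a111bb2c33dd"
    print(json.dumps(add_kms_statement(s3_policy, key_arn), indent=2))
```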


Please open a ticket here if you are facing difficulties, or consult the F.A.Q.

 

If all of this sounds like gibberish, please forward it directly to your favorite developer :)