Want to use product usage or CRM data from a source MadKudu does not currently integrate with? No worries: we can easily set up a transfer through Amazon S3, either from Amazon Redshift or from flat files (JSON or CSV). MadKudu's preferred method is to pull data from your S3 bucket, where the data is formatted as described below and to which MadKudu has access through an IAM role.
Please refer to this documentation to give MadKudu access to your bucket.
For transfer from Redshift, please refer to this documentation.
Note:
Depending on the volume of data to transfer, the transfer may take from a few hours to a few weeks (>500M records), given the transfer rate limit between Amazon S3 and MadKudu. We recommend sending only the events necessary to configure MadKudu; otherwise your implementation will be delayed until we receive the full history of data. Please refer to this documentation or consult our implementation team at success@madkudu.com to understand which events are relevant to configuring your scoring.
Pre-requisites
You have access to an AWS account to create/manage an S3 bucket
How to format your data
MadKudu works with 3 types of objects:
Event: what are users doing?
Contact: who is the user? (coming soon)
Account: which accounts do my users belong to? (coming soon)
Person level events
To send behavioral data (product usage, web activity, marketing activity...), create a file named event with the following attributes (with headers included):
Attribute | Required | Format | Example | Description |
---|---|---|---|---|
event_key | required | String | "abc123" | A unique key identifying the event. If you do not have one, we suggest creating a combination of event_text + contact_key + event_timestamp |
event_text | required | String | "signup", "login", "invited a friend" | The action taken by the user. |
event_timestamp | required | Unix time | "1436172703" | The time at which the event happened |
contact_key | required | String | "paul@madkudu.com" | The email address of the user who performed the action |
event_* | optional | String or Numeric | | Properties describing the event (e.g. event_url for the URL of the visited page, event_form_title for the title of the submitted form...) |
Example in JSON format
{"event_key": "abcd1234", "event_text":"signed up", "event_timestamp":1234567890, "contact_key":"paul@madkudu.com"}
{"event_key": "abcd2345", "event_text":"visit web page", "event_timestamp":1234567890, "contact_key":"paul@madkudu.com", "event_url":"http://www.domain.com/pricing"}
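The table above suggests combining event_text + contact_key + event_timestamp when no unique event key exists. A minimal sketch of one way to do that follows. The function names `make_event_key` and `make_event` are hypothetical, and hashing the combined string is an assumption made here to keep the key short, not something MadKudu requires.

```python
import hashlib
import json


def make_event_key(event_text: str, contact_key: str, event_timestamp: int) -> str:
    """Build a deterministic key from the three fields the table suggests combining."""
    raw = f"{event_text}|{contact_key}|{event_timestamp}"
    # Assumption: hash the combination so the key stays short and free of
    # special characters. A plain concatenation would also be a valid key.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def make_event(event_text: str, contact_key: str, event_timestamp: int, **properties):
    """Return one event record with the four required attributes plus optional properties."""
    event = {
        "event_key": make_event_key(event_text, contact_key, event_timestamp),
        "event_text": event_text,
        "event_timestamp": event_timestamp,
        "contact_key": contact_key,
    }
    event.update(properties)  # e.g. event_url, event_form_title
    return event


event = make_event(
    "visit web page",
    "paul@madkudu.com",
    1234567890,
    event_url="http://www.domain.com/pricing",
)
print(json.dumps(event))
```

Because the key is derived only from these three fields, two events with the same text, contact and timestamp receive the same key.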
If you plan on sending event data from 2 or more sources, both event streams should be in the same file.
If you plan to have MadKudu pull your S3 data on a recurring basis, all custom property columns (event_*) must be communicated prior to setting up the recurring pull.
If you send events, please note that MadKudu needs to receive individual events, not aggregations.
Meaning MadKudu needs to receive this:
Event key | Event text | Event timestamps | Contact key |
---|---|---|---|
100 | Email click | 1/6/2023 0:00:00 | john@madkudu.com |
101 | Email click | 1/6/2023 0:00:00 | john@madkudu.com |
103 | Email click | 1/8/2023 0:00:00 | john@madkudu.com |
104 | Email click | 1/8/2023 0:00:00 | john@madkudu.com |
105 | Email click | 1/8/2023 0:00:00 | john@madkudu.com |
Instead of this:
Event key | Event text | Event timestamps | Contact key | Count |
---|---|---|---|---|
100 | Number of email clicks | 1/6/2023 0:00:00 | john@madkudu.com | 2 |
101 | Number of email clicks | 1/8/2023 0:00:00 | john@madkudu.com | 3 |
Account level events
If you are sending account-level events (de-anonymized website visits, 3rd-party intent, etc.), the same event format applies. To attach events to the respective account, MadKudu uses the domain. The events file needs to contain a 'fake' email address, anonymous@domain.com, as contact_key. See details here.
Attribute | Required | Format | Example | Description |
---|---|---|---|---|
event_key | required | String | "abc123" | A unique key identifying the event. If you do not have one, we suggest creating a combination of event_text + contact_key + event_timestamp |
event_text | required | String | "signup", "login", "invited a friend" | The action taken by the user. |
event_timestamp | required | Unix time | "1436172703" | The time at which the event happened |
contact_key | required | String | "anonymous@madkudu.com" | The unique identifier of the visitor who showed intent. To create an email address, prepend 'anonymous@' to each domain. |
event_* | optional | String or Numeric | | Properties describing the event (e.g. event_url for the URL of the visited page, event_form_title for the title of the submitted form...) |
Example in JSON format
{"event_key": "abcd1234", "event_text":"signed up", "event_timestamp":1234567890, "contact_key":"anonymous@madkudu.com"}
{"event_key": "abcd2345", "event_text":"visit web page", "event_timestamp":1234567890, "contact_key":"anonymous@madkudu.com", "event_url":"http://www.domain.com/pricing"}
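As a small illustration of the 'anonymous@' convention described above, the sketch below turns a domain into a contact_key. The helper name `anonymous_contact_key` is hypothetical, and trimming whitespace and lowercasing the domain are assumptions made here for consistency, not documented requirements.

```python
def anonymous_contact_key(domain: str) -> str:
    """Prepend 'anonymous@' to a domain to build an account-level contact_key."""
    # Assumption: normalize whitespace and case so the same domain always
    # produces the same key.
    cleaned = domain.strip().lower()
    if not cleaned or "@" in cleaned:
        raise ValueError(f"expected a bare domain, got {domain!r}")
    return f"anonymous@{cleaned}"


print(anonymous_contact_key("madkudu.com"))
```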
Points of attention
All files should have a header. The bracket { } and single quote ' characters are not supported. Make sure to delete any of these before creating your files.
How to format the files
MadKudu currently supports two file formats:
Newline-delimited JSON (preferred)
CSV
Newline-delimited JSON
Our preferred format for upload is newline-delimited JSON, which is more standardized and less error-prone than CSV.
In this format, the different records are separated by the newline character \n. Each line is a valid JSON object:
{"event_text":"signed up", "event_timestamp":1234567890, "contact_key":"paul@madkudu.com"}
{"event_text":"added a friend", "event_timestamp":1234567890, "contact_key":"paul@madkudu.com", "some_other_event_field":"some_value"}
Escape any double quote " in your data with a \ (e.g. replace " with \").
Incorrect
{"event_text":"signed up", "event_timestamp":1234567890, "contact_key":"abc1234", "key": "val"ue"}
Correct
{"event_text":"signed up", "event_timestamp":1234567890, "contact_key":"abc1234", "key": "val\"ue"}
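If the file is generated with a script, a JSON library can take care of the escaping. The sketch below uses Python's standard `json` module, which escapes double quotes inside string values automatically. The function name `to_ndjson` and the sample record are illustrative only.

```python
import json


def to_ndjson(records) -> str:
    """Serialize each record as one JSON object per line."""
    # json.dumps escapes double quotes and line breaks inside string values,
    # so every record stays on a single line.
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


records = [
    {
        "event_text": "signed up",
        "event_timestamp": 1234567890,
        "contact_key": "abc1234",
        "key": 'val"ue',  # raw value containing a double quote
    },
]
ndjson = to_ndjson(records)
print(ndjson, end="")
```

The printed line contains `"val\"ue"`, matching the "Correct" example above.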
CSV
We also support the .csv format, with the recommended format:
column names (header) in the first line
separator: ~ → separate the values with ~ (ex: abc~def~). Please do not use , or - as it easily creates parsing issues
delimiter: " → this adds quotes around the values (abc -> "abc")
line separator: line-break \n
Points of attention
Delimit your values with " "
Remove all line break characters (for example \n) from your fields.
Make sure the number of fields is the same for each line.
Escape your " characters by adding a second " character in front of it (see here for details)
Incorrect
Values are not delimited by "
abc,cde,efg
Correct
"abc","cde","efg"
Incorrect
The "e is wrongfully formatted. A second " should be added before.
"abc","cd"e","efg"
Correct
"abc","cd""e","efg"
Using the UTF-8 encoding is useful to avoid any issues with special characters in the files.
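The CSV recommendations above can be sketched with Python's standard `csv` module. This is only an illustration: the function name `to_csv` and the sample header are hypothetical, and replacing line breaks with a space (rather than dropping them) is an assumption.

```python
import csv
import io


def to_csv(rows, header, separator="~") -> str:
    """Write rows with a header, quoted values, doubled quotes and \\n line endings."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=separator,    # recommended separator
        quotechar='"',
        quoting=csv.QUOTE_ALL,  # delimit every value with " "
        doublequote=True,       # escape " by doubling it
        lineterminator="\n",
    )
    writer.writerow(header)
    for row in rows:
        if len(row) != len(header):
            raise ValueError("every line needs the same number of fields as the header")
        # Remove line breaks from fields (assumption: replace them with a space).
        writer.writerow([str(value).replace("\r", " ").replace("\n", " ") for value in row])
    return buffer.getvalue()


text = to_csv([["abc", 'cd"e', "efg"]], header=["col_a", "col_b", "col_c"])
print(text, end="")
```

When saving the result to disk, opening the file with `encoding="utf-8"` keeps special characters intact.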
Data validation
JSON Lines and CSV files are relatively easy to corrupt (for example with " or , characters in the data).
We will validate the data on our side and warn you of any corruption issues, but it helps a lot if you follow the format requested above.
Compression
Please note that the maximum size for a single JSON object is 4 MB.
To speed up the data upload, we highly recommend that you compress your files with GZIP before uploading them to S3.
You can name your files whatever you want (we recommend event, contact and account). However, please make sure to add the correct extension depending on your file format:
.json.gz for compressed JSON (recommended)
.json for uncompressed JSON
.csv.gz for compressed CSV
.csv for uncompressed CSV
Whichever format you choose, if you plan on having MadKudu pull your S3 data on a recurring basis, the file format has to remain the same.
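As a rough sketch of the compression step, the code below writes newline-delimited JSON to a .json.gz file using Python's standard `gzip` module and rejects any single object above the size limit. The function names are hypothetical, and reading "4 MB" as 4 × 1024 × 1024 bytes is an assumption.

```python
import gzip
import json

# Assumption: the 4 MB limit per JSON object is interpreted as 4 MiB.
MAX_OBJECT_BYTES = 4 * 1024 * 1024


def encode_records(records) -> bytes:
    """Encode records as UTF-8 newline-delimited JSON, checking the per-object size."""
    lines = []
    for record in records:
        line = json.dumps(record).encode("utf-8")
        if len(line) > MAX_OBJECT_BYTES:
            raise ValueError("a single JSON object exceeds the 4 MB limit")
        lines.append(line + b"\n")
    return b"".join(lines)


def write_gzip(path: str, records) -> None:
    """Write GZIP-compressed newline-delimited JSON, e.g. to event.json.gz."""
    with gzip.open(path, "wb") as handle:
        handle.write(encode_records(records))
```

For example, `write_gzip("event.json.gz", events)` produces a file with the recommended extension, ready to upload.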
How to store your file
We recommend that the files you want to share with MadKudu are in a dedicated folder and that you create an IAM policy and role for MadKudu to access these files.
You will also need to set up a recurring push of your data to this folder for MadKudu to score fresh data. This is done by creating distinct files, as described below.
File naming
In the S3 bucket, please upload data into separate folders by object and by date:
{object}/{year}/{month}/{day} where the objects are:
event
contact
account
opportunity
MadKudu will pull the files on the date given in the folder name. Files in a folder containing /2020/11/20/ will be pulled on November 20th, 2020.
If you use the S3 API, simply "prefix" your destination file name. For example, uploading to "contact/2020/11/20/name_of_file.csv" will add a file named name_of_file.csv to the contact folder.
Please use this recommended file naming and storing system in the bucket for MadKudu to be able to automatically pull any new file.
s3://bucket_name/object/year/month/day/name_of_file.csv
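The naming convention can be generated programmatically. The sketch below builds the key (the part after the bucket name) from an object type and a date. The function name `s3_key` is hypothetical, and zero-padding the month and day is an assumption, since the 2020/11/20 example does not show single-digit values.

```python
from datetime import date

OBJECTS = {"event", "contact", "account", "opportunity"}


def s3_key(obj: str, day: date, file_name: str) -> str:
    """Build the {object}/{year}/{month}/{day}/{file} key for an upload."""
    if obj not in OBJECTS:
        raise ValueError(f"unknown object type: {obj!r}")
    # Assumption: month and day are zero-padded (e.g. 2021/03/05).
    return f"{obj}/{day:%Y/%m/%d}/{file_name}"


print(s3_key("contact", date(2020, 11, 20), "name_of_file.csv"))
```

The resulting string can be passed as the destination key to whichever S3 client or API you use for uploads.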
Compression
To speed up file transfer, you can compress files locally before transferring them to Amazon S3. If you want to compress your files, please use the GZIP compression method and use .gz or .gzip as your file extension (we currently don’t support other methods or other extensions).
Frequency: setting up a recurring push of data to MadKudu
We pull from your S3 bucket once a day at 00:01 UTC (just after midnight). We therefore recommend loading new files beforehand, for example an hour earlier at 11pm UTC.
When uploading new files, please use the recommended naming convention described in the File naming section above.
If you plan on having MadKudu pull your S3 data on a recurring basis, the file folder and the file naming have to remain the same.
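To decide which dated folder a new file belongs in, one reading of the schedule above is that a file is picked up at the first 00:01 UTC pull after it is uploaded, from the folder named after that pull's date. The sketch below encodes that interpretation; it is an assumption about the schedule, not a confirmed description of MadKudu's behavior, and `next_pull_date` is a hypothetical helper.

```python
from datetime import date, datetime, time, timedelta, timezone

PULL_TIME_UTC = time(0, 1)  # daily pull at 00:01 UTC


def next_pull_date(upload_time: datetime) -> date:
    """Return the date of the first daily pull that happens after upload_time."""
    if upload_time.tzinfo is None:
        raise ValueError("upload_time must be timezone-aware")
    utc = upload_time.astimezone(timezone.utc)
    pull_today = datetime.combine(utc.date(), PULL_TIME_UTC, tzinfo=timezone.utc)
    # Before 00:01 UTC the same day's pull is still ahead; otherwise wait for tomorrow's.
    return utc.date() if utc < pull_today else utc.date() + timedelta(days=1)


upload = datetime(2020, 11, 19, 23, 0, tzinfo=timezone.utc)
print(next_pull_date(upload))
```

Under this reading, a file uploaded at 11pm UTC on November 19th, 2020 would go in the /2020/11/20/ folder.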
FAQ
I'm having an issue with S3 / I don't know how to use S3
Please open a ticket here and we will be happy to assist you.
Your file format doesn’t work for me. What do I do?
If you’re having any issues with the file format, please open a ticket here and we’ll be happy to help.
How often is the data refreshed?
As soon as you drop data into the S3 bucket, expect results to be updated in the MadKudu platform within 6 hours.
What would happen if I send the same event more than once - will it appear twice in MadKudu?
Our system will deduplicate the events based on contact_key / event_text / timestamp. If you send the same event twice, only one will be kept:
If sent in two separate batches, only the most recent will be kept.
If sent in the same data batch, only the first one in the file will be kept.
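To preview the effect of the same-batch rule before uploading, the behavior described above can be approximated locally. This sketch is an illustration only, not MadKudu's implementation; `dedupe_batch` is a hypothetical helper, and it assumes "timestamp" refers to the event_timestamp attribute.

```python
def dedupe_batch(events):
    """Keep the first event per (contact_key, event_text, event_timestamp) in file order."""
    seen = set()
    kept = []
    for event in events:
        identity = (event["contact_key"], event["event_text"], event["event_timestamp"])
        if identity not in seen:
            seen.add(identity)
            kept.append(event)
    return kept


batch = [
    {"event_key": "a1", "event_text": "login", "event_timestamp": 1234567890, "contact_key": "paul@madkudu.com"},
    {"event_key": "a2", "event_text": "login", "event_timestamp": 1234567890, "contact_key": "paul@madkudu.com"},
    {"event_key": "a3", "event_text": "login", "event_timestamp": 1234567899, "contact_key": "paul@madkudu.com"},
]
print([event["event_key"] for event in dedupe_batch(batch)])
```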