Before we start...
An email address is composed of a local part and a domain: [local part]@[domain].
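The filters below all start from this split. As a minimal sketch (the helper name split_email is an assumption for illustration, not part of the product), the two parts can be separated at the last "@":

```python
def split_email(email: str) -> tuple[str, str]:
    """Return (local part, domain), splitting at the last '@'."""
    local, _, domain = email.rpartition("@")
    return local, domain


print(split_email("logan@a.com"))  # ('logan', 'a.com')
```

If the string has no "@", this sketch returns an empty local part and treats the whole string as the domain.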
Filters applying to both local and domain parts of the email
Email contains "test"
If the email contains "test" almost anywhere in the address, it is flagged as spam.
Some characters are repeated many times
If any character is repeated at least 4 times consecutively
OR
If any pair of letters is repeated at least 4 times (tetetete@gmail.com)
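One way to express both repetition rules is with regular-expression backreferences. This is only a sketch: the function name is hypothetical, and lowercasing the address first is an assumption.

```python
import re

# Any single character repeated at least 4 times in a row, e.g. "aaaa".
SAME_CHAR_RUN = re.compile(r"(.)\1{3,}")
# Any pair of letters repeated at least 4 times in a row, e.g. "tetetete".
LETTER_PAIR_RUN = re.compile(r"([a-z]{2})\1{3,}")


def has_repeated_characters(email: str) -> bool:
    text = email.lower()
    return bool(SAME_CHAR_RUN.search(text) or LETTER_PAIR_RUN.search(text))


print(has_repeated_characters("tetetete@gmail.com"))  # True
print(has_repeated_characters("logan@gmail.com"))     # False
```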
Spammy patterns detected!
If the email is long enough and its most common characters make up more than 70% of all its characters.
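The text does not say how long "long enough" is or how many characters count as "most common", so the sketch below makes both hypothetical parameters (min_length and top_n). Only the 70% threshold comes from the description above.

```python
from collections import Counter


def dominated_by_common_characters(
    email: str,
    top_n: int = 2,        # assumption: how many "most common" characters to add up
    min_length: int = 10,  # assumption: what counts as "long enough"
    threshold: float = 0.70,
) -> bool:
    text = email.lower()
    if len(text) < min_length:
        return False
    # Sum the counts of the top_n most frequent characters.
    top_count = sum(count for _, count in Counter(text).most_common(top_n))
    return top_count / len(text) > threshold


print(dominated_by_common_characters("ababababab@ab.ab"))  # True (14 of 16 characters)
print(dominated_by_common_characters("logan@gmail.com"))   # False
```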
Domain or local part contains blacklisted word or phrase
If the domain or local part contains any phrase from an in-house list of blacklisted words/phrases (ex: "noemail" or "nothing")
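A simple substring check illustrates the idea. The list here holds only the two examples quoted above (the real in-house list is not public), and checking both parts follows the filter's title.

```python
# Only the two examples quoted in the text; the real list is maintained in-house.
BLACKLISTED_PHRASES = ("noemail", "nothing")


def contains_blacklisted_phrase(email: str) -> bool:
    local, _, domain = email.lower().rpartition("@")
    return any(p in local or p in domain for p in BLACKLISTED_PHRASES)


print(contains_blacklisted_phrase("noemail@gmail.com"))  # True
print(contains_blacklisted_phrase("logan@gmail.com"))    # False
```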
Local part or domain length is 1
If the local part (a@gmail.com) or the domain (logan@a.com) contains exactly one character.
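As a sketch of this rule, the code below assumes that the "domain" being measured is the part before the final period (the "a" in "a.com"), which matches the example but is not stated explicitly.

```python
def has_single_character_part(email: str) -> bool:
    local, _, domain = email.rpartition("@")
    # Assumption: drop the final ".com"-style suffix before measuring the domain.
    domain_name = domain.rsplit(".", 1)[0]
    return len(local) == 1 or len(domain_name) == 1


print(has_single_character_part("a@gmail.com"))      # True
print(has_single_character_part("logan@a.com"))      # True
print(has_single_character_part("logan@gmail.com"))  # False
```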
Not F1000 or personal and has numbers in local
If there are two consecutive digits in the local part and the email domain belongs to neither a Fortune 1000 company nor a personal email provider.
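A minimal sketch of this check follows. The exempt_domains argument is a hypothetical stand-in for whatever list of exempt domains is actually used, and the domains in the usage lines are placeholders only.

```python
import re

TWO_CONSECUTIVE_DIGITS = re.compile(r"[0-9]{2}")


def has_digits_outside_exempt_domains(email: str, exempt_domains: set[str]) -> bool:
    local, _, domain = email.lower().rpartition("@")
    # Flag only when the domain is not exempt and the local part has two digits in a row.
    return domain not in exempt_domains and bool(TWO_CONSECUTIVE_DIGITS.search(local))


exempt = {"bigcorp.example"}  # placeholder, not a real list
print(has_digits_outside_exempt_domains("logan42@startup.example", exempt))  # True
print(has_digits_outside_exempt_domains("logan42@bigcorp.example", exempt))  # False
```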
Filters applying to the email's domain
Domain end is domain
If the strings before and after the period in the domain are the same (logan@hello.hello)
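For a domain with a single period, this comparison can be sketched as follows. How domains with several periods are handled is not described, so here the split is made at the last period as an assumption.

```python
def domain_end_is_domain(email: str) -> bool:
    domain = email.lower().rpartition("@")[2]
    # Assumption: compare what comes before and after the last period.
    name, period, suffix = domain.rpartition(".")
    return period == "." and name == suffix


print(domain_end_is_domain("logan@hello.hello"))  # True
print(domain_end_is_domain("logan@gmail.com"))    # False
```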
Domain contains short gibberish
If the domain contains known gibberish patterns from a list. Examples:
asdef
asdf
etc.
Domain is considered disposable
MadKudu maintains a list of disposable domains. If the domain belongs to this list, the email is automatically flagged as spam. Ex: "yahooo", "randomail.net"
Filters applying to the email's local part
Numbers exceed letters
If there is at least one more digit than there are letters in the local part (1234aa@gmail.com)
OR
If there are at least 6 digits in the local part
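Both conditions come down to counting digits and letters in the local part. A short sketch, with a hypothetical function name:

```python
def digits_exceed_letters(email: str) -> bool:
    local = email.rpartition("@")[0]
    digits = sum(ch.isdigit() for ch in local)
    letters = sum(ch.isalpha() for ch in local)
    # "At least one more digit than letters" is the same as digits > letters.
    return digits > letters or digits >= 6


print(digits_exceed_letters("1234aa@gmail.com"))  # True
print(digits_exceed_letters("logan1@gmail.com"))  # False
```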
Local part has no letters
If the local part does not contain any letters
Local part has no vowels
If the local part of the email is at least 4 characters long, does not contain any digits, and does not contain any vowels.
Local part low vowel ratio
If the local part is at least 5 characters long and the fraction of vowels is very low (vowels / letters ratio)
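The three letter-based rules above (no letters, no vowels, low vowel ratio) can be sketched together. Two things here are assumptions: "y" is not counted as a vowel, and max_ratio is a hypothetical cutoff, since the text only says "very low".

```python
VOWELS = set("aeiou")  # assumption: "y" is not counted as a vowel


def local_part(email: str) -> str:
    return email.lower().rpartition("@")[0]


def has_no_letters(email: str) -> bool:
    return not any(ch.isalpha() for ch in local_part(email))


def has_no_vowels(email: str) -> bool:
    local = local_part(email)
    return (
        len(local) >= 4
        and not any(ch.isdigit() for ch in local)
        and not any(ch in VOWELS for ch in local)
    )


def has_low_vowel_ratio(email: str, max_ratio: float = 0.2) -> bool:
    # max_ratio is a hypothetical cutoff, not a documented value.
    local = local_part(email)
    letters = [ch for ch in local if ch.isalpha()]
    if len(local) < 5 or not letters:
        return False
    vowels = sum(ch in VOWELS for ch in letters)
    return vowels / len(letters) < max_ratio


print(has_no_letters("12345@gmail.com"))         # True
print(has_no_vowels("bcdf@gmail.com"))           # True
print(has_low_vowel_ratio("bcdfgha@gmail.com"))  # True (1 vowel out of 7 letters)
```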
Local part contains short gibberish (AI)
MadKudu uses a predictive model to detect gibberish patterns in groups of 4 letters (ex: dfgh). Every group of 4 letters is scored according to its similarity to known gibberish patterns from the training dataset. When the score reaches a certain threshold above the average score, MadKudu flags the email as spam.
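The predictive model itself is not described, so the sketch below only illustrates the overall shape of the idea: slide over groups of 4 letters, score each one, and flag the email when a score is high enough. A plain string similarity from Python's difflib stands in for the model, the pattern list is a placeholder, and a fixed threshold replaces the "threshold above the average score". All of these are assumptions.

```python
from difflib import SequenceMatcher

# Placeholder patterns, not the real training data.
KNOWN_GIBBERISH = ("dfgh", "asdf")


def four_letter_groups(local: str) -> list[str]:
    """Every window of 4 consecutive letters in the local part."""
    return [local[i:i + 4] for i in range(len(local) - 3) if local[i:i + 4].isalpha()]


def gibberish_score(group: str) -> float:
    """Highest similarity (0.0 to 1.0) between the group and any known pattern."""
    return max(SequenceMatcher(None, group, p).ratio() for p in KNOWN_GIBBERISH)


def local_part_looks_like_gibberish(email: str, threshold: float = 0.9) -> bool:
    # threshold is a hypothetical fixed cutoff standing in for the model's rule.
    local = email.lower().rpartition("@")[0]
    return any(gibberish_score(g) >= threshold for g in four_letter_groups(local))


print(local_part_looks_like_gibberish("xdfghx@gmail.com"))  # True
print(local_part_looks_like_gibberish("logan@gmail.com"))   # False
```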