---
title: "How does HG's spam detector work?"
slug: "how-does-madkudus-spam-detector-work"
updated: 2025-11-05T03:42:33Z
published: 2025-11-05T03:42:33Z
---

# How does HG's spam detector work?

# **Before we start...**

An email address is composed of a local part and a domain: [local part]@[domain].
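As a minimal sketch of that split (the function name `split_email` is illustrative, not part of the product), an address can be divided at its last `@`:

```python
def split_email(email: str) -> tuple[str, str]:
    """Split an address into (local part, domain) at the last '@'."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid email address: {email!r}")
    return local, domain


print(split_email("logan@hello.com"))  # ('logan', 'hello.com')
```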

# **Filters applying to both local and domain parts of the email**

## Email contains "test"

If "test" appears almost anywhere in the email address, the email is flagged as spam.

## Some characters are repeated many times

If any character is repeated at least 4 times consecutively

OR

If any pair of letters is repeated at least 4 times ([tetetete@gmail.com](mailto:tetetete@gmail.com))
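The two repetition rules can be approximated with regular expressions. This is only a sketch: the pattern names are made up, and it assumes the pair rule means four back-to-back repetitions, as in the example address.

```python
import re

# Any single character repeated 4 or more times in a row, e.g. "aaaa".
SAME_CHAR_RUN = re.compile(r"(.)\1{3,}")
# Any pair of letters repeated 4 or more times in a row, e.g. "tetetete".
LETTER_PAIR_RUN = re.compile(r"([a-z]{2})\1{3,}", re.IGNORECASE)


def has_repeated_characters(email: str) -> bool:
    """Return True if either repetition pattern occurs in the address."""
    return bool(SAME_CHAR_RUN.search(email) or LETTER_PAIR_RUN.search(email))
```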

## Spammy patterns detected!

If the email is long enough and its most common characters account for more than 70% of all its characters
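A rough sketch of this ratio check follows. Only the 70% figure comes from the rule above; the minimum length (`min_length`) and the number of characters counted as "most common" (`top_n`) are not documented, so the defaults here are placeholder assumptions.

```python
from collections import Counter


def has_spammy_pattern(email: str, min_length: int = 10, top_n: int = 2,
                       max_share: float = 0.70) -> bool:
    """Flag addresses dominated by a handful of characters.

    min_length and top_n are hypothetical values chosen for illustration.
    """
    chars = email.lower()
    if len(chars) < min_length:
        return False
    # Sum the counts of the top_n most frequent characters.
    top_count = sum(count for _, count in Counter(chars).most_common(top_n))
    return top_count / len(chars) > max_share
```

With these defaults, `ababababab@abab.ab` is flagged because "a" and "b" make up 16 of its 18 characters.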

## Domain or local part contains blacklisted word or phrase

If the local part or the domain contains any phrase from an in-house list of blacklisted words/phrases (ex: "noemail" or "nothing")

## Local part or domain length is 1

If the local part ([a@gmail.com](mailto:a@gmail.com)) or the domain name (logan@a.com) is exactly one character long
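A short sketch of the length check, assuming (based on the logan@a.com example) that the domain length is measured on the label before the first period:

```python
def has_single_character_part(email: str) -> bool:
    """Return True if the local part or the domain name is one character."""
    local, _, domain = email.rpartition("@")
    # Assumption: "a.com" counts as a one-character domain ("a").
    domain_name = domain.split(".", 1)[0]
    return len(local) == 1 or len(domain_name) == 1
```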

## Not F1000 or personal and has numbers in local

If there are two consecutive digits in the local part and the email's domain does not belong to a Fortune 1000 company
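The sketch below illustrates the rule as described in the sentence above. The `FORTUNE_1000_DOMAINS` set is a two-entry stand-in invented for the example; the actual list is not shown in this article, and any exemption for personal domains hinted at in the heading is not modeled.

```python
import re

# Hypothetical stand-in for a Fortune 1000 domain list.
FORTUNE_1000_DOMAINS = {"walmart.com", "apple.com"}

TWO_DIGITS = re.compile(r"\d\d")


def has_numbers_outside_f1000(email: str) -> bool:
    """Two consecutive digits in the local part, and a non-listed domain."""
    local, _, domain = email.rpartition("@")
    has_digits = bool(TWO_DIGITS.search(local))
    return has_digits and domain.lower() not in FORTUNE_1000_DOMAINS
```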

# **Filters applying to the email's domain**

## Domain end is domain

If the strings before and after the period in the domain are the same (logan@hello.hello)
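A minimal sketch of this comparison, assuming the domain is split at its last period:

```python
def domain_end_is_domain(email: str) -> bool:
    """Return True if the text before and after the last period match."""
    _, _, domain = email.rpartition("@")
    name, sep, ending = domain.rpartition(".")
    return bool(sep) and name.lower() == ending.lower()
```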

## Domain contains short gibberish

If the domain contains known gibberish patterns from a list (ex: "asdef", "asdf")
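A list-based check like this one can be sketched as a substring test. Only the two example patterns come from this article; the real list is longer and not shown here.

```python
# Only "asdef" and "asdf" are taken from the article's examples.
GIBBERISH_PATTERNS = ("asdef", "asdf")


def domain_contains_gibberish(email: str) -> bool:
    """Return True if any listed pattern appears in the domain."""
    _, _, domain = email.rpartition("@")
    domain = domain.lower()
    return any(pattern in domain for pattern in GIBBERISH_PATTERNS)
```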

## Domain is considered disposable

HG Insights maintains a list of disposable domains. If the domain belongs to this list, the email is automatically flagged as spam. Ex: "yahooo", "randomail.net"

# **Filters applying to the email's local part**

## Numbers exceed letters

If there is at least one more digit than there are letters in the local part ([1234aa@gmail.com](mailto:1234aa@gmail.com))

OR

If there are at least 6 digits in the local part
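Both conditions can be sketched by counting digits and letters in the local part (the function name is illustrative):

```python
def numbers_exceed_letters(email: str) -> bool:
    """More digits than letters, or at least 6 digits, in the local part."""
    local, _, _ = email.rpartition("@")
    digits = sum(ch.isdigit() for ch in local)
    letters = sum(ch.isalpha() for ch in local)
    return digits > letters or digits >= 6
```

For `1234aa@gmail.com`, the local part has 4 digits and 2 letters, so the first condition applies.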

## Local part has no letters

If the local part does not contain any letters

## Local part has no vowels

If the local part is at least 4 characters long and contains no digits and no vowels

## Local part low vowel ratio

If the local part is at least 5 characters long and the ratio of vowels to letters is very low
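The two vowel rules can be sketched as follows. The article does not give the cutoff for a "very low" ratio, so `max_ratio=0.2` is a placeholder, and treating "y" as a consonant is also an assumption.

```python
VOWELS = set("aeiou")  # assumption: "y" is not counted as a vowel


def local_has_no_vowels(email: str) -> bool:
    """Local part of 4+ characters with no digits and no vowels."""
    local = email.rpartition("@")[0].lower()
    has_digit = any(ch.isdigit() for ch in local)
    has_vowel = any(ch in VOWELS for ch in local)
    return len(local) >= 4 and not has_digit and not has_vowel


def local_has_low_vowel_ratio(email: str, max_ratio: float = 0.2) -> bool:
    """Local part of 5+ characters whose vowels / letters ratio is low.

    max_ratio is a hypothetical cutoff chosen for illustration.
    """
    local = email.rpartition("@")[0].lower()
    letters = [ch for ch in local if ch.isalpha()]
    if len(local) < 5 or not letters:
        return False
    vowels = sum(ch in VOWELS for ch in letters)
    return vowels / len(letters) < max_ratio
```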

## Local part contains short gibberish (AI)

HG Insights uses a predictive model to detect gibberish patterns in groups of 4 letters (ex: dfgh). Every group of 4 letters is scored according to its similarity to known gibberish patterns from the training dataset. When a score exceeds the average score by a certain threshold, HG Insights flags the email as spam.
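The model itself is not described here, so the sketch below only illustrates the general idea of sliding over the local part in groups of 4 letters and scoring each group. The `score` callable is a hypothetical stand-in for the predictive model, and taking the highest group score is an assumption about how scores are combined.

```python
from typing import Callable, Iterator


def four_letter_groups(local: str) -> Iterator[str]:
    """Yield every run of 4 consecutive letters in the local part."""
    local = local.lower()
    for i in range(len(local) - 3):
        window = local[i:i + 4]
        if window.isalpha():
            yield window


def max_gibberish_score(local: str, score: Callable[[str], float]) -> float:
    """Return the highest score over all 4-letter groups (0.0 if none).

    `score` stands in for the real model, which is not public.
    """
    return max((score(group) for group in four_letter_groups(local)), default=0.0)


print(list(four_letter_groups("dfghj")))  # ['dfgh', 'fghj']
```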
