Cleaning 17 Years of Email

How do you automatically clean up 17 years' worth of email without losing your mind?

Gmail was launched on April 1, 2004 and I've had an account from almost day one! The amount of email that has been stored in there for absolutely free is, frankly, pretty astonishing.

But, slowly over time, it has started to fill up. Even though I still have over half of my free space available to me, I figured it was about time to start planning before I got myself into trouble! I needed to clean my email!

Warning!

If you are planning to follow some of the instructions/ideas here, be aware that there is a possibility that you could lose a valuable email. If you are concerned about this happening, then I suggest that you use Google Takeout to back up your email before doing anything here. Then, you will be able to recover that precious email you accidentally deleted.

Automating the Clean

I knew right from the start that going through every single email that I had received and deleting them one at a time just was not going to be an option. Even if I were able to assess each email and then delete it accordingly in 5 seconds, this would still take several days working around the clock.

There had to be a better way...

I immediately thought about automating the process using n8n. This way, I could configure the parameters that I wanted and let n8n do all the work!

First Attempt

My first try was pretty straightforward:

First n8n Workflow

I simply used two Gmail nodes. The first retrieved every Gmail message that was not a chat and that contained the word unsubscribe but none of the words license, key or password.

This was accomplished by setting the following parameters in the node:

First Gmail node parameters

The Query parameter reads as follows:

-in:chats unsubscribe -license -key -password

This basically tells the node to ignore chats and look for the term unsubscribe (which will presumably be mailing lists and newsletters) but ignore emails if they have the term license, key or password as there is a good chance that these will contain keys for software that I have purchased.

This is the same search format that is used in the Gmail Search bar:

Using the Gmail Search Bar for Testing

This is very useful for testing to see if you are getting the right search results.
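If you want to try several variations of the query, it can help to assemble the string from its parts. The following is only a small sketch in Python, not part of the n8n workflow; the function name build_query and its parameters are made up for illustration.

```python
def build_query(include, exclude, skip_chats=True):
    """Assemble a Gmail search string from plain terms (hypothetical helper)."""
    parts = []
    if skip_chats:
        parts.append("-in:chats")  # leave chats out of the results
    parts.extend(include)  # terms that must appear
    parts.extend(f"-{term}" for term in exclude)  # terms that must not appear
    return " ".join(parts)


query = build_query(["unsubscribe"], ["license", "key", "password"])
print(query)  # -in:chats unsubscribe -license -key -password
```

The printed string can be pasted into the Gmail Search bar or into the Query parameter of the Gmail node.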

It was important to set the Format to IDs as this returns the least amount of information possible to n8n. Otherwise, the loading process will take too long and consume too much memory.
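To give a rough idea of why an ID-only listing is so light, here is a sketch in Python of collecting IDs page by page. It assumes a response shaped like the Gmail API's message list (a "messages" array of id/threadId pairs plus an optional "nextPageToken"). The list_page callable and the sample data are stand-ins invented for this example, and this is not a description of how n8n does it internally.

```python
def collect_ids(list_page):
    """Gather message IDs across all result pages.

    list_page is a hypothetical callable: it takes a page token (or None)
    and returns a dict shaped like a Gmail message-list response.
    """
    ids = []
    token = None
    while True:
        page = list_page(token)
        ids.extend(m["id"] for m in page.get("messages", []))
        token = page.get("nextPageToken")
        if not token:  # no further pages
            return ids


# Stand-in for the real API: two pages of made-up IDs.
fake_pages = {
    None: {
        "messages": [{"id": "a1", "threadId": "t1"}, {"id": "a2", "threadId": "t1"}],
        "nextPageToken": "p2",
    },
    "p2": {"messages": [{"id": "a3", "threadId": "t2"}]},
}
print(collect_ids(fake_pages.get))  # ['a1', 'a2', 'a3']
```

Each entry is just a pair of short strings, with no headers, bodies or attachments to download.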

The second node simply took all of the IDs from the first node and then deleted the email with each ID.

I then ran the workflow and waited...

...and waited...

...and waited...

Eventually, I gave up waiting and went about my day. When I came back a while later, the workflow had crashed: the Gmail delete node had received a 500 error from the API, which means the request failed on the server side rather than because of anything wrong with the request itself.

The other issue I was having was that I couldn't really get a good idea of the progress of the process.

So I made some changes...

Second Attempt

I figured I'd try to resolve both of these issues with one change.

I added a SplitInBatches node between the two Gmail nodes and configured it to process 100 emails at a time. This way, the API gets a short rest between batches, and I can watch the batch count increment on the node as each batch completes.
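The batching idea itself is easy to picture outside of n8n. This is a minimal sketch in Python, with assumptions: delete_one stands in for whatever actually deletes a message, and the pause parameter is my own addition to illustrate the "short rest", not a setting of the SplitInBatches node.

```python
import time


def batches(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_in_batches(ids, delete_one, size=100, pause=0.0):
    """Delete every ID one batch at a time and return the batch count."""
    done = 0
    for batch in batches(ids, size):
        for message_id in batch:
            delete_one(message_id)  # hypothetical delete call
        done += 1
        print(f"batch {done} finished ({len(batch)} emails)")
        time.sleep(pause)  # short rest before the next batch
    return done


# Dry run: "deleting" just records the ID in a list.
deleted = []
count = delete_in_batches([f"id{i}" for i in range(250)], deleted.append)
```

With 250 made-up IDs and a batch size of 100, the dry run reports three batches (100, 100 and 50 emails), which is the same kind of progress indicator the batch count on the node provides.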

The updated workflow looked like this:

Delete Gmail Workflow v. 2.0

This worked a lot better and the workflow finished successfully! You can copy this workflow below if you wish to use it:

{
  "nodes": [
    {
      "parameters": {},
      "name": "Start",
      "type": "n8n-nodes-base.start",
      "typeVersion": 1,
      "position": [
        -40,
        240
      ]
    },
    {
      "parameters": {
        "resource": "message",
        "operation": "getAll",
        "returnAll": true,
        "additionalFields": {
          "format": "ids",
          "q": "-in:chats unsubscribe -license -key -password"
        }
      },
      "name": "Gmail",
      "type": "n8n-nodes-base.gmail",
      "typeVersion": 1,
      "position": [
        150,
        240
      ],
      "credentials": {
        "gmailOAuth2": "Gmail"
      }
    },
    {
      "parameters": {
        "resource": "message",
        "operation": "delete",
        "messageId": "={{$json[\"id\"]}}"
      },
      "name": "Delete Old Gmail",
      "type": "n8n-nodes-base.gmail",
      "typeVersion": 1,
      "position": [
        500,
        410
      ],
      "credentials": {
        "gmailOAuth2": "Gmail"
      }
    },
    {
      "parameters": {
        "batchSize": 100,
        "options": {}
      },
      "name": "SplitInBatches",
      "type": "n8n-nodes-base.splitInBatches",
      "typeVersion": 1,
      "position": [
        310,
        240
      ]
    }
  ],
  "connections": {
    "Start": {
      "main": [
        [
          {
            "node": "Gmail",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Gmail": {
      "main": [
        [
          {
            "node": "SplitInBatches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Delete Old Gmail": {
      "main": [
        [
          {
            "node": "SplitInBatches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "SplitInBatches": {
      "main": [
        [
          {
            "node": "Delete Old Gmail",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
Gmail Cleaner Workflow Code (Copy and Paste into n8n)

Refining the Cleaner

The cleaner worked quite well, reducing my total storage by 1.24 GB (from 7.07 to 5.83 GB shared across Google Drive, Gmail and Google Photos)! That's a lot of email!

But, I believe that it can be even better by refining the Query parameter in the Gmail node. Here are some other useful values that you can set for the Query parameter:

  • Specific email address - from:name@example.com
  • Promotions category - category:promotions
  • Password resets - "password reset" OR "reset password"
  • Updates before a specific date - category:updates before:2021/1/1
  • Muted emails - label:muted
  • Emails older than a month - older_than:1m

For more search operators, you can find a full list in the Gmail Help.
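If you end up with several cleanup queries, you will probably want each of them to keep the same safety net of excluded words. The sketch below, in Python, shows one way to do that. The names KEEP, CLEANUP_QUERIES and with_exclusions are invented for this example, and the OR-style query is left out because appending exclusions to it would need extra care with grouping.

```python
KEEP = ["license", "key", "password"]  # words the original query protects

CLEANUP_QUERIES = {
    "promotions": "category:promotions",
    "old_updates": "category:updates before:2021/1/1",
    "muted": "label:muted",
    "older_than_month": "older_than:1m",
}


def with_exclusions(query, keep=KEEP):
    """Append a -term exclusion for every word worth protecting."""
    return " ".join([query] + [f"-{term}" for term in keep])


for name, q in CLEANUP_QUERIES.items():
    print(f"{name}: {with_exclusions(q)}")
```

As before, each resulting string is worth testing in the Gmail Search bar before it goes anywhere near a delete node.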

Scheduling and Alerting

Now that I have figured out how to clean out my inbox, it's important to keep it that way by running the workflow on a regular basis.

To do this, I added a Cron node at the beginning of the workflow and configured it to run each day at midnight.

It is also important to be alerted when a change has occurred on one of your systems, so I added two more nodes at the end of the workflow:

  1. Calculate the number of emails deleted.
  2. Send a message to Discord indicating the number of emails deleted.

My final workflow looks like this:

Final Gmail Cleaner Workflow

When the workflow runs, it sends a message to Discord that looks something like this:

Gmail Cleaner Message in Discord
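For anyone curious what those two reporting steps boil down to, here is a rough sketch in Python. It assumes a Discord webhook, which accepts a JSON body with a "content" field. The message wording, the function names and the User-Agent value are all my own placeholders, not what the workflow actually sends.

```python
import json
import urllib.request


def build_summary(deleted_ids):
    """Return a Discord webhook payload reporting how many emails were deleted."""
    count = len(deleted_ids)
    noun = "email" if count == 1 else "emails"
    return {"content": f"Gmail Cleaner deleted {count} {noun}."}


def send_to_discord(webhook_url, payload):
    """POST the payload as JSON to a Discord webhook URL (you supply the URL)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder value; an explicit User-Agent is set as a precaution.
            "User-Agent": "gmail-cleaner-sketch/1.0",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = build_summary(["a1", "a2", "a3"])
print(payload["content"])  # Gmail Cleaner deleted 3 emails.
```

Only build_summary runs in the example; send_to_discord needs a real webhook URL before it can be called.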

Conclusion

Using an n8n workflow, it is rather trivial to search for and remove unwanted emails from your Gmail account so that it is easier to manage and maintain.
