How to Use the Remove Duplicates Node in n8n | Clean Your Data Fast

Learn how to use the Remove Duplicates node in n8n to efficiently clean your data by eliminating duplicate entries based on specific fields.

Overview

In this lesson, you will learn how to use the Remove Duplicates node in n8n to clean up your data efficiently and maintain data integrity in your automation workflows. By following this tutorial, you'll understand how to configure this node to eliminate duplicate entries based on specific fields, optimize your workflow performance, and avoid sending redundant notifications or processing duplicate data.


Why Use the Remove Duplicates Node?

Duplicate data can cause various issues in your business workflows, such as:

  • Sending multiple emails to the same customer
  • Creating duplicated orders in your system
  • Generating inaccurate reports or dashboards

The Remove Duplicates node helps you maintain clean and unique data, which is essential for reliable automation processes.


Prerequisites

Before diving into this lesson, ensure you are familiar with the following n8n nodes and concepts covered in previous lessons:

  • Merge node
  • Aggregate node
  • Set (or Edit Fields) node
  • Switch node

If you haven't worked with these nodes yet, review those lessons first, as they form the foundation for this tutorial. The lesson "How Branching Works in n8n" will also help you manage different workflow paths effectively.


Step-by-Step Guide to Using the Remove Duplicates Node

1. Understanding Your Data Flow

In the example workflow, you have:

  • Orders data fetched and merged with order details
  • Priority set based on order age and status (orders that are processing and older than 45 days are high priority)
  • A Switch node branching orders by status: pending, processing, canceled, and refunded
  • Aggregation of orders to send consolidated notifications

Your goal is to remove duplicate orders before aggregation and notification.

2. Fixing the Priority Condition

Before removing duplicates, ensure your priority logic is correct:

  • Update the priority setting in the Set node to check both order status and order age.
  • Use an AND condition to set high priority only if:
    • The order status is "processing", and
    • The order creation date is more than 45 days ago.

Example expression for priority:

{{ 
  ($json["orderStatus"] === "processing") && 
  (new Date($json["orderDate"]) < new Date(Date.now() - 45 * 24 * 60 * 60 * 1000)) 
    ? "high" 
    : "standard" 
}}

This ensures only relevant orders are flagged as high priority.
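The same logic can be checked outside n8n with a few sample orders. The following is a minimal sketch in plain JavaScript, not an n8n Code node: getPriority is a hypothetical helper, the field names orderStatus and orderDate come from the expression above, and the reference time is fixed so the results are reproducible.

```javascript
// Hypothetical helper that mirrors the priority expression above.
const DAY_MS = 24 * 60 * 60 * 1000;

function getPriority(order, now = Date.now()) {
  const isProcessing = order.orderStatus === "processing";
  // "Older than 45 days" means the order date is before (now - 45 days).
  const isOlderThan45Days =
    new Date(order.orderDate).getTime() < now - 45 * DAY_MS;
  return isProcessing && isOlderThan45Days ? "high" : "standard";
}

// Fixed reference time (an assumption for the example, not part of the workflow).
const referenceNow = Date.parse("2024-06-01T00:00:00Z");

const oldProcessing = getPriority(
  { orderStatus: "processing", orderDate: "2024-03-01T00:00:00Z" },
  referenceNow
);
const recentProcessing = getPriority(
  { orderStatus: "processing", orderDate: "2024-05-20T00:00:00Z" },
  referenceNow
);
const oldPending = getPriority(
  { orderStatus: "pending", orderDate: "2024-03-01T00:00:00Z" },
  referenceNow
);

console.log(oldProcessing, recentProcessing, oldPending);
// high standard standard
```

Only the order that satisfies both conditions is flagged as high, which is the behavior the AND condition is meant to guarantee.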

3. Adding the Remove Duplicates Node

a. Insert the Remove Duplicates Node

  • Disconnect the existing connection from the Switch node to the Aggregate node.
  • Add a new node: Remove Duplicates.
  • Connect the output of the Switch node branch (e.g., pending orders) to the Remove Duplicates node.

b. Configure the Remove Duplicates Node

  • Choose the operation: Remove items repeated within the current input.
  • Set the Comparison mode:
    • Use Selected Fields.
    • Select the field(s) you want to check for duplicates. Typically, this is the orderId field.

Example configuration:

Parameter           | Value
Operation           | Remove items repeated within the current input
Comparison          | Selected Fields
Selected Fields     | orderId
Remove Other Fields | Enabled (optional)

Note: Enabling Remove Other Fields will keep only the selected fields (e.g., orderId) in the output, reducing unnecessary data passed forward.
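To make the configuration concrete, here is a rough sketch of what a "Selected Fields" comparison does conceptually. It is plain JavaScript, not the node's actual implementation: removeDuplicates is a hypothetical function, the sample data is invented, and it assumes the first occurrence of each key is the one that is kept.

```javascript
// Conceptual sketch only: keep the first item seen for each combination
// of the selected fields, optionally dropping every other field.
function removeDuplicates(items, fields, removeOtherFields = false) {
  const seen = new Set();
  const result = [];
  for (const item of items) {
    // Build the comparison key from the selected fields only.
    const key = JSON.stringify(fields.map((field) => item[field]));
    if (seen.has(key)) continue; // duplicate: skip it
    seen.add(key);
    result.push(
      removeOtherFields
        ? Object.fromEntries(fields.map((field) => [field, item[field]]))
        : item
    );
  }
  return result;
}

// Invented sample data for the pending branch.
const pendingOrders = [
  { orderId: 101, customerEmail: "a@example.com" },
  { orderId: 102, customerEmail: "b@example.com" },
  { orderId: 101, customerEmail: "a@example.com" },
];

const uniqueOrders = removeDuplicates(pendingOrders, ["orderId"]);
const idsOnly = removeDuplicates(pendingOrders, ["orderId"], true);

console.log(uniqueOrders.length); // 2
console.log(idsOnly); // [ { orderId: 101 }, { orderId: 102 } ]
```

The second call shows the effect of Remove Other Fields: the customerEmail field no longer reaches later steps.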

c. Test the Node

  • Run the node to verify duplicates are removed.
  • If the input contained duplicates, the output should have fewer items, each with a distinct orderId.

4. Aggregate Unique Orders

  • Connect the output of the Remove Duplicates node to the Aggregate node.
  • The Aggregate node will combine the unique orders into a single item for sending consolidated notifications.
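As an illustration of the aggregation step, the sketch below collapses a list of unique orders into one summary object and builds a notification line from it. The aggregateOrders function and the message wording are assumptions made for this example; the Aggregate node's real output shape depends on how you configure it.

```javascript
// Hypothetical stand-in for the aggregation step: many items in, one item out.
function aggregateOrders(orders) {
  return {
    orderIds: orders.map((order) => order.orderId),
    count: orders.length,
  };
}

// Output of the deduplication step (invented sample data).
const uniquePending = [{ orderId: 101 }, { orderId: 102 }];

const summary = aggregateOrders(uniquePending);
const message = `${summary.count} pending orders: ${summary.orderIds.join(", ")}`;

console.log(message); // 2 pending orders: 101, 102
```

Because duplicates were removed first, the count and the list in the message reflect each order exactly once.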

5. Repeat for Other Branches

  • For each branch from the Switch node (e.g., processing, canceled, refunded), add separate Remove Duplicates nodes.
  • This prevents duplicates within each category and avoids mixing data from different branches.
  • After removing duplicates, aggregate each branch separately before sending notifications.

6. Merging Branches with Similar Actions

  • If multiple branches share the same final notification action (e.g., canceled and refunded orders both need email and Slack notifications), merge them with a Merge node before removing duplicates.
  • Then apply one Remove Duplicates node followed by aggregation and notification.
  • For more details on merging data streams, refer to the n8n Merge node documentation.
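The merge-then-deduplicate pattern can be sketched as follows. This assumes the Merge node simply appends the items of both branches into one list; the sample orders are invented for illustration.

```javascript
// Invented sample data for two branches that share the same notification.
const canceled = [
  { orderId: 201, orderStatus: "canceled" },
  { orderId: 201, orderStatus: "canceled" }, // duplicate within the branch
];
const refunded = [
  { orderId: 305, orderStatus: "refunded" },
  { orderId: 306, orderStatus: "refunded" },
];

// Append-style merge: both branches become a single stream.
const merged = [...canceled, ...refunded];

// One deduplication pass over the merged stream, keyed on orderId.
const seenIds = new Set();
const uniqueMerged = merged.filter((order) => {
  if (seenIds.has(order.orderId)) return false;
  seenIds.add(order.orderId);
  return true;
});

console.log(uniqueMerged.map((order) => order.orderId)); // [ 201, 305, 306 ]
```

A single pass is enough here because both branches feed the same notification, so there is no need to keep their items apart.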

Example: Remove Duplicates Node JSON Configuration

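Here is an illustrative JSON snippet for a Remove Duplicates node set up to compare on the orderId field and remove the other fields. Parameter key names can differ between node versions, so export the node from your own n8n instance to confirm the exact structure: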

{
  "nodes": [
    {
      "parameters": {
        "operation": "removeDuplicates",
        "comparison": "selectedFields",
        "fields": ["orderId"],
        "removeOtherFields": true
      },
      "name": "Remove Duplicates",
      "type": "n8n-nodes-base.removeDuplicates",
      "typeVersion": 1,
      "position": [600, 300]
    }
  ]
}

Common Mistakes and Troubleshooting

Mistake 1: Removing Duplicates on All Fields

  • If you select All Fields for comparison, the node may remove nothing, because records that share an orderId often differ in some other field and therefore do not count as identical.
  • Select the specific field or fields that identify a duplicate (e.g., orderId).
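A small comparison shows why this matters. In the sketch below, dedupeBy is a hypothetical helper and fetchedAt is an invented field that stands in for any value that varies between otherwise identical orders.

```javascript
// Hypothetical helper: keep the first item for each key produced by keyFn.
function dedupeBy(items, keyFn) {
  const seen = new Set();
  return items.filter((item) => {
    const key = keyFn(item);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// The same order fetched twice; only the invented fetchedAt field differs.
const fetchedOrders = [
  { orderId: 101, fetchedAt: "2024-06-01T10:00:00Z" },
  { orderId: 101, fetchedAt: "2024-06-01T10:05:00Z" },
];

// Whole-item comparison (JSON.stringify is a simplification that also
// depends on key order) versus comparison on orderId alone.
const byAllFields = dedupeBy(fetchedOrders, (order) => JSON.stringify(order));
const byOrderId = dedupeBy(fetchedOrders, (order) => order.orderId);

console.log(byAllFields.length, byOrderId.length); // 2 1
```

Comparing whole items leaves both records in place, while comparing on orderId removes the repeat.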

Mistake 2: Using a Single Remove Duplicates Node for Multiple Branches

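  • Merging branches that need different downstream actions before removing duplicates can cause unintended behavior, because items with different statuses end up in a single stream.
  • Give each branch its own Remove Duplicates node, and merge only branches that share the same final action (as described in step 6).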

Mistake 3: Forgetting to Update Downstream Nodes After Removing Fields

  • If you enable Remove Other Fields, downstream nodes that expect other fields (e.g., customer email, order status) may fail.
  • Update expressions and references in downstream nodes accordingly.
  • For handling errors that might arise from such conditions, consider reviewing Master Error Handling in n8n.
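One way to catch this early is to compare the fields a later step relies on with the fields that actually survive. The sketch below is a hypothetical check in plain JavaScript; missingFields and the customerEmail field are illustrative names, not part of n8n.

```javascript
// Hypothetical check: which required fields are absent from at least one item?
function missingFields(items, requiredFields) {
  const missing = new Set();
  for (const item of items) {
    for (const field of requiredFields) {
      if (!(field in item)) missing.add(field);
    }
  }
  return [...missing];
}

// Items as they might look after deduplication with Remove Other Fields enabled.
const afterRemoveDuplicates = [{ orderId: 101 }, { orderId: 102 }];

// A notification step that needs both the order ID and the customer's email.
const missing = missingFields(afterRemoveDuplicates, ["orderId", "customerEmail"]);

console.log(missing); // [ 'customerEmail' ]
```

If the list is not empty, either disable Remove Other Fields or add the missing fields to the selection your workflow keeps.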

Troubleshooting Tips

  • Use the Execute Node feature to test nodes independently.
  • Check the output data structure after the Remove Duplicates node to confirm duplicates are removed as expected.
  • Review node connections and order of execution to ensure data flows correctly.
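When inspecting output by eye is impractical, a quick count of repeated values confirms whether any duplicates remain. This is a minimal sketch with a hypothetical findDuplicateValues helper and invented sample data.

```javascript
// Hypothetical helper: return every value of `field` that occurs more than once.
function findDuplicateValues(items, field) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item[field], (counts.get(item[field]) ?? 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([value]) => value);
}

// Invented sample data: input before and output after deduplication.
const beforeDedup = [{ orderId: 101 }, { orderId: 102 }, { orderId: 101 }];
const afterDedup = [{ orderId: 101 }, { orderId: 102 }];

const duplicatesBefore = findDuplicateValues(beforeDedup, "orderId");
const duplicatesAfter = findDuplicateValues(afterDedup, "orderId");

console.log(duplicatesBefore); // [ 101 ]
console.log(duplicatesAfter); // []
```

An empty result for the node's output indicates that every orderId appears only once.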

Best Practices

  • Use Remove Duplicates early in your workflow to minimize redundant processing.
  • Combine it with Aggregate nodes to consolidate notifications and reduce noise.
  • Keep your workflow organized by separating branches clearly and applying duplicates removal within each branch.
  • Regularly test complex conditions (like priority setting) with sample data.
  • When you need to process items one at a time or in batches, consider the Loop Over Items node.


Quick Reference Cheat Sheet

Step                         | Node/Action            | Key Settings
Fetch and merge order data   | HTTP Request / Merge   | -
Set priority                 | Set node               | AND condition: orderStatus is "processing" AND orderDate is more than 45 days ago
Branch orders by status      | Switch node            | Cases: pending, processing, canceled, refunded
Remove duplicates per branch | Remove Duplicates node | Operation: Remove items repeated within the current input; Compare by orderId; Remove Other Fields: on (optional)
Aggregate unique orders      | Aggregate node         | Group relevant fields; Output count or list
Send notifications           | Email / Slack nodes    | Use aggregated data; Update message content

By following this tutorial, you will be able to clean your business data efficiently with the Remove Duplicates node in n8n, ensuring your automations run smoothly without redundant or conflicting data. This is a crucial step in building robust and scalable automation workflows.

Frequently Asked Questions

In the Remove Duplicates node, select 'Remove items repeated within the current input' and choose 'Selected Fields' mode, then specify the field(s) like 'orderId' to check for duplicates.

The Remove Duplicates node removes duplicates only within the input it receives, so it cannot deduplicate across separate branches on its own. To compare items from several branches, combine them with a Merge node before applying it.

Removing duplicates before aggregation prevents redundant processing, avoids sending multiple notifications for the same item, and ensures accurate consolidated data.

Set priority flags using a Set node with conditions (e.g., order status and age), then pass filtered data through the Remove Duplicates node to clean duplicates only among relevant high or standard priority items.

If no specific fields are selected, the node compares entire items for duplicates, which might not be efficient or accurate if only certain fields determine duplicates.

Dheeraj Sharma

AI Systems Builder
Creator of the n8n Zero to Hero course (42 lessons, 31+ hours). I help solopreneurs build AI systems that grow revenue without growing workload.
