Overview
In this lesson, you will learn how to use the Remove Duplicates node in n8n to clean up your data efficiently and maintain data integrity in your automation workflows. By following this tutorial, you'll understand how to configure this node to eliminate duplicate entries based on specific fields, optimize your workflow performance, and avoid sending redundant notifications or processing duplicate data.
Why Use the Remove Duplicates Node?
Duplicate data can cause various issues in your business workflows, such as:
- Sending multiple emails to the same customer
- Creating duplicated orders in your system
- Generating inaccurate reports or dashboards
The Remove Duplicates node helps you maintain clean and unique data, which is essential for reliable automation processes.
Prerequisites
Before diving into this lesson, ensure you are familiar with the following n8n nodes and concepts covered in previous lessons:
- Merge node
- Aggregate node
- Set (or Edit Fields) node
- Switch node
If you haven't worked with these nodes yet, it is recommended to review those lessons first as they form the foundation for this tutorial. Also, understanding How Branching Works in n8n will help you manage different workflow paths effectively.
Step-by-Step Guide to Using the Remove Duplicates Node
1. Understanding Your Data Flow
In the example workflow, you have:
- Orders data fetched and merged with order details
- Priority set based on order age and status (processing and older than 45 days are high priority)
- A Switch node branching orders by status: pending, processing, canceled, and refunded
- Aggregation of orders to send consolidated notifications
Your goal is to remove duplicate orders before aggregation and notification.
2. Fixing the Priority Condition
Before removing duplicates, ensure your priority logic is correct:
- Update the priority setting in the Set node to check both order status and order age.
- Use an AND condition to set high priority only if:
  - The order status is "processing", and
  - The order creation date is more than 45 days ago.
Example expression for priority:
{{
($json["orderStatus"] === "processing") &&
(new Date($json["orderDate"]) < new Date(Date.now() - 45 * 24 * 60 * 60 * 1000))
? "high"
: "standard"
}}
This ensures only relevant orders are flagged as high priority.
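The same logic can be checked outside n8n with plain JavaScript. The following is a minimal sketch, assuming the orderStatus and orderDate field names from the expression above; the getPriority function and its now parameter are hypothetical helpers added so the result does not depend on the current date.

```javascript
// Hypothetical helper mirroring the n8n expression above.
// `now` is passed in (instead of read inside) so results are reproducible.
const DAY_MS = 24 * 60 * 60 * 1000;

function getPriority(order, now = Date.now()) {
  const cutoff = new Date(now - 45 * DAY_MS);
  const isProcessing = order.orderStatus === "processing";
  const isOld = new Date(order.orderDate) < cutoff;
  // Both conditions must hold (AND) for the order to be high priority.
  return isProcessing && isOld ? "high" : "standard";
}

// Sample orders evaluated against a fixed reference date.
const referenceNow = Date.parse("2024-06-01T00:00:00Z");
const sampleOrders = [
  { orderId: 1, orderStatus: "processing", orderDate: "2024-01-10T00:00:00Z" },
  { orderId: 2, orderStatus: "processing", orderDate: "2024-05-20T00:00:00Z" },
  { orderId: 3, orderStatus: "pending", orderDate: "2024-01-10T00:00:00Z" },
];
const priorities = sampleOrders.map((order) => getPriority(order, referenceNow));
console.log(priorities); // [ 'high', 'standard', 'standard' ]
```

Only the first order is both processing and older than 45 days. An orderDate that cannot be parsed produces an invalid date, the comparison evaluates to false, and the order falls back to standard.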
3. Adding the Remove Duplicates Node
a. Insert the Remove Duplicates Node
- Disconnect the existing connection from the Switch node to the Aggregate node.
- Add a new node: Remove Duplicates.
- Connect the output of the Switch node branch (e.g., pending orders) to the Remove Duplicates node.
b. Configure the Remove Duplicates Node
- Choose the operation: Remove items repeated within the current input.
- Set the Comparison mode:
  - Use Selected Fields.
- Select the field(s) you want to check for duplicates. Typically, this is the orderId field.
Example configuration:
| Parameter | Value |
|---|---|
| Operation | Remove items repeated within the current input |
| Comparison | Selected Fields |
| Selected Fields | orderId |
| Remove Other Fields | Enabled (optional) |
Note: Enabling Remove Other Fields will keep only the selected fields (e.g., orderId) in the output, reducing unnecessary data passed forward.
c. Test the Node
- Run the node to verify duplicates are removed.
- You should see fewer output items than input items, each with a unique order ID.
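To make the expected result concrete, here is a minimal JavaScript sketch of what deduplicating on selected fields does. It is not n8n's actual implementation: the removeDuplicates function, the sample data, and the choice to keep the first occurrence of each key are assumptions made for illustration.

```javascript
// Sketch of "Selected Fields" deduplication; not n8n's actual implementation.
function removeDuplicates(items, fields, removeOtherFields = false) {
  const seen = new Set();
  const result = [];
  for (const item of items) {
    // Build a comparison key from the selected fields only.
    const key = JSON.stringify(fields.map((field) => item[field]));
    if (seen.has(key)) continue; // an item with this key was already kept
    seen.add(key);
    result.push(
      removeOtherFields
        ? Object.fromEntries(fields.map((field) => [field, item[field]]))
        : item
    );
  }
  return result;
}

// Hypothetical pending orders; orderId 101 appears twice.
const pendingOrders = [
  { orderId: 101, customerEmail: "a@example.com" },
  { orderId: 102, customerEmail: "b@example.com" },
  { orderId: 101, customerEmail: "a@example.com" },
];

const uniqueOrders = removeDuplicates(pendingOrders, ["orderId"]);
console.log(uniqueOrders.length); // 2

// With the "remove other fields" behaviour, only orderId survives.
const idsOnly = removeDuplicates(pendingOrders, ["orderId"], true);
console.log(idsOnly); // [ { orderId: 101 }, { orderId: 102 } ]
```

Three input items become two, which is the kind of reduction to look for in the node's output panel.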
4. Aggregate Unique Orders
- Connect the output of the Remove Duplicates node to the Aggregate node.
- The Aggregate node will combine the unique orders into a single item for sending consolidated notifications.
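The effect of aggregation can be pictured as collapsing many items into one item whose field holds every value. The sketch below illustrates that idea in plain JavaScript; the aggregateField function and the message format are hypothetical, not the node's internals.

```javascript
// Sketch: collapse many items into a single item holding all values of one field.
function aggregateField(items, field) {
  return { [field]: items.map((item) => item[field]) };
}

// Hypothetical output of the Remove Duplicates step.
const dedupedOrders = [{ orderId: 101 }, { orderId: 102 }, { orderId: 103 }];

const aggregated = aggregateField(dedupedOrders, "orderId");
console.log(aggregated); // { orderId: [ 101, 102, 103 ] }

// One consolidated notification instead of one message per order.
const message = `Pending orders (${aggregated.orderId.length}): ${aggregated.orderId.join(", ")}`;
console.log(message); // Pending orders (3): 101, 102, 103
```

Because duplicates were removed first, the count and the list in the message reflect distinct orders only.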
5. Repeat for Other Branches
- For each branch from the Switch node (e.g., processing, canceled, refunded), add separate Remove Duplicates nodes.
- This prevents duplicates within each category and avoids mixing data from different branches.
- After removing duplicates, aggregate each branch separately before sending notifications.
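The per-branch pattern can be sketched in plain JavaScript as "split by status, deduplicate each group, then summarize each group". The dedupeBy helper and the sample orders are hypothetical; the grouping loop stands in for the Switch node.

```javascript
// Keep the first item seen for each value of `field`.
function dedupeBy(items, field) {
  const seen = new Set();
  return items.filter((item) => {
    if (seen.has(item[field])) return false;
    seen.add(item[field]);
    return true;
  });
}

const allOrders = [
  { orderId: 1, orderStatus: "pending" },
  { orderId: 1, orderStatus: "pending" },
  { orderId: 2, orderStatus: "processing" },
  { orderId: 3, orderStatus: "canceled" },
  { orderId: 3, orderStatus: "canceled" },
  { orderId: 4, orderStatus: "refunded" },
];

// Stand-in for the Switch node: one array of orders per status.
const branches = {};
for (const order of allOrders) {
  if (!branches[order.orderStatus]) branches[order.orderStatus] = [];
  branches[order.orderStatus].push(order);
}

// One deduplication pass per branch, then one list of IDs per branch.
const summary = {};
for (const [status, items] of Object.entries(branches)) {
  summary[status] = dedupeBy(items, "orderId").map((order) => order.orderId);
}
console.log(summary);
// { pending: [ 1 ], processing: [ 2 ], canceled: [ 3 ], refunded: [ 4 ] }
```

Each status ends up with its own clean list, so a notification for one category never contains orders from another.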
6. Merging Branches with Similar Actions
- If multiple branches share the same final notification action (e.g., canceled and refunded orders both need email and Slack notifications), merge them with a Merge node before removing duplicates.
- Then apply one Remove Duplicates node followed by aggregation and notification.
- For more details on merging data streams, refer to the n8n Merge node documentation.
Example: Remove Duplicates Node JSON Configuration
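Here is an example JSON snippet for the Remove Duplicates node configured to remove duplicates based on the orderId field and remove other fields. Treat the parameter keys as illustrative: they can differ between node versions, so copy the node from your own n8n editor to see the exact JSON it produces.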
{
"nodes": [
{
"parameters": {
"operation": "removeDuplicates",
"comparison": "selectedFields",
"fields": ["orderId"],
"removeOtherFields": true
},
"name": "Remove Duplicates",
"type": "n8n-nodes-base.removeDuplicates",
"typeVersion": 1,
"position": [600, 300]
}
]
}
Common Mistakes and Troubleshooting
Mistake 1: Removing Duplicates on All Fields
- If you select All Fields for comparison, items that share an orderId but differ in any other field (for example, a timestamp) count as distinct, so few or no duplicates are removed.
- Always select the specific fields that identify a duplicate (e.g., orderId).
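The difference between the two comparison modes is easy to reproduce in plain JavaScript. In this sketch the dedupe helper and the fetchedAt field are hypothetical; the point is only that one differing field is enough to defeat an all-fields comparison.

```javascript
// Keep the first item for each key produced by keyFn.
function dedupe(items, keyFn) {
  const seen = new Set();
  return items.filter((item) => {
    const key = keyFn(item);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// The same order fetched twice; only the (hypothetical) fetchedAt field differs.
const fetchedItems = [
  { orderId: 101, fetchedAt: "2024-06-01T10:00:00Z" },
  { orderId: 101, fetchedAt: "2024-06-01T10:05:00Z" },
];

// Comparing every field: the differing fetchedAt makes both items look unique.
const byAllFields = dedupe(fetchedItems, (item) => JSON.stringify(item));
// Comparing only orderId: the second item is recognized as a duplicate.
const byOrderId = dedupe(fetchedItems, (item) => item.orderId);

console.log(byAllFields.length, byOrderId.length); // 2 1
```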
Mistake 2: Using a Single Remove Duplicates Node for Multiple Branches
- Merging branches that need different downstream actions before removing duplicates mixes categories and can cause unintended behavior.
- Each branch should have its own Remove Duplicates node or be merged carefully based on your workflow logic.
Mistake 3: Forgetting to Update Downstream Nodes After Removing Fields
- If you enable Remove Other Fields, downstream nodes that expect other fields (e.g., customer email, order status) may fail.
- Update expressions and references in downstream nodes accordingly.
- For handling errors that might arise from such conditions, consider reviewing Master Error Handling in n8n.
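One way to catch this early is to check which required fields an item still carries before it reaches a notification step. The sketch below is a hypothetical guard in plain JavaScript; the missingFields function and the sample fields are illustrative, not an n8n feature.

```javascript
// Hypothetical guard: list the fields a downstream step needs but an item lacks.
function missingFields(item, requiredFields) {
  return requiredFields.filter((field) => !(field in item));
}

const fullItem = { orderId: 101, customerEmail: "a@example.com", orderStatus: "pending" };
// What an item looks like when only the compared field is kept.
const strippedItem = { orderId: 101 };

const requiredByEmailStep = ["orderId", "customerEmail"];
console.log(missingFields(fullItem, requiredByEmailStep)); // []
console.log(missingFields(strippedItem, requiredByEmailStep)); // [ 'customerEmail' ]
```

A non-empty result indicates that the downstream expression would read an undefined value, which is the failure described above.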
Troubleshooting Tips
- Use the Execute Node feature to test nodes independently.
- Check the output data structure after the Remove Duplicates node to confirm duplicates are removed as expected.
- Review node connections and order of execution to ensure data flows correctly.
Best Practices
- Use Remove Duplicates early in your workflow to minimize redundant processing.
- Combine it with Aggregate nodes to consolidate notifications and reduce noise.
- Keep your workflow organized by separating branches clearly and applying duplicate removal within each branch.
- Regularly test complex conditions (like priority setting) with sample data.
- When working with iterative data, consider using the Loop Over Items node to process items conditionally.
Additional Resources
Quick Reference Cheat Sheet
| Step | Node/Action | Key Settings |
|---|---|---|
| Fetch and merge order data | HTTP Request / Merge | - |
| Set priority | Set node | Condition with AND: orderStatus == "processing" AND orderDate more than 45 days ago |
| Branch orders by status | Switch node | Cases: pending, processing, canceled, refunded |
| Remove duplicates per branch | Remove Duplicates node | Operation: current input duplicates; Compare by orderId; Remove other fields: ON |
| Aggregate unique orders | Aggregate node | Group relevant fields; Output count or list |
| Send notifications | Email / Slack nodes | Use aggregated data; Update message content |
By following this tutorial, you will be able to clean your business data efficiently with the Remove Duplicates node in n8n, ensuring your automations run smoothly without redundant or conflicting data. This is a crucial step in building robust and scalable automation workflows.