Hey Nagisa,
In Sparkflows, the "Filter Unique" node performs exactly the operation you're looking for. Add the node after your input dataset and specify the column(s) that determine uniqueness. It produces two output dataframes: the lower edge carries the unique records, while the upper edge carries the duplicate records that were found.
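If it helps to see the logic the node applies, here is a minimal plain-Python sketch of the same split (the function name and sample data are hypothetical; inside Sparkflows this runs on Spark dataframes, not Python lists):

```python
def split_unique(rows, key_cols):
    """Split rows into (uniques, duplicates) based on key_cols.

    The first row seen for each key combination is kept as unique;
    any later row with the same key values is routed to duplicates,
    mirroring the two output edges of the Filter Unique node.
    """
    seen = set()
    uniques, duplicates = [], []
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            uniques.append(row)
    return uniques, duplicates

# Hypothetical sample data: uniqueness determined by the "email" column
rows = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": "b@x.com"},
    {"id": 3, "email": "a@x.com"},  # same email as id 1 -> duplicate
]
uniq, dup = split_unique(rows, ["email"])
```

Here `uniq` holds the rows with ids 1 and 2, and `dup` holds the row with id 3, just as the lower and upper edges of the node would.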