Hey Chris,
In Sparkflows, you can use the ‘Union All’ and ‘Union Distinct’ processors for this. ‘Union All’ produces an output dataset containing every row from the incoming datasets, so rows that appear in more than one input (or more than once in the same input) show up as duplicates. ‘Union Distinct’ combines the same rows but removes the duplicates.
For more information, see the Sparkflows documentation:
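If it helps to see the difference, here is a rough plain-Python sketch of the two behaviours. It is not the Sparkflows API: the function names `union_all` and `union_distinct`, and the sample rows, are made up purely for illustration, and rows are modeled as simple tuples.

```python
# Plain-Python illustration of the two union semantics (NOT the Sparkflows API).
# Function names and sample data are hypothetical.

def union_all(*datasets):
    """Concatenate the rows of every input dataset, keeping duplicates."""
    result = []
    for rows in datasets:
        result.extend(rows)
    return result

def union_distinct(*datasets):
    """Concatenate the rows, then keep only the first occurrence of each row."""
    seen = set()
    result = []
    for row in union_all(*datasets):
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result

customers_a = [(1, "Ann"), (2, "Bob")]
customers_b = [(2, "Bob"), (3, "Cy")]

all_rows = union_all(customers_a, customers_b)
distinct_rows = union_distinct(customers_a, customers_b)

print(len(all_rows))       # (2, "Bob") is counted twice here
print(len(distinct_rows))  # (2, "Bob") is counted once here
```

The sketch keeps rows in input order only because Python lists are ordered; I would not assume the processors guarantee any particular row order in their output.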
https://docs.sparkflows.io/en/latest/user-guide/data-preparation/join-union.html?highlight=union%20distinct#union-distinct
https://docs.sparkflows.io/en/latest/processors/05-JoinUnion/unionall.html?highlight=union%20distinct#union-all