pyspark.sql.DataFrame.union
- DataFrame.union(other)
- Return a new DataFrame containing the union of rows in this and another DataFrame.
  New in version 2.0.0.
  Changed in version 3.4.0: Supports Spark Connect.
- Parameters
  - other (DataFrame): Another DataFrame whose rows are combined with this one.
- Returns
  - DataFrame: A new DataFrame containing the rows of both DataFrames.
- See also
  - DataFrame.unionAll
- Notes
  This method performs a SQL-style set union of the rows from both DataFrame objects, with no automatic deduplication of elements (equivalent to UNION ALL in SQL).
  Use the distinct() method to perform deduplication of rows.
  The method resolves columns by position (not by name), following the standard behavior in SQL.
- Examples
  Example 1: Combining two DataFrames with the same schema

      >>> df1 = spark.createDataFrame([(1, 'A'), (2, 'B')], ['id', 'value'])
      >>> df2 = spark.createDataFrame([(3, 'C'), (4, 'D')], ['id', 'value'])
      >>> df3 = df1.union(df2)
      >>> df3.show()
      +---+-----+
      | id|value|
      +---+-----+
      |  1|    A|
      |  2|    B|
      |  3|    C|
      |  4|    D|
      +---+-----+

  Example 2: Combining two DataFrames with different schemas

      >>> from pyspark.sql.functions import lit
      >>> df1 = spark.createDataFrame([(100001, 1), (100002, 2)], schema="id LONG, money INT")
      >>> df2 = spark.createDataFrame([(3, 100003), (4, 100003)], schema="money INT, id LONG")
      >>> df1 = df1.withColumn("age", lit(30))
      >>> df2 = df2.withColumn("age", lit(40))
      >>> df3 = df1.union(df2)
      >>> df3.show()
      +------+------+---+
      |    id| money|age|
      +------+------+---+
      |100001|     1| 30|
      |100002|     2| 30|
      |     3|100003| 40|
      |     4|100003| 40|
      +------+------+---+

  Example 3: Combining two DataFrames with mismatched columns

      >>> df1 = spark.createDataFrame([(1, 2)], ["A", "B"])
      >>> df2 = spark.createDataFrame([(3, 4)], ["C", "D"])
      >>> df3 = df1.union(df2)
      >>> df3.show()
      +---+---+
      |  A|  B|
      +---+---+
      |  1|  2|
      |  3|  4|
      +---+---+

  Example 4: Combining duplicate rows from two different DataFrames

      >>> df1 = spark.createDataFrame([(1, 'A'), (2, 'B'), (3, 'C')], ['id', 'value'])
      >>> df2 = spark.createDataFrame([(3, 'C'), (4, 'D')], ['id', 'value'])
      >>> df3 = df1.union(df2).distinct().sort("id")
      >>> df3.show()
      +---+-----+
      | id|value|
      +---+-----+
      |  1|    A|
      |  2|    B|
      |  3|    C|
      |  4|    D|
      +---+-----+