pyspark.sql.functions.arrays_zip
pyspark.sql.functions.arrays_zip(*cols: ColumnOrName) → pyspark.sql.column.Column
Collection function: returns a merged array of structs in which the N-th struct contains the N-th values of all input arrays. If one of the arrays is shorter than the others, the missing elements in the resulting structs are null.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Examples

>>> from pyspark.sql.functions import arrays_zip
>>> df = spark.createDataFrame([([1, 2, 3], [2, 4, 6], [3, 6])], ['vals1', 'vals2', 'vals3'])
>>> df = df.select(arrays_zip(df.vals1, df.vals2, df.vals3).alias('zipped'))
>>> df.show(truncate=False)
+------------------------------------+
|zipped                              |
+------------------------------------+
|[{1, 2, 3}, {2, 4, 6}, {3, 6, NULL}]|
+------------------------------------+
>>> df.printSchema()
root
 |-- zipped: array (nullable = true)
 |    |-- element: struct (containsNull = false)
 |    |    |-- vals1: long (nullable = true)
 |    |    |-- vals2: long (nullable = true)
 |    |    |-- vals3: long (nullable = true)
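The null-padding rule can be illustrated without a Spark session. The following is a minimal plain-Python sketch that models the behaviour using `itertools.zip_longest`; the helper `arrays_zip_model` is hypothetical and not part of PySpark, structs are approximated as dictionaries, and `None` stands in for SQL NULL. It is an analogy for the documented semantics, not how Spark implements the function.

```python
from itertools import zip_longest


def arrays_zip_model(names, *arrays):
    """Hypothetical plain-Python model of arrays_zip (not a PySpark API).

    Each output dict plays the role of one struct; field names come from
    `names`, and shorter input arrays are padded with None.
    """
    # zip_longest keeps going until the longest input is exhausted,
    # filling the gaps left by shorter inputs with None.
    return [
        dict(zip(names, values))
        for values in zip_longest(*arrays, fillvalue=None)
    ]


# Same inputs as the documented example: the third array is one element short.
zipped = arrays_zip_model(
    ["vals1", "vals2", "vals3"],
    [1, 2, 3],
    [2, 4, 6],
    [3, 6],
)
print(zipped)
```

The last dictionary carries `None` under `vals3`, which corresponds to the `NULL` in the third struct of the `zipped` column shown in the example.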