pyspark.sql.functions.xxhash64
- pyspark.sql.functions.xxhash64(*cols)
- Calculates the hash code of the given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column. The hash computation uses an initial seed of 42.
- New in version 3.0.0.
- Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- cols: Column or str
- one or more columns to compute on. 
 
- Returns
- Column
- hash value as long column. 
 
- Examples

    >>> from pyspark.sql.functions import xxhash64
    >>> df = spark.createDataFrame([('ABC', 'DEF')], ['c1', 'c2'])

- Hash for one column

    >>> df.select(xxhash64('c1').alias('hash')).show()
    +-------------------+
    |               hash|
    +-------------------+
    |4105715581806190027|
    +-------------------+

- Two or more columns

    >>> df.select(xxhash64('c1', 'c2').alias('hash')).show()
    +-------------------+
    |               hash|
    +-------------------+
    |3233247871021311208|
    +-------------------+