pyspark.sql.functions.levenshtein
- pyspark.sql.functions.levenshtein(left, right, threshold=None)
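- Computes the Levenshtein distance of the two given strings.
- New in version 1.5.0.
- Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- left, right: Column or column name
- The two string columns to compare.
- threshold: int, optional
- If set, the distance is returned only when it is less than or equal to the threshold; otherwise -1 is returned.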
- Returns
- Column
- Levenshtein distance as integer value. 
 
- Examples
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([('kitten', 'sitting',)], ['l', 'r'])
>>> df.select('*', sf.levenshtein('l', 'r')).show()
+------+-------+-----------------+
|     l|      r|levenshtein(l, r)|
+------+-------+-----------------+
|kitten|sitting|                3|
+------+-------+-----------------+

>>> df.select('*', sf.levenshtein(df.l, df.r, 2)).show()
+------+-------+--------------------+
|     l|      r|levenshtein(l, r, 2)|
+------+-------+--------------------+
|kitten|sitting|                  -1|
+------+-------+--------------------+
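
To make the two results above easier to check by hand, here is a minimal pure-Python sketch of the same computation. The function name `levenshtein_ref` is hypothetical and this is not Spark's implementation; the threshold rule (return -1 when the distance exceeds the threshold) is an assumption inferred from the second example, and null inputs are not modelled.

```python
from typing import Optional


def levenshtein_ref(left: str, right: str, threshold: Optional[int] = None) -> int:
    """Hypothetical reference for illustration only; not Spark's implementation."""
    # prev[j] is the edit distance between the left prefix processed so far
    # and the first j characters of right. It starts as the distance from
    # the empty string, which is simply j insertions.
    prev = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        # Turning i characters of left into the empty string takes i deletions.
        curr = [i]
        for j, right_char in enumerate(right, start=1):
            substitution_cost = 0 if left_char == right_char else 1
            curr.append(
                min(
                    prev[j] + 1,                      # delete from left
                    curr[j - 1] + 1,                  # insert into left
                    prev[j - 1] + substitution_cost,  # substitute or match
                )
            )
        prev = curr
    distance = prev[-1]
    # Assumed threshold rule, inferred from the example above:
    # a distance greater than the threshold is reported as -1.
    if threshold is not None and distance > threshold:
        return -1
    return distance


print(levenshtein_ref("kitten", "sitting"))     # no threshold
print(levenshtein_ref("kitten", "sitting", 2))  # threshold below the distance
```

Under these assumptions the two calls mirror the DataFrame examples: the first reports the full distance, and the second reports -1 because the distance is above the threshold of 2.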