Which version of Python does Apache Spark support (2 or 3)? If it supports both, are there any performance considerations when choosing between Python 2 and Python 3 with Apache Spark?
At least since Spark 1.2.1, the default version of Python is 2.7, unless overridden with PYSPARK_PYTHON or PYSPARK_DRIVER_PYTHON (see bin/pyspark).
Python 3 is supported as of Spark 1.4.0 (see SPARK-4897 and the Spark 1.4.0 release notes).
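As a rough illustration of how the interpreter could be selected, here is a minimal sketch that builds an environment with PYSPARK_PYTHON (and optionally PYSPARK_DRIVER_PYTHON) set. The helper name pyspark_env is hypothetical, and "python3" is assumed to be an interpreter available on the PATH of the driver and the workers.

```python
import os


def pyspark_env(worker_python, driver_python=None, base_env=None):
    """Return a copy of an environment with the PySpark interpreter variables set.

    pyspark_env is a hypothetical helper for illustration, not part of Spark.
    """
    # Copy so the caller's (or the process's) environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    # Interpreter selected via PYSPARK_PYTHON.
    env["PYSPARK_PYTHON"] = worker_python
    # Optional separate interpreter for the driver only.
    if driver_python is not None:
        env["PYSPARK_DRIVER_PYTHON"] = driver_python
    return env


# Assumption: "python3" resolves to a Python 3 interpreter on every node.
env = pyspark_env("python3", base_env={})
print(env)
```

The resulting dictionary could then be passed as the env argument of subprocess.run when launching bin/pyspark, or the same variables can simply be exported in the shell before starting it.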