I've been trying to set up PYSPARK_PYTHON from a Jupyter notebook (using JupyterLab) so that it points at a specific conda env, but I cannot find a way to make it work. I have found some examples using:
import os
os.environ['PYSPARK_PYTHON'] = "<the path>"
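For reference, the fuller version of that cell looked roughly like this (the conda path is a placeholder for my env's interpreter, and I added PYSPARK_DRIVER_PYTHON as well just in case the driver side matters):

import os

# Placeholder; in practice this points at the conda env's interpreter,
# e.g. something like /opt/conda/envs/<env>/bin/python
conda_python = "<the path>"

# As far as I understand, these have to be set BEFORE the SparkSession is
# created, since Spark reads them when it launches the Python workers.
os.environ["PYSPARK_PYTHON"] = conda_python
os.environ["PYSPARK_DRIVER_PYTHON"] = conda_python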
But it did not work, so I also tried:
import pyspark
from pyspark.sql import SQLContext

spark = pyspark.sql.SparkSession.builder \
    .master("yarn") \
    .appName(session_name) \
    .config("spark.yarn.appMasterEnv.PYSPARK_PYTHON", "<the path>") \
    .enableHiveSupport() \
    .getOrCreate()
sc = spark.sparkContext
sqlContext = SQLContext(sc)
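From what I've read, spark.yarn.appMasterEnv.* only sets environment variables on the YARN Application Master, which in yarn-client mode is not where the Python workers run, so maybe something like the following is needed instead. This is just a sketch, assuming Spark 2.1+, where the spark.pyspark.python config is available:

import pyspark

# spark.pyspark.python tells Spark which interpreter to launch for the
# Python workers; on older versions, spark.executorEnv.PYSPARK_PYTHON
# plays a similar role for the executors.
spark = pyspark.sql.SparkSession.builder \
    .master("yarn") \
    .appName(session_name) \
    .config("spark.pyspark.python", "<the path>") \
    .enableHiveSupport() \
    .getOrCreate()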
But it never uses the Python version at the specified path. The question is: is it possible the config is being ignored? Does something else need to be done in the notebook?
I'm using yarn-client mode.
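To see which interpreter actually gets used, I've been running this quick check (just a diagnostic I put together) that compares the driver's Python with the one a task runs under:

import sys

# Interpreter running the driver (i.e. the notebook kernel itself)
print("driver:  ", sys.executable)

# Interpreter running on an executor: evaluate sys.executable inside a task
exec_python = sc.parallelize([0], 1).map(lambda _: __import__("sys").executable).first()
print("executor:", exec_python)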