Wednesday, 19 May 2021

Jupyter notebook on EMR not printing output while code is running Pyspark

I am running a very simple script in a Jupyter PySpark notebook on EMR, but it does not print results as it runs; all of the output appears only once the cell has finished. Here is the code:

import time
import sys

for i in range(10):
    print(i)
    time.sleep(1)

This waits 10 seconds and then prints:

0
1
2
3
4
5
6
7
8
9

I would like to print results as they happen. I have tried to flush them using

for i in range(10):
    print(i)
    sys.stdout.flush()  # explicit flush after every print
    time.sleep(1)

and print(i, flush=True), to no avail: in both cases the output still arrives all at once at the end. Any suggestions?
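
One check that may help narrow this down is to stamp each line with the elapsed time, so the final output shows whether the loop itself ran in real time. This is only a sketch using the standard library; timed_prints is a hypothetical helper name, not part of the script above.

```python
import time


def timed_prints(n, delay=1.0):
    """Print 0..n-1, pausing `delay` seconds after each, tagged with elapsed time.

    Returns the list of elapsed times (in seconds) recorded at each print.
    """
    start = time.monotonic()
    stamps = []
    for i in range(n):
        elapsed = time.monotonic() - start
        stamps.append(elapsed)
        # flush=True asks Python to push the line out immediately
        print(f"{i} at {elapsed:.1f}s", flush=True)
        time.sleep(delay)
    return stamps
```

Calling timed_prints(10) in a cell reproduces the loop above. If the stamps in the delayed output are roughly one second apart, the prints happened on schedule and only their display in the notebook was held back.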


