Remove characters added by IPython #35043
Conversation
When running `PYSPARK_DRIVER_PYTHON=ipython pyspark`, the find-spark-home script calls `ipython /path/to/find_spark_home.py` and the string printed by that script gets assigned to SPARK_HOME. When run with IPython, that string starts with a sequence bounded by control characters placed before the path determined by find_spark_home.py. While this part of the string does not appear on echo, it causes pyspark to compose paths improperly when using SPARK_HOME.
To see the sequence run:
>>> import os
>>> p = os.popen('ipython somescript.py')
>>> p.read()
'\x1b[22;0t\x1b]0;IPython: notebook/Python\x07the expected output\n'
The cut command removes the sequence before "the expected output". Lines without a bell character (\x07), such as you get when running `python3 find_spark_home.py`, remain unchanged.
  PYSPARK_DRIVER_PYTHON="${PYSPARK_PYTHON:-"python3"}"
fi
-export SPARK_HOME=$($PYSPARK_DRIVER_PYTHON "$FIND_SPARK_HOME_PYTHON_SCRIPT")
+export SPARK_HOME=$($PYSPARK_DRIVER_PYTHON "$FIND_SPARK_HOME_PYTHON_SCRIPT" | cut -d $'\007' -f 2)
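For illustration only (this is not part of the PR diff), the same cut invocation can be tried on its own. The printf strings below are made-up stand-ins for the IPython title-setting sequence and for a Spark path such as /opt/spark:

# Line carrying the BEL-terminated prefix: everything up to and including the first \007 is dropped.
printf '\033[22;0t\033]0;IPython: demo\007/opt/spark\n' | cut -d $'\007' -f 2
# -> /opt/spark

# Line without a BEL: cut passes it through unchanged, since -s is not used.
printf '/opt/spark\n' | cut -d $'\007' -f 2
# -> /opt/spark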
How does it relate to #28256? Does it support jupyter too?
It appears to be the same problem but with a different solution.
I hadn't tried it with jupyter before so I don't know what I should expect.
Running it with PYSPARK_DRIVER_PYTHON=jupyter made it attempt to run jupyter-/path/to/find_spark_home.py, which failed because it could not find /bin/load-spark-env.sh or /bin/spark-submit.
Running it with PYSPARK_DRIVER_PYTHON='jupyter notebook' started the kernel but initially opened a 403 Forbidden page in the browser, complaining about a missing referrer. Pasting the URL with the token from the jupyter log took me to the bin directory under the virtual environment.
It doesn't appear to me that $PYSPARK_DRIVER_PYTHON "$FIND_SPARK_HOME_PYTHON_SCRIPT" should run with PYSPARK_DRIVER_PYTHON set to "jupyter".
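For comparison, here is a minimal sketch of one alternative direction hinted at in this discussion (not what this PR implements): always resolve SPARK_HOME with a plain interpreter so that front-ends such as ipython or jupyter never execute find_spark_home.py. SPARK_HOME_RESOLVER is a hypothetical name; FIND_SPARK_HOME_PYTHON_SCRIPT is the variable already used by find-spark-home.

# Sketch only: use PYSPARK_PYTHON (default python3) rather than PYSPARK_DRIVER_PYTHON
# to run find_spark_home.py, so a driver front-end never produces the captured output.
SPARK_HOME_RESOLVER="${PYSPARK_PYTHON:-python3}"
export SPARK_HOME=$("$SPARK_HOME_RESOLVER" "$FIND_SPARK_HOME_PYTHON_SCRIPT")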
cc @holdenk FYI

Can one of the admins verify this patch?

(Could you make a JIRA for this?)

Running ipython as the driver seems weird; it's not really designed to be used this way. Can you elaborate on why you're doing this / what you're trying to accomplish?

Running interactive shell sessions with IPython is a quite common approach in my experience.

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
Fixing the assignment to SPARK_HOME in find-spark-home to remove the control characters added when using ipython.
Why are the changes needed?
On xterm, running `PYSPARK_DRIVER_PYTHON=ipython pyspark` causes pyspark to compose paths improperly, prepending the current working directory to SPARK_HOME as determined by find_spark_home.py, making it unable to find the files it seeks.

Does this PR introduce any user-facing change?
Yes. Before the change I would get "No such file or directory" errors because the current working directory was prepended to SPARK_HOME. After the change, the pyspark interactive shell starts as expected with an ipython prompt.
How was this patch tested?
I ran pyspark with PYSPARK_DRIVER_PYTHON set to "python", "python3" and "ipython". All three variations gave the appropriate prompt with the expected session and context variables set. I also tested the pipe to the cut command with lines with and without bell characters to ensure that the addition had no effect on the latter. I didn't modify the current testing scheme because I couldn't find an extant test for any of the relevant bash scripts.
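A rough manual check along these lines (illustrative only; it assumes being run from the root of a Spark distribution that ships bin/find-spark-home) could be:

# Source find-spark-home in a subshell for each driver front-end and print the SPARK_HOME it resolves;
# with the cut filter in place the two values should be identical.
for drv in python3 ipython; do
  PYSPARK_DRIVER_PYTHON="$drv" bash -c 'source bin/find-spark-home && echo "$PYSPARK_DRIVER_PYTHON -> $SPARK_HOME"'
done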