From edd53828a23561144b535582430c0805fe54b355 Mon Sep 17 00:00:00 2001 From: Tucker Beck Date: Thu, 17 Aug 2017 15:20:47 -0700 Subject: [PATCH] Fixed pandoc dependency issue in python/setup.py MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When pyspark is listed as a dependency of another package, installing the other package will cause an install failure in pyspark. When the other package is being installed, pyspark's setup_requires requirements are installed including pypandoc. Thus, the exception handling on setup.py:152 does not work because the pypandoc module is indeed available. However, the pypandoc.convert() function fails if pandoc itself is not installed (in our use cases it is not). This raises an OSError that is not handled, and setup fails. This change simply adds an additional exception handler for the OSError that is raised. The following is a sample failure: $ which pandoc $ pip freeze | grep pypandoc pypandoc==1.4 $ pip install pyspark Collecting pyspark Downloading pyspark-2.2.0.post0.tar.gz (188.3MB) 100% |████████████████████████████████| 188.3MB 16.8MB/s Complete output from command python setup.py egg_info: Maybe try: sudo apt-get install pandoc See http://johnmacfarlane.net/pandoc/installing.html for installation options --------------------------------------------------------------- Traceback (most recent call last): File "", line 1, in File "/tmp/pip-build-mfnizcwa/pyspark/setup.py", line 151, in long_description = pypandoc.convert('README.md', 'rst') File "/home/tbeck/.virtualenvs/cem/lib/python3.5/site-packages/pypandoc/__init__.py", line 69, in convert outputfile=outputfile, filters=filters) File "/home/tbeck/.virtualenvs/cem/lib/python3.5/site-packages/pypandoc/__init__.py", line 260, in _convert_input _ensure_pandoc_path() File "/home/tbeck/.virtualenvs/cem/lib/python3.5/site-packages/pypandoc/__init__.py", line 544, in _ensure_pandoc_path raise OSError("No pandoc was found: either install pandoc and add it\n" OSError: No pandoc was found: either install pandoc and add it to your PATH or or call pypandoc.download_pandoc(...) or install pypandoc wheels with included pandoc. ---------------------------------------- Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-mfnizcwa/pyspark/ --- python/setup.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/python/setup.py b/python/setup.py index cfc83c68e3df5..02612ff8a7247 100644 --- a/python/setup.py +++ b/python/setup.py @@ -151,6 +151,8 @@ def _supports_symlinks(): long_description = pypandoc.convert('README.md', 'rst') except ImportError: print("Could not import pypandoc - required to package PySpark", file=sys.stderr) + except OSError: + print("Could not convert - pandoc is not installed", file=sys.stderr) setup( name='pyspark',