
Python & Node.js in Linux Userspace

Bootstrapping a Combined Virtual Environment for R&D

In this blog article I demonstrate how to bootstrap a combined Python and Node.js virtual environment completely in userspace on Linux. Root privileges should only be required for installing system-level dependencies. The described setup serves as my current baseline for development and data analysis work. It is based on Python 3.4 (CPython), virtualenv, nodeenv, Node.js 6.2, PyQt4, numpy, matplotlib, pymongo and h5py. It runs on top of both 32-bit (x86) and 64-bit (x86_64) openSUSE 13.1 Linux (now, as of early 2016, in long-term "Evergreen" support). The following text might be applicable to other versions of openSUSE or, in general, other Linux distributions, though package names and versions might differ.

In: Data Analysis, Software Development, Python, Python 3, Node.js, virtualenv, nodeenv, PyQt, PyQt4, JavaScript, Linux
 

Why should you care?

There are many reasons why this approach makes sense, though it all ultimately depends on your requirements. In my case, as far as Python is concerned, I prefer a newer version of Python than the ones shipped with openSUSE 13.1 (2.7 and 3.3). Although later versions of the Python interpreter are in fact available as packages from a number of community repositories, installing them usually messes up distribution-specific symlinks in /usr/bin, which must eventually be repaired manually. Besides, I want to avoid installing Python packages system-wide, and I want to be able to quickly re-compile and re-install the Python interpreter with different options and flags. In the past, I was also confronted with the need to quickly set up (and maintain) compute nodes. Here, it made sense to equip the nodes with as rudimentary an operating system as possible and to deploy the latest version of my evolving virtual environment before every larger computation run. In another very common scenario, you might be confronted with a user account without administrative privileges at your workplace or university, to which this method also applies.

Combining Python and Node.js is almost a story of its own. While most of my code is written in Python (and C), I occasionally find a very useful library which is written in JavaScript and runs on Node.js (or someone deploys some useful JavaScript code on their website and I want to use it). With the increasing popularity of Node.js, the number of actually useful data analysis tools in this ecosystem is increasing dramatically. Therefore I want to be able to quickly install Node.js packages with npm without cluttering my system. Beyond that, circumventing the relatively old version of Node.js shipped by openSUSE makes sense for the exact same reasons as previously described for the Python interpreter. Combining my Python virtual environment (virtualenv) with an isolated Node.js environment is the logical consequence, and thankfully there is a (Python) tool for that: nodeenv.

In this blog article, I will compile and install Python 3.4 (the "original" CPython from python.org) from source, just like every subsequent software package, with user privileges only.

Prerequisites

Make sure that your operating system satisfies all requirements (tools, libraries, headers) of Python and Node.js. On openSUSE, it should be sufficient to install the patterns "patterns_openSUSE-devel_basis" and "patterns_openSUSE-devel_C_C++". Beyond that, make sure that you have a version of readline (5 or 6) and its headers (readline-devel) installed, or the Python interpreter will eventually complain on start-up. A detailed description of the Python build process is provided in the Python Developer's Guide. For Node.js, you might have to install openssl and its headers (libopenssl-devel).

sudo zypper in patterns_openSUSE-devel_basis \
	patterns_openSUSE-devel_C_C++ openssl libopenssl-devel \
	readline readline-devel
Prerequisites 1: Installing system-level dependencies with root privileges on openSUSE from a shell
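
Before starting any build, it can be worth confirming that the basic toolchain is actually on the PATH. The following is only a minimal sketch and not part of my original setup: the helper name `missing_tools` and the list of tools being checked are assumptions, so adjust them to whatever your build needs.

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on the current PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed minimal toolchain for building Python and Node.js from source.
missing="$(missing_tools gcc g++ make tar wget)"
if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing"
else
    echo "All required tools found."
fi
```

An empty result only says that the commands exist. It does not prove that the required headers (readline-devel, libopenssl-devel) are installed.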

Beyond that, I will explain how to install PyQt4 (which is required by matplotlib for example). Please note that it is required neither by Python nor by Node.js, so you might want to ignore those sections. However, if you want to install PyQt, you also need to ensure that you have the required development packages for Qt installed on your system.

sudo zypper in libqt4-devel
Prerequisites 2: Installing Qt headers with root privileges on openSUSE from a shell

Getting started: Compiling Python, creating a virtual environment

For most of my work, I prefer to use a central project folder, which is located on my desktop. Let's go there.

cd ~/Desktop/PROJEKTE
Step 1: Go to project folder

Before I begin installing anything, I create a bunch of folders for future use. The first one, "_python34", will eventually become the Python root directory. The second one, "_env34a", will hold my virtual environment. Note that I name the folders after the Python version (3.4) they contain, while the appended "a" indicates that the folder contains the first virtual environment based on this version of Python. Systematic naming will significantly ease the management of additional interpreters and virtual environments.

mkdir _python34
mkdir _env34a
Step 2: Create new folders for Python root and a virtual environment
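
The naming scheme described above can also be expressed as two small shell functions, which helps once several interpreters and environments live side by side. This is a sketch only: the function names `py_root_name` and `env_name` are hypothetical and not part of any tool.

```shell
# Hypothetical helpers that derive folder names from a Python version,
# following the naming scheme used in this article.
py_root_name() {
    # "3.4" -> "_python34"
    printf '_python%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

env_name() {
    # "3.4" and "a" -> "_env34a"
    printf '_env%s%s\n' "$(printf '%s' "$1" | tr -d '.')" "$2"
}

echo "$(py_root_name 3.4) $(env_name 3.4 a)"   # prints: _python34 _env34a

# To create the folders inside the current directory:
# mkdir -p "$(py_root_name 3.4)" "$(env_name 3.4 a)"
```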

Let's download the source code of the latest version of CPython 3.4 from python.org:

wget https://www.python.org/ftp/python/3.4.4/Python-3.4.4.tar.xz
Step 3: Download Python source code
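
I do not verify the download in this walkthrough, but if the download page lists a checksum for the file, comparing it takes only a few lines. The helper below is a sketch under that assumption. The name `verify_checksum` is hypothetical, and the digest in the usage comment is a placeholder that you must replace with the published value.

```shell
# Hypothetical helper: compare a file's checksum against an expected value.
# $1: checksum tool (e.g. md5sum or sha256sum)
# $2: file to check
# $3: expected hex digest, as published on the download page
verify_checksum() {
    local actual
    actual="$("$1" "$2" 2>/dev/null | cut -d ' ' -f 1)"
    # An empty result means the tool failed (e.g. file not found).
    [ -n "$actual" ] && [ "$actual" = "$3" ]
}

# Usage (placeholder digest, not the real value):
# verify_checksum md5sum Python-3.4.4.tar.xz "<digest from python.org>" \
#     && echo "checksum OK" || echo "checksum MISMATCH"
```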

The xz-compressed tarball must be unpacked, which leaves a new directory named "Python-3.4.4" in my project folder. I prefer to add "src." as a prefix to the names of folders which contain source code. Then I change into the source code folder.

tar xpvf Python-3.4.4.tar.xz
mv Python-3.4.4 src.Python-3.4.4
cd src.Python-3.4.4
Step 4: Unpacking Python source code, changing folder name, changing into it

Now I configure, compile and install the Python interpreter. Do not forget to point to the previously created Python root directory using the "prefix" switch when configuring the source code.

./configure --prefix=$HOME/Desktop/PROJEKTE/_python34
make
make install
cd ..
Step 5: Configure, compile, install, change to project folder
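
The compile step can take a while. GNU make accepts a "-j" option to run several jobs in parallel, which I did not use above. The following sketch assumes GNU make and the nproc tool from coreutils. The variable name `njobs` is arbitrary.

```shell
# Use one job per CPU core; fall back to a single job if nproc is missing.
njobs="$(nproc 2>/dev/null || echo 1)"
echo "Building with $njobs parallel job(s)"

# Inside the source folder, the build step would then read:
# make -j"$njobs"
```

If a parallel build fails in a confusing way, re-running plain "make" usually gives a more readable error message.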

If the above process exited successfully, it should be possible to create and "source" a virtual environment.

cd _python34
./bin/python3.4 -m venv ../_env34a/
cd ..
source _env34a/bin/activate
Step 6: Create a virtual environment, activate it
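
All later steps rely on the environment being active. The activate script exports the variable VIRTUAL_ENV, so a script can check for it before installing anything. The helper name `in_virtualenv` below is hypothetical. Treat it as a sketch rather than a complete safeguard.

```shell
# Hypothetical helper: succeed only if a virtual environment is active.
# The activate script exports VIRTUAL_ENV; "deactivate" unsets it again.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "Active environment: $VIRTUAL_ENV"
else
    echo "No virtual environment active"
fi
```

With the environment active, "command -v python" should print a path inside the "_env34a" folder.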

It is a good idea to link to the Python headers from within the virtual environment. Some programmes will expect them there.

cd _env34a/include/
ln -s ../../_python34/include/python3.4m/ python3.4m
cd ../../
Step 7: Linking to Python headers
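
When the environment is rebuilt repeatedly, a version of this step that can be run more than once is handy. The sketch below assumes GNU ln and that the Python root and the environment are sibling folders, as in my layout. The function name `link_headers` is hypothetical.

```shell
# Hypothetical helper: link the interpreter's header folder into the
# environment's include folder. Assumes both folders are siblings.
# $1: Python root (e.g. _python34)
# $2: environment (e.g. _env34a)
# $3: header folder name (e.g. python3.4m)
link_headers() {
    mkdir -p "$2/include"
    # -n replaces an existing link instead of descending into it,
    # -f overwrites, so the call can be repeated safely.
    ln -sfn "../../$(basename "$1")/include/$3" "$2/include/$3"
    # Fail if the link does not resolve to a directory.
    [ -d "$2/include/$3" ]
}

# Usage, from the project folder:
# link_headers _python34 _env34a python3.4m
```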

Installing PyQt4 into the virtual environment

One common dependency for many GUI applications happens to be Qt because it is simply one of the best open source multi-platform GUI libraries. Most Qt applications still use Qt4 (though Qt5 is increasing in popularity), so it is usually a good idea to install PyQt4. This is where things become a little bit challenging. PyQt4 does not support being installed into a virtual environment out of the box. I need to apply a few tweaks to make it work. However, before I can even look at Qt, I must install SIP. SIP is a Python framework for accessing large C/C++ libraries, written by the same people who also maintain PyQt. It is important to find a matching pair of SIP and PyQt because otherwise the installation of PyQt might fail. Because it is unfortunately impossible to determine a match based on the version numbers, one must dive a little deeper into Sourceforge.net and check when the desired version of PyQt was uploaded. Then one can go for a version of SIP which was published around the same date.

openSUSE 13.1 ships with Qt 4.8.5, which can be confirmed by running qmake.

~> qmake --version
QMake version 2.01a
Using Qt version 4.8.5 in /usr/lib
Step 8: Determining the version of Qt
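
If the Qt version is needed inside a script, it can be extracted from this output. The sketch below assumes the output format shown above. The function name `qt_version` is hypothetical, and other qmake releases may word the line differently.

```shell
# Hypothetical helper: extract the Qt version from "qmake --version"
# output read on standard input.
qt_version() {
    sed -n 's/^Using Qt version \([0-9][0-9.]*\).*/\1/p'
}

# Sample output copied from the step above.
sample='QMake version 2.01a
Using Qt version 4.8.5 in /usr/lib'
printf '%s\n' "$sample" | qt_version   # prints 4.8.5

# With qmake installed: qmake --version | qt_version
```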

For this version of Qt, my matching (and working) pair of SIP and PyQt is sip-4.16.9 and PyQt-x11-gpl-4.11.4, both published on August 1st 2015. You can find both on Sourceforge.net here and here. Once I have downloaded the corresponding files, I move them to the project folder and unpack them with the following commands.

tar -xvzf sip-4.16.9.tar.gz
tar -xvzf PyQt-x11-gpl-4.11.4.tar.gz
mv sip-4.16.9 src.sip-4.16.9
mv PyQt-x11-gpl-4.11.4 src.PyQt-x11-gpl-4.11.4
Step 9: Unpacking (and renaming) of SIP and PyQt source code
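
Since I apply the same "unpack, then add the src. prefix" pattern to every source archive, it can be wrapped in a small function. This is a sketch: the name `unpack_src` is hypothetical, and it assumes GNU tar (which detects the compression by itself) and that each archive unpacks into a single folder named after the archive.

```shell
# Hypothetical helper: unpack a tarball and prefix the resulting folder
# with "src.". Assumes the archive unpacks into a single folder named
# after the archive (e.g. sip-4.16.9.tar.gz -> sip-4.16.9).
unpack_src() {
    local name
    name="$(basename "$1")"
    name="${name%.tar.*}"          # strip .tar.gz, .tar.xz, ...
    tar -xf "$1" && mv "$name" "src.$name"
}

# Usage, from the project folder:
# for archive in sip-4.16.9.tar.gz PyQt-x11-gpl-4.11.4.tar.gz; do
#     unpack_src "$archive"
# done
```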

Now I can configure, compile and install SIP. Note that one must point to the Python include directory with its absolute path. Anything else does not seem to work. For the configuration of the source code, I am already using the Python interpreter from the virtual environment.

cd src.sip-4.16.9
python configure.py \
	--incdir=$HOME/Desktop/PROJEKTE/_env34a/include/python3.4m/
make
make install
cd ..
Step 10: Configure, compile, install

At this point, I can build PyQt. First, I go into its source code directory and configure the source.

cd src.PyQt-x11-gpl-4.11.4
python configure.py
Step 11: Configure PyQt source

The created "Makefile" needs a few small changes. I open it with a text editor and change the paths in lines 52 and 53 as follows:

@test -d $(DESTDIR)/home/ernst/Desktop/PROJEKTE/_env34a/share/qt4/qsci/api/python || mkdir -p $(DESTDIR)/home/ernst/Desktop/PROJEKTE/_env34a/share/qt4/qsci/api/python
cp -f PyQt4.api $(DESTDIR)/home/ernst/Desktop/PROJEKTE/_env34a/share/qt4/qsci/api/python/PyQt4.api
Step 12: Adjusting the Makefile

Note that the line numbers in your Makefile might differ from mine though it should be relatively easy to find the right ones. You should look for lines containing install instructions for "qsci".
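
Instead of editing the file by hand, the same change could be scripted with sed. The sketch below is an untested alternative to my manual edit. The function name `retarget_qsci` is hypothetical, it requires GNU sed, and the old folder in the usage comment (/usr/share/qt4/qsci) is an assumption about what the generated Makefile contains. Check your own file first.

```shell
# Hypothetical helper: on lines mentioning "qsci", replace one install
# folder with another. A backup is kept next to the file with a .bak suffix.
# $1: Makefile
# $2: old folder (treated as a sed pattern, must not contain "|")
# $3: new folder (must not contain "|")
retarget_qsci() {
    sed -i.bak "/qsci/ s|$2|$3|g" "$1"
}

# Usage (ASSUMPTION: the Makefile points at /usr/share/qt4/qsci):
# retarget_qsci Makefile /usr/share/qt4/qsci "$VIRTUAL_ENV/share/qt4/qsci"
```

Comparing the result against the backup with "diff Makefile.bak Makefile" shows exactly which lines were changed.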

Now I can actually compile and install PyQt4 into the virtual environment. This step can take quite a while.

make
make install
Step 13: Compile and install PyQt

Installing relevant Python packages

After the installation of PyQt is complete, it is time to install further "must-have" Python packages. But before that, you should not forget to update pip to its latest version.

pip install --upgrade pip
pip install -v numpy
pip install -v pymongo
pip install -v h5py
Step 14 (a): Update pip, install Python packages mandatory for data analysis

If you have successfully suffered through the installation of PyQt, you can now also install matplotlib.

pip install -v matplotlib
Step 14 (b): Install matplotlib
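
For repeated deployments, for example on compute nodes, the packages from steps 14 (a) and (b) can be collected in a requirements file and installed with a single pip call. This is a sketch and not what I did above: the function name `write_requirements` is hypothetical, the file name is only a common convention, and the list is deliberately unpinned.

```shell
# Hypothetical helper: write an unpinned requirements file listing the
# packages installed in steps 14 (a) and (b).
# $1: path of the file to create
write_requirements() {
    cat > "$1" <<'EOF'
numpy
pymongo
h5py
matplotlib
EOF
}

# Usage, inside the activated environment:
# write_requirements requirements.txt
# pip install -v -r requirements.txt
```

To record the exact versions of a working environment instead, "pip freeze > requirements.txt" writes a pinned list.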

Installing Node.js into the virtual environment

With the last step finalized, my baseline environment for development and data analysis is almost finished. The only thing missing is Node.js. As mentioned before, it is my goal to install Node.js directly into the virtual environment. Therefore I need another Python package: nodeenv. It takes care of downloading, compiling and installing Node.js from within the virtual environment and automates the process almost entirely. Let's install it.

pip install -v nodeenv
Step 15: Install nodeenv with pip

Nodeenv is a rather versatile tool. It is definitely worth reading its manual as well as a number of blog posts illustrating its capabilities like this one for example. For my purposes, it is enough to run it once with the "p" switch enabled. This will give me Node.js and npm.

nodeenv -p
Step 16: Install Node.js and npm with nodeenv
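
To confirm that the shell now picks up Node.js from the virtual environment rather than from the system, the resolved path of a command can be compared against VIRTUAL_ENV. The helper below is a sketch. The name `tool_in_env` is hypothetical, and it assumes that nodeenv placed its binaries in the environment's bin folder.

```shell
# Hypothetical helper: succeed if the given command resolves to a file
# inside the active virtual environment.
tool_in_env() {
    local location
    [ -n "${VIRTUAL_ENV:-}" ] || return 1
    location="$(command -v "$1")" || return 1
    case "$location" in
        "$VIRTUAL_ENV"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage, after "nodeenv -p" has finished:
# tool_in_env node && tool_in_env npm && echo "Node.js lives in $VIRTUAL_ENV"
```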

Let's test npm by installing the d3 package for Node.js. Do not forget to enable the "g" switch.

npm install d3 -g
Step 17: Installing d3 with npm

Sebastian M. Ernst