This post explains how to set up a powerful spatial data store with a wide range of features on Ubuntu 14.04 (Trusty) in two command lines, and how it works.
TL;DR
Run this on Ubuntu 14.04 Trusty Tahr, and enjoy :
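As a rough sketch, the two command lines look like the following. The package name (docker.io), image name, container name and options are the ones described later in this post. The `pggis_quickstart` wrapper function is only an illustrative name, used so that nothing runs until you call it.

```shell
# The two commands, wrapped in a function so nothing executes on definition.
pggis_quickstart() {
    # 1. Install Docker (packaged as docker.io on Ubuntu 14.04)
    sudo apt-get install docker.io &&
    # 2. Fetch the image if needed and start the container in the foreground
    sudo docker.io run --rm -P --name pggis_test oslandia/pggis /sbin/my_init
}
```

Call `pggis_quickstart`, or simply paste the two `sudo` lines into your terminal.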
Database credentials : pggis/pggis/pggis for database/user/password
(DO NOT USE THIS IMAGE FOR PRODUCTION).
More info
You want to store and manipulate spatial data. Lots of spatial data. Various spatial data. This Docker image lets you install the latest components based on the most powerful open source database (PostgreSQL, in case you wondered) and benefit from a wide range of features.
Docker is a Linux container-based technology. Think of containers as something between the good old chroot and a Virtual Machine: fast, efficient and painless. Think of Docker as a mix between a meta-package manager, git and a whale. Docker is available natively on Ubuntu 14.04 (where it is named docker.io), and can also run on other Linux flavors, or in a minimalist Virtual Machine with Boot2docker for other environments (Windows, Mac OS X).
This image focuses on installing and running the following components:
- PostgreSQL 9.3
- PostGIS 2.1.3 with SFCGAL support
- PgRouting
- PostgreSQL PointCloud
- PDAL
PostgreSQL is the advanced database providing a lot of features for data storage and management. Version 9.3 is the latest stable version (9.4 is just around the corner), providing support for streaming replication, JSON, CTE, window functions and much, much more.
PostGIS is a PostgreSQL extension, enabling spatial support. It features new data types, such as 2D (and 2.5D) vector data, be it projected or in lat/lon. It also offers raster data support, to store gridded image data, and topological vector data storage.
SFCGAL support for PostGIS adds 3D features to the database. You can store 3D meshes, TIN, and operate on those objects.
PgRouting is a PostgreSQL/PostGIS extension allowing you to perform routing in the database on topological spatial data. It can be used to dynamically compute shortest paths, driving distances and more.
PostgreSQL PointCloud is a PostgreSQL/PostGIS extension which lets you deal with huge amounts of points, e.g. LIDAR data. PDAL is a library and set of tools allowing you to load and extract data from and to PointCloud (among other things).
All of this makes the best platform for spatial data crunching. Loads of spatial data.
Using it
Below is a fuller explanation of how to install this image and use it.
Install Docker
You first need to install the Docker tools on your Ubuntu system. You can also install Docker on a lot of recent Linux flavors or in a Linux Virtual Machine (with Boot2docker for example).
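On Ubuntu 14.04 this probably boils down to something like the sketch below. It assumes the package and the binary are both named docker.io, as mentioned earlier. The `install_docker` function name is only illustrative.

```shell
install_docker() {
    # Refresh the package index, then install the Ubuntu-packaged Docker
    sudo apt-get update &&
    sudo apt-get install docker.io &&
    # Print client and daemon versions to confirm that the install worked
    sudo docker.io version
}
```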
Get the image and run the container
The following commands use docker to download the image `from the docker registry <https://index.docker.io/u/oslandia/pggis/>`_ (~1GB) and then run it in the foreground. If you omit the **pull** command and run the image directly, Docker will fetch it from the Docker registry when it is not already present locally.
Your container is created from the oslandia/pggis image. It is named pggis_test, runs in the foreground (the default), publishes all exposed ports on host ports ( -P ), and has its container and filesystem removed when it exits ( --rm ). It runs the /sbin/my_init startup script of baseimage to launch all necessary processes inside the container.
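A minimal sketch of these two steps, assuming Docker is invoked as `sudo docker.io` as on Ubuntu 14.04. The variable and function names are illustrative, not part of the image.

```shell
PGGIS_IMAGE="oslandia/pggis"
PGGIS_NAME="pggis_test"

# Download the image from the registry (optional, "run" fetches it if missing)
pggis_pull() {
    sudo docker.io pull "$PGGIS_IMAGE"
}

# Start a throwaway container in the foreground
pggis_run() {
    sudo docker.io run --rm -P --name "$PGGIS_NAME" "$PGGIS_IMAGE" /sbin/my_init
}
```

Run `pggis_pull` and then `pggis_run`. The second command keeps the terminal busy until you stop the container.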
Now PostgreSQL runs in your container with all extensions activated, and a new database created just for you.
Connect to the database
Assuming you have the postgresql-client package installed on your host, you can use the host-mapped port to connect to the database. You first need to use **docker.io ps** to find out which local host port the container is mapped to:
You can see that the container’s port 5432 has been mapped to the host’s port 49154 in this case (yours may differ). We can now connect to the database server through this host port. A pggis user has been created (with pggis as password) and a corresponding database with extensions activated.
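The `ps` output is meant to be read by eye. If you prefer to script the lookup, here is a sketch that uses the `port` subcommand instead, which prints only the mapping (something like `0.0.0.0:49154`). The function names are hypothetical helpers, and the container name is the pggis_test used above.

```shell
# Keep only what follows the last colon of a "host:port" mapping
host_port_from_mapping() {
    printf '%s\n' "${1##*:}"
}

# Ask Docker which host port the container's port 5432 is published on
pggis_host_port() {
    mapping=$(sudo docker.io port pggis_test 5432) || return 1
    host_port_from_mapping "$mapping"
}
```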
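A connection through the mapped port could then look like this sketch. The port is passed as an argument because it changes from one run to the next. `pggis_connect` is only an illustrative wrapper around psql.

```shell
pggis_connect() {
    # $1: host port mapped to the container's port 5432 (49154 in the example)
    # Database and user are both pggis; --password forces a password prompt
    psql -h localhost -p "$1" -d pggis -U pggis --password
}
```

For the example above you would run `pggis_connect 49154` and type pggis when prompted.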
Enjoy GIS features
You are now ready to use the database. Inside the psql console, you can check that everything is in order with a few simple queries.
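Here is one possible set of sanity checks, fed to psql from the shell rather than typed in the console. The queries only rely on standard PostgreSQL and PostGIS features (the `pg_extension` catalog, `postgis_full_version()`, `ST_MakePoint`, `ST_AsText`). The `pggis_check` name and its port argument are assumptions made for this sketch.

```shell
pggis_check() {
    # $1: host port mapped to the container's port 5432
    psql -h localhost -p "$1" -d pggis -U pggis <<'SQL'
-- Versions of PostGIS and of the libraries it was built against
SELECT postgis_full_version();
-- Extensions installed in the current database
SELECT extname, extversion FROM pg_extension ORDER BY extname;
-- Ten random 2D points rendered as WKT
SELECT ST_AsText(ST_MakePoint(random() * 100, random() * 100))
FROM generate_series(1, 10);
SQL
}
```

You can also paste the three queries directly into an open psql session.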
Describing all features of installed components would require a lot more posts, coming later. If you are interested in learning more, Oslandia can provide training, assistance and support.
Warning
Do not use this in production!
The PostgreSQL configuration in this image opens the database connection without any IP restriction for the pggis super-user. The default pggis password is very weak too. This should NOT BE USED IN PRODUCTION, and you should adapt the configuration to your specific setup and security level. See below to rebuild the image with a different configuration. In a production environment, you would also want to run the container in the background ( -d option) and keep the container and its filesystem after it exits (no --rm option).
How it works
This image is a Docker image, which uses Linux Containers (LXC). The base image is `Phusion baseimage <http://phusion.github.io/baseimage-docker>`_ , which is itself based on Ubuntu 14.04 (Trusty). It is built using a Dockerfile (see the source on GitHub). This Dockerfile is used by Docker to build the pggis image, by processing all installation, download, compilation and setup of the various components. The result is a Docker image, which is then published on docker.io.
You can either directly download and use the image from docker.io as explained above, or rebuild it from the Dockerfile in the git repository as shown below.
Then clone the GitHub repository to get the Dockerfile :
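A sketch of the clone step, assuming git is already installed. The clone URL is derived from the GitHub issues link given at the end of this post, and `pggis_clone` is only an illustrative name.

```shell
pggis_clone() {
    # Fetch the sources, then move into the directory holding the Dockerfile
    git clone https://github.com/vpicavet/docker-pggis.git &&
    cd docker-pggis
}
```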
Then you can build the image, using the Dockerfile in the current repository.
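The build could look like the sketch below, run from the directory that contains the Dockerfile. The image tag is an assumption: it defaults to oslandia/pggis so that the run command shown earlier works unchanged, but you can pass any other name.

```shell
pggis_build() {
    # $1: optional image tag (assumed default: oslandia/pggis)
    # "." is the build context, i.e. the directory holding the Dockerfile
    sudo docker.io build -t "${1:-oslandia/pggis}" .
}
```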
Note that building the image requires downloading quite a lot from the internet, as well as enough RAM and CPU. Components are compiled along the way with make -j3, which enables parallel compilation. You should adapt this value to the number of CPUs you have.
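One way to adapt that value before building is sketched here. It assumes the Dockerfile contains the literal text `make -j3` and that GNU sed and nproc are available, as they are on Ubuntu. `set_make_jobs` is a hypothetical helper.

```shell
set_make_jobs() {
    # $1: path to the Dockerfile
    # $2: number of parallel jobs (defaults to the CPU count reported by nproc)
    njobs=${2:-$(nproc)}
    # Rewrite every "make -j3" in place with the chosen job count
    sed -i "s/make -j3/make -j${njobs}/g" "$1"
}
```

For example, `set_make_jobs Dockerfile` matches the job count to your CPUs, while `set_make_jobs Dockerfile 2` forces two jobs.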
Once you have built the image, you can run a container based on it just like above.
If you want to change how the container works, or the default setup, you can edit the Dockerfile and rebuild the image. Docker caches every step of the build process (after each RUN command), so rebuilding can be very fast.
If you have a docker.io account, you can upload the image to the image registry like this:
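A possible sequence is sketched below. It assumes the image was built locally as oslandia/pggis and re-tags it under your own account before pushing. The account name is a placeholder argument and `pggis_push` is only an illustrative name.

```shell
pggis_push() {
    # $1: your docker.io user name (placeholder, replace with your own)
    sudo docker.io login &&
    # Give the local image a name inside your own namespace
    sudo docker.io tag oslandia/pggis "$1/pggis" &&
    sudo docker.io push "$1/pggis"
}
```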
Conclusion
This is a first dive into Docker, with a spatial GIS database container. There is a lot more you can do with Docker, and plenty of use cases for this spatial database setup. This Docker image, its setup and the way it is built can be improved a lot as well. They will certainly change along with package availability, PPA repositories, new versions of the components, and better Docker practices and versions, but this is already a good and efficient way to get the stack up quickly. Do not hesitate to fork the GitHub repository and send Pull Requests.
You can also report any issue on GitHub: https://github.com/vpicavet/docker-pggis/issues
If you want to go further, or if you have any remarks concerning the tools covered in this blog post, do not hesitate to contact us: infos+pggis@oslandia.com