DS5: OpenStreetMap data


OpenStreetMap (OSM) is one of the most noteworthy open data initiatives of recent years. It is community-based: a wide network of contributors takes part in building this world map by producing open cartographic data day after day. OSM invites us to rethink data acquisition, quality measurement, and updating techniques.

With more than 800,000 contributors, it is sometimes hard to understand exactly what this large body of data and metadata is, and what information it contains. Consequently, making smart use of its API may look difficult: how are points of interest, different types of roads, buildings, and so on represented?

This training will help you get to know OpenStreetMap data better: its content, as well as the ways it can be mined. Particular attention is paid to data quality, as it depends heavily on the contributors. The history of OSM objects will be examined in detail in order to characterize this quality. Contributing to OSM is not a goal of this training; however, it will provide some keys for understanding how the platform is filled by contributors and how it evolves over time.


Through this training, you will develop the following skills:

  • Understand the dynamics and specificities of the OSM project, as well as the underlying data structure
  • Extract OSM data with Python
  • Quantify how OSM data evolves over time
  • Inventory the tags associated with OSM objects
  • Understand how an area is mapped with the help of contributor interactions
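
To give a flavor of what "inventorying tags" means, here is a minimal sketch that counts tag keys per object type in a small OSM XML snippet. It uses only the Python standard library; the sample data and the function name `tag_key_inventory` are made up for illustration, and the training itself relies on dedicated tools such as pyosmium rather than on this approach.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny hand-written sample in the OSM XML format (illustrative data, not a real extract).
SAMPLE_OSM = """
<osm version="0.6">
  <node id="1" version="2" timestamp="2015-03-01T10:00:00Z" user="alice" lat="44.84" lon="-0.57">
    <tag k="amenity" v="cafe"/>
    <tag k="name" v="Example cafe"/>
  </node>
  <node id="2" version="1" timestamp="2016-07-12T08:30:00Z" user="bob" lat="44.85" lon="-0.58"/>
  <way id="10" version="3" timestamp="2017-01-20T15:45:00Z" user="alice">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Example street"/>
  </way>
</osm>
"""


def tag_key_inventory(osm_xml):
    """Count how often each tag key appears, per OSM object type.

    Hypothetical helper: returns a Counter keyed by (object_type, tag_key).
    """
    root = ET.fromstring(osm_xml)
    counts = Counter()
    for element in root:
        # OSM has three kinds of objects: nodes, ways and relations.
        if element.tag not in ("node", "way", "relation"):
            continue
        # Each <tag> child carries one key/value pair.
        for tag in element.findall("tag"):
            counts[(element.tag, tag.get("k"))] += 1
    return counts


inventory = tag_key_inventory(SAMPLE_OSM)
for (object_type, key), count in sorted(inventory.items()):
    print(object_type, key, count)
```

On real extracts, which are far too large to parse as a single string, a streaming reader is a more realistic choice; the counting logic stays the same.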


Duration: 2 days

Prerequisites:

  • Knowledge of data science
  • Basic notions of Python and SQL

See also DS3: Python for data science and DS4: Data Science for GIS


This program is indicative and can be adapted to your specific needs.

  • The OpenStreetMap project

    • History, objectives, organization
    • How to contribute?

  • Gather OpenStreetMap data

    • Bounding boxes with the OSM API
    • Regional extracts from Geofabrik
    • OSM object extraction with pyosmium

  • OpenStreetMap data mining

    • Where to find the information?
    • Load OSM data into a database with osm2pgsql or imposm3

  • OpenStreetMap data exploitation

    • Handle OSM data with pandas
    • Set up a data pipeline with Luigi
    • Temporal evolution of OSM data
    • OSM tag genome analysis
    • Analyze the history of contributions
    • Data visualization with Python and QGIS
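
As an illustration of the "bounding boxes" item in the program above, the sketch below builds the URL of a bounding-box request against the OSM API v0.6 `map` call, which expects `bbox=left,bottom,right,top` in decimal degrees. The helper name `bbox_map_url` and the sample coordinates are hypothetical, and no request is actually sent. The public server also limits the size of the requested area, which this sketch does not check.

```python
OSM_API_BASE = "https://api.openstreetmap.org/api/0.6"


def bbox_map_url(left, bottom, right, top, base=OSM_API_BASE):
    """Build the URL of an OSM API 'map' request for a bounding box.

    Hypothetical helper: coordinates are in decimal degrees, ordered as
    min longitude, min latitude, max longitude, max latitude.
    """
    if not (-180 <= left < right <= 180):
        raise ValueError("longitudes must satisfy -180 <= left < right <= 180")
    if not (-90 <= bottom < top <= 90):
        raise ValueError("latitudes must satisfy -90 <= bottom < top <= 90")
    # The API expects the four values separated by plain commas.
    return f"{base}/map?bbox={left},{bottom},{right},{top}"


# Arbitrary small area, chosen only for the example.
url = bbox_map_url(-0.6, 44.8, -0.5, 44.9)
print(url)
```

For anything larger than a neighborhood, regional extracts are usually the better starting point, since the API is designed for editing rather than bulk download.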


Contact us for on-site training (dates can be adapted to your needs).