DS6: Open data case study

Synopsis

Open data portals have been a reality for several years now. However, data availability alone does not solve the challenge of exploiting the data. Users still have to juggle a wide range of portals, a lack of standardization across them in data formats and dataset structures, and datasets that remain available only for a short time.

This training shows participants how to exploit such open data by developing a concrete example: shared bike systems in two large French cities, Bordeaux and Lyon. From data gathering to analysis, through to publishing the results via a web API, the training illustrates a complete open data exploitation pipeline.

Goals

Thanks to this training, you will develop the following skills:

  • Get an open dataset from a public portal
  • Query a database from Python
  • Perform a simple statistical analysis
  • Share the results through a web API

Duration

3 days

Prerequisites

  • Comfortable with the Python programming language
  • Knowledge of databases and SQL
  • Knowledge of the most common data formats (CSV, JSON)
  • Basic notions of web programming (scraping, API design)

Program

This program is indicative. It can be adapted to your specific needs.

  • Data extraction from public open data portals

    • Discovering the Bordeaux and Lyon open data portals
    • Get a simple dataset from the data portals (shared bike availability)
    • Load bike data into a database
    • In-database dataset handling with psql (PostgreSQL)
    • Dataset handling with Python
    • Get the data automatically with a cron job

  • Statistical analysis of shared bike availability

    • Data description: elementary statistics
    • Feature extraction: how to create additional information
    • Shared-bike station classification based on their availability profiles
    • Short-term bike availability prediction

  • Data visualization

    • Plotting geo-referenced data in QGIS
    • Build a simple web API to visualize in-database data
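
To give a flavour of the first two parts of the program, here is a minimal sketch. It is not the training material itself. It assumes an invented JSON sample (the field names station_id, timestamp and available_bikes are hypothetical, not those of the Bordeaux or Lyon portals) and uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without any server. It loads the records into a table, queries them from Python and computes an elementary statistic per station.

```python
import json
import sqlite3
import statistics

# Hypothetical sample in the spirit of a shared-bike availability feed.
# Field names and values are invented for illustration only.
SAMPLE_JSON = """
[
  {"station_id": 1, "timestamp": "2024-01-01T08:00:00", "available_bikes": 4},
  {"station_id": 1, "timestamp": "2024-01-01T09:00:00", "available_bikes": 10},
  {"station_id": 2, "timestamp": "2024-01-01T08:00:00", "available_bikes": 12},
  {"station_id": 2, "timestamp": "2024-01-01T09:00:00", "available_bikes": 6}
]
"""


def load_records(conn, records):
    """Create the table if needed and insert one row per record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS availability ("
        "station_id INTEGER, timestamp TEXT, available_bikes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO availability "
        "VALUES (:station_id, :timestamp, :available_bikes)",
        records,
    )
    conn.commit()


def mean_bikes_per_station(conn):
    """Query the table and return the mean bike count for each station."""
    rows = conn.execute(
        "SELECT station_id, available_bikes FROM availability "
        "ORDER BY station_id, timestamp"
    ).fetchall()
    by_station = {}
    for station_id, bikes in rows:
        by_station.setdefault(station_id, []).append(bikes)
    return {sid: statistics.mean(values) for sid, values in by_station.items()}


# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
load_records(conn, json.loads(SAMPLE_JSON))
summary = mean_bikes_per_station(conn)
print(summary)
```

In the training itself, the same steps are carried out against PostgreSQL with data fetched from the portals. Only the connection and the data source would differ from this sketch.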

Contact us for on-site training (dates can be adapted to your needs).