DS2: Python for scientific computing

Synopsis

Data science has been a recurring topic for several years. It brings together statistics, machine learning, computer science and domain expertise. Machine learning methods rely on algorithms that solve problems starting from data. By this definition, data science is closely linked to scientific computing.

This training is dedicated to scientific computing with Python. With its rich ecosystem of third-party libraries and its extensive standard library, this programming language can address a wide range of analyses.

Practical exercises make up most of the training; they let participants become familiar with the main Python libraries for scientific computing.

Goals

Thanks to this training, you will develop the following skills:

  • Know the Python scientific computing ecosystem
  • Have a good knowledge of the main libraries, such as numpy, pandas and matplotlib
  • Handle data sets

Duration

3 days

Prerequisites

  • Basics of algebra and numerical computation
  • Familiarity with Unix-like environments
  • Previous experience with another scientific computing environment (R, Matlab, Octave) is a plus

Program

This program is indicative and can be adapted to your specific needs.

  • Working environment configuration

    • Setting up Python, IPython and Jupyter Notebook
    • Installing the numerical computing libraries:
      • numpy
      • scipy
      • pandas
      • matplotlib
    • Virtual environment management

  • Python standard library refresher

    • Data types
    • Matrices
    • Indexing, selecting, inserting and deleting items
    • Going further with scipy

  • Handling dataframes with pandas

    • Data types
    • Reading and writing CSV datasets
    • Selecting, adding and processing records
    • Handling missing data
    • Aggregation
    • Time series handling

  • Data visualization with matplotlib, seaborn and folium

    • Plot a simple graph (scatter plot)
    • Plot other types of graph (line plots, box plots, bar plots, histograms, …)
    • Customize the graphs
    • Going further with seaborn
    • Map visualization with folium

  • Case study: geospatial data analysis

    • Reading from and writing to a CSV file
    • Elementary statistics and feature interpretation
    • Data handling and machine learning model design with scikit-learn
    • Data visualization
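
To give an idea of the pandas topics listed in the program (reading CSV data, handling missing values, aggregation), here is a minimal sketch. It is only an illustration, not course material: the station names and temperature values are invented, the helper name `mean_temperature_by_station` is hypothetical, and the CSV is read from an in-memory string rather than from a file.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for this sketch.
# The second row for station A has a missing temperature.
CSV_TEXT = """station,day,temperature
A,2024-01-01,4.0
A,2024-01-02,
B,2024-01-01,7.5
B,2024-01-02,8.5
"""


def mean_temperature_by_station(csv_text: str) -> pd.Series:
    """Read CSV text, drop rows without a temperature, average per station."""
    # Read the CSV content and parse the "day" column as dates.
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["day"])
    # Missing data: discard records whose temperature is NaN.
    df = df.dropna(subset=["temperature"])
    # Aggregation: one mean value per station.
    return df.groupby("station")["temperature"].mean()


means = mean_temperature_by_station(CSV_TEXT)
print(means)
```

Dropping incomplete rows is only one possible choice here; filling them with `fillna` or interpolating would be equally valid depending on the dataset.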


Upcoming courses in Paris:

Contact us to arrange a session; most trainings are held on-site at your office, with dates adapted to your needs.