SYNOPSIS

Data Science has been a recurring topic for several years. It brings together statistics, machine learning, computer science and domain expertise. Machine learning methods rely on algorithms that solve problems starting from data. By its very definition, Data Science is closely linked to scientific computing.

This training is dedicated to scientific computing with Python. With its extensive standard library and its rich ecosystem of associated libraries, this programming language makes it possible to address a wide range of analyses.

Practical exercises make up the main part of the training; they allow participants to become familiar with the main Python libraries for scientific computing.

GOALS

Thanks to this training, you will develop the following skills:

  • Know the Python scientific computing ecosystem
  • Have a good knowledge of the main libraries, like numpy, pandas or matplotlib
  • Handle data sets

PROGRAM

This program is indicative and can be adapted to your specific needs.

  • Working environment configuration
    • Setting up Python, ipython and jupyter-notebook
    • Setting up the numerical computing libraries:
      • numpy
      • scipy
      • pandas
      • matplotlib
    • Virtual environment management
  • Refresher on the Python standard library
    • Data types
    • Matrices
    • Item indexing, selection, insertion and deletion
    • Going further with scipy
  • Handling dataframes with pandas
    • Data types
    • Read and write CSV datasets
    • Select, add and process records
    • Missing data management
    • Aggregation
    • Time series handling
  • Data visualization with matplotlib, seaborn and folium
    • Plot a simple graph (scatter plot)
    • Plot other types of graphs (curves, boxplots, bar plots, histograms, …)
    • Customize the graphs
    • Going further with seaborn
    • Map visualization with folium
  • Case study: geospatial data analysis
    • Reading from and writing to a CSV file
    • Elementary statistics and feature interpretation
    • Data handling and machine learning algorithm design with scikit-learn
    • Data visualization
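
To give an idea of the kind of exercise the program covers, here is a minimal sketch touching on item selection with numpy and on CSV reading, missing data management and aggregation with pandas. The dataset, its column names and the helper function are hypothetical, invented for this illustration only; the actual course material may differ.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content, invented for this illustration only.
CSV_TEXT = """city,date,temperature
Lyon,2020-01-01,4.0
Lyon,2020-01-02,
Paris,2020-01-01,6.0
Paris,2020-01-02,8.0
"""


def mean_temperature_by_city(csv_text):
    """Read CSV text, fill missing values, and aggregate by city."""
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
    # Missing data management: replace missing temperatures
    # with the mean of the available values in the column.
    df["temperature"] = df["temperature"].fillna(df["temperature"].mean())
    # Aggregation: mean temperature per city.
    return df.groupby("city")["temperature"].mean()


# numpy: build a small 2x3 matrix, then select items by index and by mask.
matrix = np.arange(6).reshape(2, 3)
first_row = matrix[0]                 # selection by index
even_items = matrix[matrix % 2 == 0]  # selection by boolean mask

# pandas: apply the hypothetical helper to the sample data.
result = mean_temperature_by_city(CSV_TEXT)
print(result)
```

Reading from an in-memory string keeps the sketch self-contained; with a real file, a path would be passed to `pd.read_csv` instead.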

DURATION

3 days

PREREQUISITES

  • Basics of algebra and numerical computation
  • Basic familiarity with Unix-like environments
  • Previous experience with another scientific computing environment (R, Matlab, Octave) is a plus

Upcoming sessions take place in Lyon or Paris.

Contact us for on-site training (dates can be adapted to your needs).

Would you like to take part in this training?

Please give us the details below if you can:

* Training

Place of training, Number of people involved, Initial level of participants, Time constraints, Specific expectations

* Contact details

Organization, Address, Contact, Email, Intra-community VAT number