Sep 06

Data Analysis and Data Preparation for Machine Learning

By EuroCC Austria
By EIT Manufacturing CLC East
By Vienna Scientific Cluster

Event Details

Level: beginner

In this one-day online course you will learn how to:

  • Get data into a suitable shape before feeding it into Machine Learning (ML) algorithms
  • Visualize data
  • Clean data
  • Transform data
  • Analyze data
  • Handle data that does not fit in memory


Participants learn why data needs to be pre-processed before being passed to ML methods. They also learn what the typical challenges are in data wrangling.

Participants get to know Pandas, a powerful Python library for tabular data, and find out how to load data into a data frame, get a feel for its contents and transform it into the most suitable shape.
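As a rough illustration of that workflow, here is a minimal sketch using pandas. The sample data, the column names and the choice to fill gaps with the column median are assumptions made for this example, not material taken from the course.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
csv_text = """sensor,temperature,humidity
A,21.5,40
B,,42
A,22.1,
C,19.8,38
"""

# Load the data into a data frame.
df = pd.read_csv(io.StringIO(csv_text))

# Get a first look: leading rows, column types, missing values per column.
print(df.head())
print(df.dtypes)
print(df.isna().sum())

# Clean: fill missing numeric values with the median of their column.
clean = df.fillna(df.median(numeric_only=True))

# Transform: derive a new column from an existing one.
clean["temperature_f"] = clean["temperature"] * 9 / 5 + 32
print(clean)
```

Whether a median fill is appropriate depends on the data; dropping incomplete rows with `dropna()` is an equally common choice.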

ML in Python would be hard to imagine without NumPy, the library for fast numerical operations on arrays. Participants will get to know the most important aspects of its API and what can be achieved with it.
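One typical preparation step that relies on array operations is standardizing features so that each column has mean 0 and standard deviation 1. The sketch below assumes a small made-up feature matrix; it is one possible example of NumPy in this role, not the course's own exercise.

```python
import numpy as np

# Hypothetical feature matrix: 3 samples (rows), 2 features (columns)
# on very different scales.
X = np.array([
    [1.0, 200.0],
    [2.0, 400.0],
    [3.0, 600.0],
])

# Column-wise statistics (axis=0 aggregates over the rows).
mean = X.mean(axis=0)
std = X.std(axis=0)

# Broadcasting applies the per-column values to every row, no loop needed.
X_scaled = (X - mean) / std
print(X_scaled)
```

After scaling, both columns are on the same scale, which many ML algorithms expect.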

Humans are visual beings, which is why we prefer looking at graphs rather than endless tables of data. Matplotlib is a widely used Python library for creating all kinds of graphs, which make data much easier to understand. Participants will learn how to create the most common graphs with Matplotlib.
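Two of the most common graphs, a histogram and a scatter plot, can be sketched as follows. The data is randomly generated for illustration only, and the `Agg` backend is chosen here so the script also runs without a display; in a Jupyter notebook that line is usually unnecessary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, an assumption for script use

import matplotlib.pyplot as plt
import numpy as np

# Made-up data: x is normally distributed, y depends on x plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2 * x + rng.normal(scale=0.5, size=200)

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram: how the values of a single variable are distributed.
ax_hist.hist(x, bins=20)
ax_hist.set_title("Distribution of x")
ax_hist.set_xlabel("x")
ax_hist.set_ylabel("count")

# Scatter plot: how two variables relate to each other.
ax_scatter.scatter(x, y, s=10)
ax_scatter.set_title("y against x")
ax_scatter.set_xlabel("x")
ax_scatter.set_ylabel("y")

fig.tight_layout()
```

In a notebook the figure is displayed at the end of the cell; in a script, `fig.savefig(...)` writes it to a file.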

In ML problems, we often encounter data that does not fit into memory. Even when it does fit, we would like some operations to run faster. Dask addresses both problems by dividing the data into smaller, more manageable chunks and running computations on those chunks in parallel. This makes it possible to handle data that is larger than memory, and it speeds up processing because the chunks are computed concurrently. Participants will get to know this tool and see its similarities with the previously learned libraries.
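The chunking idea can be shown without Dask itself. The sketch below uses plain pandas to read a made-up CSV in pieces and combine partial results, so that only one chunk is in memory at a time. This is a hand-written stand-in for the principle: Dask automates the splitting and additionally runs the chunk computations in parallel, which this sequential loop does not.

```python
import io

import pandas as pd

# Hypothetical data: one column with the values 1..100. A real use case
# would point read_csv at a file too large to load at once.
csv_text = "value\n" + "\n".join(str(i) for i in range(1, 101))

total = 0.0
count = 0

# chunksize makes read_csv yield data frames of at most 25 rows each.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=25):
    total += chunk["value"].sum()  # partial result for this chunk
    count += len(chunk)

# Combine the partial results into the overall mean.
mean = total / count
print(mean)
```

A mean works here because it can be rebuilt from per-chunk sums and counts; statistics such as the median need more care when the data is split.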

Course format

The training will be held online from 10:00 – 16:00 CEST with a 1-hour break at 12:00. The participation links will be provided after the purchase and before the training.


The participants are expected to have at least basic programming skills in Python.

The programming language of choice is Python and participants will get to know libraries such as NumPy, Pandas, Scikit-Learn, Matplotlib and Dask.

The content is delivered with Jupyter notebooks on Google Colab, so participants should have a Google account in order to be able to participate fully.


Full price for the course with course documentation: € 120,00 (including VAT)


Upon completion of the online training, participants will receive a certificate of attendance.


Simeon Harrison (EuroCC Austria and VSC Research Center, TU Wien)


Rosina Preis (Competence and Knowledge Manager for EU Projects CLC East)

Simeon Harrison (Trainer and Coordinator Training for Industries, EuroCC Austria and VSC)


The When

September 6, 2023
10:00 am - 4:00 pm

The Where


The Who

Hosted by EuroCC Austria
