Machine Learning using Python and Spark with PySpark
Tuesday 18th, 14:30 (Room B)
Tuesday 18th, 11:30 (Room B)
If you have run any big data projects, you have almost certainly heard of Apache Spark.
Please sign up for the Community Edition of Databricks at www.databricks.com beforehand, as you will need
it for this workshop.
You will get a better understanding of the Apache Spark ecosystem, what makes Spark tick, RDDs
and Spark's DataFrame API. Getting data into the format you need is fundamental to machine learning.
You will have an opportunity to work through a dataset on the Databricks platform and see how easy it is to use and manipulate.
We will end the workshop with a worked machine learning example.
- The speaker suggested this session is suitable for data scientists.