SATRDAY

The R community and some of South Africa's most forward-thinking companies have come together to bring satRday back for its fourth edition. This conference provides an opportunity to hear from and network with top Researchers, Data Scientists and Developers from the R community in South Africa and beyond.

Speakers

Keynote Speakers

Colin Fay
Data Scientist & R Hacker
ThinkR

Colin helps companies to take full advantage of the power of R by providing training (from beginner to expert), tools (packages, web apps...) and infrastructure. His main areas of expertise are data & software engineering, web applications (frontend and backend) and R in production.

Colin is a hyperactive open source developer and an open data advocate. He is very active in the Data Science community in France where he founded the data-blogging website Data-Bzh.fr, co-founded the Breizh Data Club association and organises the Breizh Data Club Meetups.

Heather Turner
Freelancer
Statistical Programmer

Heather provides support to people in a range of industries, but particularly the life sciences. She has been an R user since 2001 and is co-author of several CRAN packages, notably the statistical modelling packages gnm, BradleyTerry2 and PlackettLuce.

Dr Turner is on the board of the R Foundation and chairs the Forwards taskforce for underrepresented groups in the R community. She is also a co-organizer of R-Ladies Remote.

2020 Speakers

Robert Bennetto
CTO
Chalcid
Bianca Peterson
Director & Trainer
Conquest Analytics & Training
Anelda van der Walt
Consultant & Trainer
Talarify
Cesaire Tobias
Consultant
Fractal Value Advisors
Hanjo Odendaal
Data Scientist
71Point4
Megan Beckett
Data Scientist
Exegetic Analytics
Ray Bosman
Senior BI Developer
Derivco
Diana Pholo
Data Scientist
Predictive Insights
Francois van Heerden
Senior Analytics Researcher
Kantar
Gao Maribe
Data Scientist
DotModus
Natasha Sing
Senior Data Analyst
Derivco
Caroline Otiwa
Advocacy and Operations Lead
Women in GIS, Kenya (WiGISKE)
Hendrik van Broekhuizen
Senior Data Scientist
Predictive Insights
Astrid Radermacher
Postdoctoral researcher
University of Cape Town
Anisa Mathura
Actuary
Faculty of Actuaries
Andrew Collier
Data Scientist
Exegetic Analytics
Kirsty Lee Garson
Intern
Exegetic Analytics
Michael Johnson
BI Architect
SQLSA
Drikus du Toit
Data Scientist
Capitec Bank

Venues

Workshops

PwC has graciously agreed to host this year's workshops at their offices in Midrand.

Conference

satRday will once again be hosted at the prestigious Discovery Head Office in Sandton. Situated in the hub of Johannesburg, the venue boasts a 5-star Green Star SA rating by the GBCSA.

Workshops

satRday Johannesburg will kick off at PwC on the 6th of March 2020 with a day of workshops. Note that these are full-day workshops that take place in parallel, so you have to choose between them - good luck!

(*) Workshops available to training pass holders only.

R Package Development

Heather Turner

Curious about package development but not sure where to start? This workshop is for you! The goal is to empower you to contribute back to the R ecosystem through writing your own packages or contributing to others. We’ll be using materials developed by Hadley Wickham, Jenny Bryan, and others on the Forwards teaching team.

By the end of this workshop, you should know how to:

  • turn your code into an R package,
  • use GitHub as an effective collaboration tool,
  • add a vignette or an article,
  • build a web page for your package, and
  • submit a package to CRAN.

Prerequisites: Participants should know how to write functions in R. Knowledge of R markdown will be beneficial.

Computing requirements: Attendees are encouraged to set up the required tools on their laptops as described here. Alternatively, an RStudio cloud with the required tools will be accessible for the duration of the workshop.

Participants may bring their own code that they want to make into a package, or work with the example provided.

Building Successful Shiny Apps with {golem}

Colin Fay

Shiny is an amazing tool when it comes to creating web applications with R. Building a proof-of-concept application is easy, but things change when the application becomes larger and more complex, especially when it comes to sending that app to production. Until recently, there hasn't been any real framework for building and deploying production-grade Shiny apps. This is where {golem} comes into play: offering Shiny developers an opinionated framework for creating production-ready Shiny applications.

With {golem}, Shiny developers now have a toolkit for making a stable, easy-to-maintain and production-robust web application with R. {golem} has been developed to abstract away the most common engineering tasks (for example, module creation, the addition of external CSS or JavaScript files, ...), so you can focus on what matters: building the application. Once your application is ready to be deployed, {golem} guides you through testing and provides the tools for deploying to common platforms.

In this workshop, attendees will be introduced to the {golem} package, then guided through the full development of a Shiny app, from start to deployment!

Prerequisites: Attendees are expected to be already familiar with Shiny. Knowledge about package development would be advantageous.

Web Scraping with R

Megan Beckett & Andrew Collier

Often the data you need is already available on a website. It might all be on one page (if you're lucky!) or distributed across many pages (possibly hundreds or thousands of pages!).

But you want those data consolidated locally. Not on a server in some distant land, but right there on your hardware. And in a convenient format. CSV or JSON, perhaps? Certainly not HTML!

What would Ragnar do? He'd go out, grab those data and bring them home.

The contemporary Internet Viking uses web scraping techniques to systematically extract information from web pages. This workshop will demonstrate the process of web scraping. Here’s the battle plan:

  • Sharpening the Axe: Understanding the structure of an HTML document.
  • Preparing the Longships: Using the DOM to select HTML elements.
  • Doing Battle: Using {rvest} to extract data from an HTML document.
  • Stashing the Treasure: Storing data as CSV or JSON.
  • Triumphant Return: Handling dynamic content using {RSelenium}.

The first two topics will be fairly brief, covering this material at a high level. We'll dig much deeper into the latter topics. By the end of the workshop you will be able to easily (and confidently) scrape large swathes of the internet. We’re considerate Vikings, so you’ll also learn how to do this ethically and mindfully. :)

This tutorial will be suitable for Vikings with low to moderate levels of R experience. We'll use RStudio Cloud to ensure that everybody has the same infrastructure and (hopefully) avoid most technical issues.

Introduction to R

Bianca Peterson

This hands-on, two-day workshop is ideal for anyone relatively new to R who is looking to expand their knowledge in a friendly and welcoming environment. The aim of this workshop is to empower you with the right skills (and confidence) to tackle your Real-world problems.

Ready for the best part? No prior knowledge or computational experience in R is required to rock this workshop! We will cover everything from installation to advanced visualisations using ggplot2!

Registration

Workshop Pass

A workshop pass gives you access to one of the workshops. These full-day workshops will be run in parallel on the 6th March 2020 at PwC.

Workshop Pass
(Early Bird R 1000.00)
1 Day Workshop
Lunch
Networking breaks with refreshments provided

Conference Pass

The Conference Pass gives you access to South Africa's fourth satRday. Join us on the 7th March 2020 at Discovery Head Office to meet with and hear from both local and international R enthusiasts!

Conference Pass
(Early Bird R 200.00)
All Conference Talks
Lunch
Networking breaks with refreshments provided
* Early Bird tickets are available until 2020-01-26 (or until sold out). The standard prices for Conference and Workshop passes will be R 500 and R 1500 respectively. No late or at-the-door registration is available.

R-Ladies Event

When: March 5, 17:30 - 19:30
Where: Rain

Heather Turner will be taking us through: Publishing and Promoting your R Package, and Colin Fay will be talking to us about contributing to the R ecosystem.

This event, although primarily aimed at women, welcomes all who are interested as it will provide a great opportunity for all current and prospective R users to meet and chat with like-minded people.

If you are interested in attending similar events in the future please get in contact with us via our page or our .

Programme

Workshops Programme

Start End Friday 6 March 2020
8:30 9:00 Registration
9:00 10:30 First Session
10:30 11:00 Tea / Coffee
11:00 12:30 Second Session
12:30 13:30 Lunch
13:30 15:00 Third Session
15:00 15:30 Tea / Coffee
15:30 17:00 Fourth Session

Conference Programme

Standard talks are 20 minutes and lightning talks are a mere 5 minutes.

Start End Saturday 7 March 2020
8:00 8:30 Registration
8:30 8:35 Welcome
8:35 10:30 First Session
Bianca Peterson
  • afrimapr: Creating reusable R software building blocks for mapping in Africa in 2020 (Anelda van der Walt)

    The afrimapr project will create software building blocks in R and learning resources to facilitate the use of spatial data in health (and other) applications in Africa. The project will promote these resources to initiate a community of users and developers to maintain and improve them.

    Co-authors:

    • Dr Andy South (Liverpool School of Tropical Medicine, UK)
    • Dr Paula Moraga (Lancaster University, UK)
    • Dr Julie-Anne Tangena (Liverpool School of Tropical Medicine, UK)
    • Dr Robin Lovelace (University of Leeds, UK)
    • Dr Margareth Gfrerer (Education Strategy Center, Ethiopia)

  • The magic of colour: the pursuit to derive several hundred unique palettes (Francois van Heerden)

    Our final outputs from R will often take on the form of a graph, map, infographic or other relevant visual. These visualisations are not devoid of creative elements. They require colour, and sometimes they require a multitude of custom palettes for the task at hand. How best can these be derived?

  • Easy-peasy plots with {esquisse} (Bianca Peterson)

    Do you wish you could drag and drop variables onto X and Y axes and easily map variables? Are you in a hurry (or just plain lazy) to type code by hand? The {esquisse} add-in lets you explore and visualise your data with {ggplot2} interactively and provides the generated code to reproduce the chart!

  • Mapping out African genomics research with 'sf' (Kirsty Lee Garson)

    Advances in our understanding of the human genome have largely been based on European populations. More recently, a number of studies have been investigating African populations, which have a higher degree of genetic diversity. What better way to map them out than with a bit of help from sf?

  • Productivity Hacks with your .Rprofile (Cesaire Tobias)

    Have you ever wanted a set of productivity boosting tools to help make your workflow more efficient? Do you ever wonder what anyone does with their .Rprofile? If you answered yes to both these questions then this lightning talk is for you.

  • Algorithmic Financial Planning (Anisa Mathura)

    My talk focuses on the construction of a generalized additive model (GAM) to relate healthcare service efficiency and operational metrics to clinic profitability (profit per day). My modelling used vtreat for R (John Mount and Nina Zumel) as it made the modelling process and output a tidy-treat!

  • SHAP: Interpreting ML Models with IML (Drikus du Toit)

    Interpretability is still regarded as one of the main barriers to machine learning. How do we explain a model decision to a client? How do we measure the underlying interactions in a black box model? Let's have a look at Shapley Values as a model agnostic technique for interpreting ML models.

  • purrr beyond map() (Hendrik van Broekhuizen)

    There is so much more to purrr than map(). purrr is a veritable treasure trove of functional awesomeness that can help transform the code you write and what it can do. In this talk, I illustrate what lies beyond map(), how to use it, and why it may be worth exploring purrr's hidden depths.

10:30 11:00 Coffee/Tea
11:00 12:30 Second Session
Vebashini Naidoo
  • Productionalizing R code using Databricks (Michael Johnson)

    One of the most challenging tasks in data science is moving code from development through to production. Built on top of Apache Spark, Databricks provides streamlined workflows and interactive workspaces that enable collaboration between data scientists, data engineers and business analysts.

  • Taking R and shiny to production (Gaonyalelwe Maribe)

    This talk shows the endless possibilities of R and Shiny in a production environment using Docker and ShinyProxy. I will demonstrate how to build and deploy a Shiny application (a statistical ad hoc report) for each user in an isolated Docker container on a Linux server.

  • Keynote: "prod" is not a four-letter word (Colin Fay)

    Sponsored by Discovery.

    The idea that R is not a good fit for engineering production software is a catchphrase we hear a lot in the world of data science, and in the software engineering community in general. But the truth is that the R community has been creating tools for building robust software for years—tools that have been proven to be reliable and efficient, making R a legitimate language for production.

    With all the available tooling, the real challenge of sending R to production is not technical—it's all a matter of attitude. In other words, the real challenge is our ability to adopt a software engineering mindset when building tools with R.

    So where do we start our journey to production quality with R? What are these tools that help us build reliable products? How do we think about R with 'deploy to production' in mind? These are the questions, amongst others, that Colin will try to answer during his keynote, which will hopefully help you become more confident about sending R into production.

12:30 13:30 Lunch
13:30 15:00 Third Session
Gemma Dawson
  • Unravelling the Mysteries of Resurrection Plants Using R (Astrid Radermacher)

    Resurrection plants are capable of surviving total water loss and prolonged metabolic inactivity in the dry state - plant biltong for years! Once water is available, they miraculously recover within hours. How R they capable of this? Come find out! ;)

  • Analysis of the legal frameworks for gender equality and non-discrimination: SDG 5 (Caroline Otiwa)

    According to the UN, Africa is affected by culture, traditions and beliefs in the journey towards achieving SDG 5. Legal frameworks have been put in place to ensure that countries get there. This study uses R to conduct a NEP analysis that will show which countries have reached the threshold.

  • Keynote: Diversity and Inclusion in the R Community (Heather Turner)

    In the past few years, the R Community has been making conscious efforts to widen the participation of underrepresented groups. This talk will review some of the initiatives that have contributed to this effort, including Forwards, a task force set up by the R Foundation.

    A lot of progress has been made and the R Community is now recognized for its inclusiveness. However, there is still room for improvement, particularly in the technical side of the R project (developer meetings, package development, R core development) and in the general user community for specific groups (people living in underserved regions or people with disabilities that affect access). This talk will consider these challenges and discuss some ways we might move forwards to achieve greater equity and inclusion.

15:00 15:30 Coffee/Tea
15:30 17:00 Fourth Session
Deveshnie Mudaly
  • Running in Circles: Spatial Optimisation with OSRM and R (Megan Beckett)

    Visiting a new place on holiday or travelling for work? What can you see from where you’re staying? What are the shortest routes? To answer these questions, you’d probably ordinarily reach for your phone and open up Google Maps. However, as a data scientist, with a specific set of constraints and conditions (including having my running shoes on), I want to delve a bit deeper, explore and map out my options. My first tool to reach for is R. Then, the Open Source Routing Machine (OSRM).

    OSRM is a high-performance routing engine for calculating the shortest paths through a road network. These calculations are available via Google Maps. However, queries against a local OSRM server are orders of magnitude faster than Google Maps (and cheaper!). The osrm package for R exposes these routing data to a wide range of potential applications. In this talk I'll show how to easily spin up and provision an OSRM server and use it to solve some interesting spatial optimisation problems in R.

  • Sporadic Retail Inflation (Andrew Collier & Emma Collier)

    We've been gathering retail pricing data. As an exercise we took a look at how prices change around Valentine's Day and found some interesting things.

  • Willy Wonka’s Chocolate Factory: to predict or not to predict? (Natasha Sing)

    I will use Google Trends data, which is like a hidden closet of thoughts, together with Prophet, an open-source library published by Facebook, to predict trends in chocolate.

  • Factorization Machines (Ray Bosman)

    A novel demonstration of the Funk FM styled factorization machine popularized by Simon Funk in 2006 (Netflix prize). My intent started off as a recommendation system but evolved into a way to impute pixels into corrupted images from a synthesized matrix.

  • From Pythonista to Rtist (Diana Pholo)

    As a Pythonista, moving to R for a new job was not the most natural thing. I had to learn so many things! 6 months into R, I would like to share a few things about:

    • How Python differs from R
    • Which R tools to learn as a beginner Rtist
    • Which online resources can help ease the transition

  • Building sarbR (Hanjo Odendaal)

    At the time of this publication, the SARB still has no easy way of accessing their data through a programmatic interface (API). The package aims to fill this role by curating the data automatically from the SARB, updating a dedicated database and giving access through a simple function.

  • Exploring the Corona outbreak with R (Robert Bennetto)

    This talk is born out of the observed exponential infection growth of the Corona virus outbreak in China - which was not observed with MERS or SARS. With the underlying data and equipped with R - let's visualise the outbreak, run some simulations and understand the outbreak a little better.

17:00 17:05 Closing

Important Dates

Working up to the conference on 7 March 2020, these are the most important dates on your calendar:

Event Date
Early Bird Registration Deadline 2020-01-31
Talk Submission Deadline 2020-02-13
Official Notification of Submission Acceptance 2020-02-21
Registration Deadline 2020-02-29
Introduction to R 2020-03-04/05
R-Ladies: Hanging out with satRday Keynotes 2020-03-05
Building Successful Shiny Apps with {golem} 2020-03-06
Web Scraping with R 2020-03-06
R Package Development 2020-03-06
Conference 2020-03-07

Code of Conduct

satRday is dedicated to providing a harassment-free and inclusive conference experience for all in attendance regardless of, but not limited to, gender, gender expression, sexual orientation, employment, disabilities, physical attributes, age, ethnicity, social standing, religion or political affiliation.

Harassment in any form, of or by anyone involved with satRday, will not be tolerated. Sexual innuendos and imagery are not appropriate for any conference venue, including presentations.

Anyone violating these rules may be given a warning or expelled from the conference (without a refund) at the discretion of the conference organisers.

Our full code of conduct/anti-harassment policy can be found here.

Sponsors

Becoming a sponsor is a great opportunity to show your commitment to the continued growth and diversification of the local R community while raising brand awareness, headhunting potential talent for your organization and helping us make the conference a lasting success.

One of the key objectives of satRday is to make the conference accessible to all, including students, by keeping ticket prices low. At the same time we also want to provide attendees with a brilliant conference experience. This means we rely heavily on sponsorship.

If you'd like to come on board as a sponsor please email us and we'll gladly send you our Sponsorship Prospectus, which lays out the various options in terms of costs and benefits.