Workshops

Image processing and computer vision with R

WORKSHOP INSTRUCTORS:

Lubomír Štěpánek, Jiří Novák

Image processing and computer vision are data-science domains that can draw on vast amounts of data, since images are among the richest data types. Analyzing image data opens the door to smart algorithms such as deep learning, but it also requires a handy toolbox and sufficient computing power. Although R is not usually considered the number-one language for image processing, there are effective ways to handle these tasks in the R environment. Furthermore, a large part of the R-speaking community would like to combine image data analysis with other kinds of analyses, and would welcome the option to keep all their code “under one roof” within R. In this hands-on workshop, we will first revisit currently available packages such as magick, imager, and EBImage (among others) and their functionalities, which deal mainly with image processing and are written purely in R (and for R). Some time will be dedicated to BNOSAC’s amazing family of R packages enabling computer vision and related algorithmic tasks, e.g. object or face detection and recognition. Finally, we would like to go a bit deeper into state-of-the-art possibilities and the bridging of the R environment to non-R libraries using API packages: for instance, connecting R to the C++ library dlib via the dlib R package, or to the OpenCV library via the ROpenCVLite package.
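
For a taste of the pure-R tooling, here is a minimal sketch using the magick package; the image URL and parameter values are purely illustrative:

```r
# A minimal sketch using {magick}; the URL and parameter values are illustrative
library(magick)

img <- image_read("https://www.r-project.org/logo/Rlogo.png")  # read from URL or path

img_small   <- image_scale(img, "200")            # scale to 200 px wide
img_rotated <- image_rotate(img_small, 45)        # rotate by 45 degrees
img_edges   <- image_edge(img_small, radius = 1)  # simple edge detection

image_info(img_edges)  # inspect dimensions, format, colourspace
```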

Advanced User Interfaces for Shiny Developers

WORKSHOP INSTRUCTORS:

David Granjon, Mustapha Larbaoui, Flavio Lombardo, Douglas Robinson

In the past two years, there have been various Shiny-focused workshops introducing basic as well as advanced topics such as modules and JavaScript/R interactions. However, handling advanced user interfaces was never an emphasis. Clients often desire custom designs, yet this generally exceeds the core features of Shiny. We recognized that R app developers lacking a significant background in web development may find this requirement overwhelming. Consequently, the aim of this workshop is to give participants the knowledge needed to extend Shiny’s layouts and input widgets, and to include the new components in a novel modular framework. The workshop is organized into four parts. We first dive into the {htmltools} package, which provides functions to create and manipulate shiny tags as well as manage dependencies. We then go through the basics of JavaScript, especially jQuery. Part 3 homes in on the development of a new template on top of Shiny, demonstrated with examples from the {bs4Dash} and {shinyMobile} packages. Finally, we integrate these new components into shiny modules designed with an object-oriented package, namely {tidymodules}. This novel approach allows Shiny apps to be developed with R6 classes, significantly reducing the burden of inter-module communication and namespace management faced by classic shiny modules.
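
For a flavour of the first part, here is a minimal {htmltools} sketch; the component, the dependency name, and the asset path are hypothetical examples, not workshop material:

```r
# A minimal sketch with {htmltools}; the component and asset paths are hypothetical
library(htmltools)

# Compose a card-like component out of shiny tags
card <- tags$div(
  class = "card",
  tags$div(class = "card-header", "My card"),
  tags$div(class = "card-body", "Some content")
)

# Bundle a CSS file as an HTML dependency so the component ships its own style
dep <- htmlDependency(
  name = "mycard", version = "0.1.0",
  src = c(file = "assets"),   # hypothetical folder containing card.css
  stylesheet = "card.css"
)

browsable(attachDependencies(card, dep))  # preview in the RStudio viewer
```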

Explanation and exploration of machine learning models with R

WORKSHOP INSTRUCTORS:

Przemyslaw Biecek, Hubert Baniecki

It is easy to fit a predictive model, but how do you explain its decisions or audit its structure? A black-box model may perform very well on test data but fail spectacularly after deployment. During the tutorial I will give an overview of the problems with black-box models and of techniques for model explanation, exploration, and debugging. You will get a practical mixture of theory and hands-on applications for interpretable and explainable machine learning. This workshop is designed for applied data scientists interested in predictive machine learning models. No specific background is assumed, although some prior experience with predictive modeling will be very helpful. Basic knowledge of R is assumed: if you know how to (1) read data, (2) train any predictive model, and (3) call the predict function, you will be fine. To get 100% out of this workshop you will also need (4) basic knowledge of ggplot2 plots and (5) some visual literacy to read model visualizations. This tutorial will be based on materials available here: https://github.com/pbiecek/PM_VEE
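
As a hedged illustration of this workflow (the workshop follows its own materials), the sketch below uses the instructors’ {DALEX} package with its built-in titanic_imputed data and a {ranger} random forest:

```r
# A sketch with {DALEX} and {ranger}; illustrative, not the exact workshop code
library(DALEX)
library(ranger)

model <- ranger(survived ~ ., data = titanic_imputed,
                classification = TRUE, probability = TRUE)

explainer <- explain(
  model,
  data  = titanic_imputed[, -8],   # predictors only (column 8 is survived)
  y     = titanic_imputed$survived,
  label = "ranger"
)

plot(model_parts(explainer))      # model-level: permutation variable importance
plot(predict_parts(explainer,     # instance-level: break-down for one passenger
                   new_observation = titanic_imputed[1, ]))
```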

Non-disclosive federated analysis in R

WORKSHOP INSTRUCTORS:

Patricia Ryser-Welch, Paul Burton, Demetris Avraam, Stuart Wheater, Olly Butters, Becca Wilson, Alex Westerberg, Leire Abarrategui-Martinez

The analysis of individual person-level data is often crucial in the biomedical and social sciences. But ethical, legal, and regulatory restrictions often impose significant, though understandable and socially responsible, impediments to the sharing of individual-level data, particularly when those data are sensitive, as is often the case with health data. This creates important challenges for the development of appropriate architectures for federated analysis systems, the associated R programming techniques, and the visualization of the data. In this workshop, we first introduce how an architectural approach can build non-disclosive federated analysis systems. Secondly, we present practical exercises illustrating the concepts of non-disclosive programming techniques in R. Finally, we discuss and provide concrete examples of non-disclosive visualization techniques.
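
To illustrate the underlying idea only (this is a toy sketch in plain R, not the API used in the workshop), each site returns non-disclosive aggregates and the analyst pools them without ever seeing individual records:

```r
# Toy illustration of non-disclosive federated analysis in plain R.
# Individual-level data never leave a "site"; only aggregates are returned.
site_summary <- function(x, min_count = 5) {
  # Disclosure control: refuse to answer for very small groups
  if (length(x) < min_count) stop("cell count below disclosure threshold")
  list(sum = sum(x), n = length(x))
}

# Simulated site-level data (illustrative)
site_data <- list(a = rnorm(100, mean = 50),
                  b = rnorm(250, mean = 52),
                  c = rnorm(80,  mean = 49))

# The analyst pools only the aggregates to obtain a federated mean
summaries   <- lapply(site_data, site_summary)
pooled_mean <- sum(sapply(summaries, `[[`, "sum")) /
               sum(sapply(summaries, `[[`, "n"))
pooled_mean
```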

Reproducible workflows with the RENKU platform

WORKSHOP INSTRUCTORS:

Christine Choirat, The Renku Development Team

Communities and funding sources are increasingly demanding reproducibility in scientific work. A variety of tools are now available to support reproducible data science, but choosing and using one is not always straightforward. In this tutorial, we present RENKU (https://renkulab.io/): an open-source platform integrating git, Jupyter/RStudio Server, Docker, and analysis workflows linked with a queryable knowledge graph. With RENKU, every step of the data-science research that generates new code or data is preserved by the git version-control system. This allows scientists to step backwards through the history of their research and retrieve earlier versions of their methods and results. RENKU materializes data-science recipes and data lineage into a knowledge representation based on the Common Workflow Language (CWL) standards and the PROV-O ontology. Data lineage is automatically recorded and workflows are captured within and across RENKU projects, allowing derived data and results to be unambiguously traced back to the original raw data sources through all intermediate processing steps. Because data lineage includes the code of intermediate data transforms and analytics, research is reproducible. RENKU projects evolve asynchronously into a web of interwoven threads, so the output of one analysis becomes the input of another. The knowledge representation can even become the object of data analytics itself; for example, the popularity of research can be ranked, and the system can learn about one’s research interests from the data and methods used, recommend other research activities it finds comparable, or generate customized alerts when new insights are discovered in relevant scientific research. From an end-user perspective, through a unified authentication mechanism, the platform provides seamless integration of a Jupyter notebook server, git version control, git LFS to handle data, continuous integration via GitLab, containerization via Docker images that can be reused and shared, and automatic CWL workflow generation via a knowledge graph.

A unified approach for writing automatic reports: parameterization and generalization of R-Markdown

WORKSHOP INSTRUCTORS:

Cristina Muschitiello, Niccolò Stamboglis

Modern data science, both in industry and academia, requires reproducibility of every aspect of a data pipeline, from data cleaning to analysis. We show how a parameterized R Markdown report can be integrated into R packages to ensure reproducibility while meeting customisation needs. Parameterized R Markdown documents make it possible to customize an analysis by changing parameter values in the YAML section of the document. Our “super-parameterized” approach to report development consists of a parameterized R Markdown document embedded in a function that creates parameters, based on user-defined functions and computations, to be passed to the document through a render call. This approach brings enhanced generalizability to the writing of R Markdown documentation while allowing for customization requirements. Using data on antibiotic prescriptions in England, and following Donald Knuth’s literate programming approach of combining code and documentation, we show how the R Markdown authoring framework can be used to develop and document data cleaning and analysis. Participants in this workshop will learn how to develop their own functions with embedded, parameterized R Markdown reports.
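
A minimal sketch of the pattern, assuming a hypothetical report.Rmd whose YAML header declares params (e.g. region and year):

```r
# A sketch of the parameterized-report pattern; "report.Rmd" is hypothetical
make_report <- function(region, year, out_dir = "reports") {
  rmarkdown::render(
    input       = "report.Rmd",
    params      = list(region = region, year = year),          # overrides YAML defaults
    output_file = sprintf("report-%s-%s.html", region, year),  # one file per combination
    output_dir  = out_dir
  )
}

make_report("London", 2019)
```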

Build a website with blogdown in R

WORKSHOP INSTRUCTORS:

Tatjana Kecojevic, Katarina Kosmina, Tijana Blagojev

The workshop is applicable to anyone who wants to leverage R’s flexibility to generate static websites using open-source technologies through RStudio.

The workshop will provide a practical guide to creating websites using the blogdown package in R, which allows you to create websites from R Markdown files using Hugo, an open-source static site generator written in Go. In this workshop you will learn how to create dynamic R Markdown documents that build static websites, letting you use R code to render the results of your analysis. Through R Markdown, blogdown supports technical writing, so you can add graphs, tables, LaTeX equations, theorems, citations, and references. This makes blogdown a perfect tool for designing websites that communicate your R data storytelling, or just awesome general-purpose websites. After you create your awesome website using blogdown and a Hugo template, you will push it to GitHub and deploy it on Netlify, all free of charge.
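
A minimal sketch of that workflow in code (the theme shown is just one common choice, not a workshop requirement):

```r
# A minimal sketch of the blogdown workflow
library(blogdown)

new_site(theme = "gcushen/hugo-academic")  # create a site from a GitHub-hosted Hugo theme
serve_site()                               # live local preview with automatic rebuilds
new_post("my-first-post", ext = ".Rmd")    # add a post as an R Markdown file
```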

Objectives:
– Apply a Hugo theme to create a website with blogdown
– Personalise the website
– Deploy the site from your computer to the Internet
– Use GitHub for version control
– Use Netlify for continuous deployment
– Update your website: serve the site, push to GitHub, deploy


This will be a hands-on workshop where students are expected to work along with the tutor. The workshop will be delivered in two parts. Part I will focus on introducing the blogdown package and familiarising you with the structure of Hugo website templates. You will build your first website and make some basic changes to personalise it. Once you have created your website, you will learn how to add posts, how to customise them, and finally how to deploy the site.

Indicative programme:

Part I: Creating websites using blogdown and Hugo
– Introducing blogdown and Hugo (5 min)
– Setting up a working environment (10 min)
– install/update packages
– connect to GitHub (10 min)
– Building a website (50 min)
– understanding Hugo content: subdirectories/pages
– personalise the website
– Q/A time (15 min)


Part II: Making changes and deployment
– Adding posts to your website (30 min)
– Adding images and tables to your posts (15 min)
– Customising the look (15 min)
– Deploying on Netlify through GitHub (15 min)
– Hosting your site as an rbind subdomain (5 min)
– Q/A time (10 min)


You will be able to continue your blogdown learning journey with the full blogdown documentation, freely available in the [blogdown: Creating Websites with R Markdown book](https://bookdown.org/yihui/blogdown/).

Bring your R Application Safely to Production. Collaborate, Deploy, Automate.

WORKSHOP INSTRUCTORS:

Riccardo Porreca, Peter Schmid

If you are looking for a hands-on introduction to CI/CD pipelines with a structured approach to collaborative development, this is the workshop you are looking for. To get your head around pull requests and branches, .travis.yml files, and workflows and processes for controlled development and deployment using free open-source tools, all you need is your laptop (with R 3.6.x, RStudio, and Git installed), a GitHub account, and to join us.

In this hands-on three-hour session we will:
– highlight the benefits of versioning your R code with Git and GitHub;
– show how to set up automated checks of your development on Travis CI, leveraging unit tests and R’s built-in package checks for Continuous Integration (CI);
– discuss effective branching models, pull requests, and branch protection on GitHub, an approach especially important in a collaborative environment;
– demonstrate how to set up Continuous Deployment (CD) of packaged applications via Travis CI/GitHub Actions.

A simple R package with a Shiny app will be used as a running example to lay out a full workflow for stable, secure, reproducible deployments and releases.
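
As a hedged sketch of how these pieces can be bootstrapped from R, assuming the {usethis} package and a package project already open in RStudio:

```r
# A sketch, assuming {usethis} and an R package project, with Git installed locally
usethis::use_git()               # put the package under Git version control
usethis::use_github()            # create and connect a GitHub repository
usethis::use_testthat()          # unit-test infrastructure picked up by R CMD check
usethis::use_test("shiny-app")   # a test file for the app logic (name illustrative)
usethis::use_travis()            # add a .travis.yml for Travis CI
usethis::use_github_action_check_standard()  # alternative CI via GitHub Actions
```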

This workshop is the natural continuation of “Is R ready for Production? Let’s develop a Professional Shiny Application!”; however, attendance at the previous workshop is not a mandatory prerequisite.

Is R ready for Production? Let’s develop a Professional Shiny Application!

WORKSHOP INSTRUCTORS:

Andrea Melloncelli

R and Shiny are a good way to build a pilot program: a small-scale feasibility study, i.e. a short-term, cheap experiment that helps an organization learn whether a project can be useful for its business. In this workshop we learn how to build a pilot application that can easily be brought to production, working on real data at a larger scale. We will code a Shiny application organized into Shiny modules ({shiny}) for clear and maintainable code and installed as a package ({golem}); create unit tests ({testthat}) to avoid regressions (TDD or TAD); add a good logging system, useful for seeing what happens on your server, finding bugs, and discovering user usage paths ({futile.logger}); use configuration files to make the application more adaptable; track the package environment to be reproducible ({renv}); and deploy easily on a Linux server using Git.
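
For a flavour of the modular style, here is a minimal, self-contained Shiny module sketch (a hypothetical counter, not workshop code; moduleServer() requires shiny >= 1.5.0):

```r
library(shiny)

# Module UI: all input/output IDs are namespaced with NS()
counterUI <- function(id, label = "Count") {
  ns <- NS(id)
  tagList(
    actionButton(ns("button"), label),
    textOutput(ns("out"))
  )
}

# Module server: self-contained logic, reusable across apps
counterServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    count <- reactiveVal(0)
    observeEvent(input$button, count(count() + 1))
    output$out <- renderText(count())
  })
}

ui <- fluidPage(counterUI("counter1"))
server <- function(input, output, session) counterServer("counter1")
shinyApp(ui, server)
```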

Required skills:

This workshop will show how to create a robust design and how to easily install the application on a (cloud) server; it will not cover what a Shiny application is, so prior familiarity with Shiny is expected.

How to build htmlwidgets

WORKSHOP INSTRUCTORS:

Jean-Philippe Coene

The htmlwidgets package enables the integration of JavaScript visualisations with R. The htmlwidgets.org website provides some basic guides, but they are not all that clear and do not cover everything. As a result, building an htmlwidget remains daunting, particularly given that most R users have little JavaScript knowledge.

But truth be told, htmlwidgets tend to require very little JavaScript code, as many JavaScript visualisation libraries can be instantiated with as little as a single line of code.

The lack of documentation, combined with a lack of JavaScript knowledge, makes htmlwidgets seem intimidating; but once all is demystified, R developers are quick to see what they can bring to their work. I believe a workshop along the lines of “How to build htmlwidgets” could greatly benefit the community.

During the workshop I propose walking users through building an htmlwidget from scratch (building together); it is very rewarding and encouraging to end up with a working visualisation package. We’d cover everything, all the way to proxy integrations with Shiny. Building an htmlwidget from scratch leads to many epiphanies, as one comes to realise how easy it actually is to put together.

One would, however, need a basic understanding of how to develop an R package, as htmlwidgets take that form.
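
As a sketch of the starting point (the package and widget name “mywidget” is illustrative), htmlwidgets provides a scaffold that generates the R wrapper and the matching JavaScript factory:

```r
# A sketch of scaffolding a widget inside a fresh package; names are illustrative
usethis::create_package("mywidget")      # htmlwidgets live inside R packages
htmlwidgets::scaffoldWidget("mywidget")  # generates R/mywidget.R plus
                                         # inst/htmlwidgets/mywidget.js and .yaml

# The generated R wrapper boils down to a call to htmlwidgets::createWidget():
mywidget <- function(message, width = NULL, height = NULL) {
  htmlwidgets::createWidget(
    name = "mywidget",
    x = list(message = message),  # this list is serialized to JSON for JavaScript
    width = width,
    height = height,
    package = "mywidget"
  )
}
```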

Semantic Web in R for Data Scientists

WORKSHOP INSTRUCTORS:

Goran Milovanović

This workshop will offer a hands-on approach to Semantic Web technologies in R, exemplifying different ways to work with Wikidata and DBpedia. Attendees should be R developers who understand the typical ways of dealing with familiar data structures like data frames and lists. The workshop will be supported by well-documented, readable code in a dedicated GitHub repo.

The plan is to start simple (using the Wikidata API, for example) and then progress slowly towards more advanced topics (e.g. your first SPARQL query from R and why it is not as complicated as people think, matching your data set against Wikidata entities in order to enrich it, and the like). I will introduce the Semantic Web on a conceptual level only, so participants will not need a full understanding of the related technical standards (RDF, the different serializations, etc.) to follow along.
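
As a hedged example of such a first query, using the WikidataQueryServiceR package (one of several possible R clients):

```r
# A sketch, assuming the WikidataQueryServiceR package
library(WikidataQueryServiceR)

# All items that are instances of (wdt:P31) "programming language" (wd:Q9143),
# with their English labels
query <- '
SELECT ?lang ?langLabel WHERE {
  ?lang wdt:P31 wd:Q9143 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10'

result <- query_wikidata(query)
head(result)
```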

Finally, we will show how to process the Wikidata JSON dump from R, for those interested in working with R and the Semantic Web at scale. We might also play around with some interactive graph visualizations during the workshop. I think the Semantic Web is a new topic for many data scientists, and the R world definitely deserves a better introduction to it than it currently has.