Category Archives: R

Unconstrained ordinations in vegan: an online Shiny app

The R package “vegan”, written and maintained by Jari Oksanen, Gavin Simpson and others, contains the whole set of functions needed for unconstrained ordination (and many other things) in R. I also experimented with other packages (“labdsv” by Dave Roberts or “ade4” by Stephane Dray et al.), but “vegan” is unbeatable when it comes to ordinations. One a little annoying thing in “vegan” is the naming convention of various ordination functions. It’s relatively easy to get used to that PCA (principal component analysis) and CA (correspondence analysis), respectively, are calculated by “rda” and “cca”, respectively, which, if one also supplies environmental variables, also calculate constrained versions (RDA aka redundancy analysis and CCA aka canonical correspondence analysis). DCA (detrended correspondence analysis) is calculated by “decorana”, which also makes sense, if you know the history of the method and the fact that first program written by Mark Hill in Fortran to calculate DCA was called exactly in this way (and in fact the “decorana” function in R is based on the original Fortran code ported into R by Jari). Somewhat harder is to remember that NMDS (non-metric multidimensional scaling) is calculated by a function called “metaMDS”, and the definite killer for me is PCoA (principal coordinate analysis), calculated by “cmdscale”. When you are teaching these to students, it may get a bit frustrating, so some handouts come handy. That’s partly why I got this idea: an interactive online tool, which would allow using few example community ecology dataset, analyse them by a set of standard ordination methods, draw resulting ordination diagram, print summary output, and mainly – show also the R code which was used to do all that. The result is here, welcome to click through!

Previously I used Shiny apps to play with visualisation of random drift, and I got quite hooked on the concept of a dynamic website with R running behind. In the case of this project, it got more complex, and I spent the whole day learning various Shiny hacks (e.g. how to modify dynamically also the clickable menu at the left panel). Although the result is far from perfect (and perhaps will stay like that for a while), it’s functional. You may choose one of the datasets (short description and the link to the website with details will appear), then choose one of six ordination methods, transform or not the community data, and decide what to display in the ordination diagram. The figure above shows the PCA on a simulated community dataset structured by one long gradient, and a lovely horse-shoe artefact which PCA chronically produces on datasets with many zeros.

Don’t get me wrong – this is not an attempt to provide an online, clickable alternative interface to “vegan”, I have no intention to do something like this. But I think that Shiny apps are a great teaching aid and visualisation tool. If you have a new R package and want to show how it works, a small Shiny example may be much better than a long online tutorial (at least as an appetiser with the aim to hook users actually to try the package).

Welcome to play, and thanks for any feedback!

Piping tibbles in tidyverse

A couple of weeks ago, while preparing for the R for ecologist class, I found tidyverse, a suite of packages (mostly written by Hadley Wickham) for various ways how to tweak data in R (thanks to Viktoria Wagner for wonderful online materials for managing vegetation data!). Somewhen in 2009 I attended useR! conference in Rennes, France, where Hadley held a workshop on using one of these packages (I think that time still called plyr). I attended it, but to be honest, I remember nothing from that workshop, it was a way too abstract for me. Since then several times when preparing data for analysis in R, I thought that I should check it again and finally learn it. And here we are, it is coming, preparation of the R class pushed me to do that finally. I also found the piping of data through the functions, with an entirely different logic of sending data from function to function*, and a new form of data frames called tibbles (no idea what the word means). All the new fancy things; R moves forward quite wildly and erratically. But this post is not about piping, tibbles, and tidyverse. It is a quick thought about me using R, how did I change, and whether I should do something with that or not.

For the first time, I used R somewhen around the year 2005 when I started to work in Brno, my previous university, and a colleague asked me to calculate and draw species curves which were hard to do anywhere else. I recall S-PLUS, a program I used as a master student when I took the class of Modern regression methods while studying in České Budějovice. Petr Šmilauer that time taught it in S-PLUS, commercial system, which turned to be R’s ancestor. Knowledge of S-PLUS came handy, and I was able to do quite pretty figures in R. That time there were just a few R packages, and no RStudio (I think I was using Tinn R editor, it still exists actually). R was not a sweet candy at the beginning, I remember quite some time spent by frustration from always occurring annoying error messages, but eventually I kind of started to like it. Later I found that the skill of using R is a powerful advantage – I was able to calculate or draw almost anything, things that would otherwise need to be done in a cumbersome way in some clickable software. That time there was just a couple of other people fancy in R. I started to teach R, which was a great way to push myself to improve, and to spread the knowledge of R among others. How different is it from today, when R is considered a lingua franca of ecologists, scientists publish papers with R codes appended, and common requirements for a newly hired research assistant, postdoc or even faculty member is a skill of using R.

But I also changed. There was a time, couple of years ago, when I was eager to learn all different new developments in R, study fancy packages, try new analytical or visualisation methods. At one time, together with Víťa, another R guy from Brno, we even taught seminaR, a class focused purely on trying new fancy things in R. But it’s gone. When it comes to R, I start to be somewhat conservative, stick to what I know and feel quite reluctant to discover new things. Partly perhaps I got lazy, but partly it has reasons. Some of those new packages, methods and “new trends” in using R, in the end, turned to be just ephemeral matter, packages were not maintained, their developers deserted them and turned interest into something else. Also, now the use of R is so broad, that it is hard to keep track of all new and exciting things. And my time is also not what it used to be; I can’t spend a week or more by trying to tweak the R code in this and that way, thinking about it day and night, excited and barely sleeping. It is not that I don’t use R anymore – actually I use it on a daily basis, at any time I can be sure that some of my computers have RStudio on, with some half-written script or some long code running. But I use it as a tool to solve some problem, and I focus on the problem itself, instead of keeping polishing my skills of using that tool. Learning R was, without doubt, one of the best investment I did in my professional life, and now I hope to move further while keep using it to do something else.

Since now every winter semester I teach R for ecologists, I keep myself somehow fit in using R in a way that I sort things to be able to explain them to students. The class is actually focused on pretty basic R stuff. Half of it we mostly draw figures, because this does not require any added theoretical background which would be needed if I use R to teaching, e.g. statistics. The second half of the class we focus on “business as usual, with R”. Loops, functions, importing, exporting and manipulating data, and some simple calculations, mostly to show students the benefit R may bring them if they use it. But that is actually the point; I do not focus on bringing new fancy things to the class, but instead, teach R oldies goldies in a way as smooth and digestible as possible.

But, piping tibbles in tidyverse let me think that it may be time to sit down for a while and recheck how R changed in the past couple of years I did not follow it. Here comes my point. Do you have some suggestions what to look for? What in R do you find useful recently that you cannot breathe without and you would suggest me to learn it? No specialised packages, something for “everyday business”. For example, I still wonder whether I should learn ggplot2 or be happy with my base R graphics combined with lattice – do you have some opinion about that? Any comments welcome!

* If you are familiar with R, than piping (library magrittr) can replace (for example) sum (log (sqrt (abs (1:10)))) by 1:10 %>% abs %>% sqrt %>% log %>% sum. Quite a revolutionary way how to assemble script together. If not familiar with R, just don’t worry about that 🙂