Mean Ellenberg indicator values: too good to be true

I started to think more intensively about the issue of Ellenberg indicator values at a conference, while listening to a presentation in which a colleague used mean Ellenberg indicator values as explanatory variables in constrained ordination. I considered this a kind of statistical heresy, a perfect example of circular reasoning: you take your vegetation data, calculate mean Ellenberg indicator values for each plot, and in turn use these mean values to explain the original data. But it’s tempting. Mean Ellenberg values are often considered good proxies for measured environmental variables, and they are easy to calculate, so using them as explanatory variables is attractive. I tried it: I took a dataset with measured soil pH, calculated mean Ellenberg values for soil reaction, and compared how much variation in the species data is explained by pH and how much by mean Ellenberg values. Ellenberg was a far better predictor than measured pH. OK, so here we have the consequence of circularity. Thinking it through, I concluded that the reason is that mean Ellenberg values carry the legacy of the species composition from which they were calculated. If two plots have the same species composition, their mean Ellenberg values will be identical (for means not weighted by species cover), and if the species composition differs a bit, the mean will change only slightly (changing one or a few numbers when calculating a mean doesn’t change the result much).
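The point about identical and nearly identical plots can be checked with a few lines of code. The following is only a minimal sketch: the species names and indicator values are made up for illustration, and mean_indicator is a hypothetical helper, not part of any existing package.

```python
from statistics import mean

# Hypothetical species indicator values for soil reaction (illustrative numbers only).
indicator = {"sp_a": 3, "sp_b": 5, "sp_c": 7, "sp_d": 8, "sp_e": 6, "sp_f": 2}


def mean_indicator(plot_species, values):
    """Unweighted mean of species indicator values for one plot (presence/absence data)."""
    return mean(values[s] for s in plot_species)


plot_1 = {"sp_a", "sp_b", "sp_c", "sp_d", "sp_e"}
plot_2 = set(plot_1)                      # identical species composition
plot_3 = (plot_1 - {"sp_e"}) | {"sp_f"}   # one species replaced by another

m1 = mean_indicator(plot_1, indicator)
m2 = mean_indicator(plot_2, indicator)
m3 = mean_indicator(plot_3, indicator)

print(m1, m2, m3)
```

Plots 1 and 2 get exactly the same mean because they share all species, and plot 3 differs only moderately even though the replaced species has a very different indicator value. The mean is a function of species composition alone, whatever the numbers attached to the species are.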

I wondered what would happen if I reshuffled Ellenberg values among species before calculating the mean for each vegetation sample, or if I replaced the original species values with randomly generated ones. This would remove the ecological meaning of the values, but keep the side effects of calculating the mean. Not surprisingly, the mean of randomized species Ellenberg values, when used in constrained ordination, still explains more variation than random numbers do (keep in mind that every variable, even a randomly generated one, explains some variation; if this doesn’t make you happy, consider using adjusted R² instead). There is no ecological information in these randomized values, so the extra explained variation is the legacy of species composition imprinted in mean Ellenberg values. I used the randomization of species values as the basis for a modified permutation test, which can be applied to correct the issue. This holds not only for constrained ordination, in which mean Ellenberg values are rarely used (scanning the journals JVS and AVS over the last ten years returned two papers), but also for unconstrained ordination, where mean Ellenberg values are correlated with ordination axes and the correlation is tested. The latter treatment is actually fairly common, and although the problem of circular reasoning is less obvious there, it is exactly the same as in constrained ordination.
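The idea of randomizing species values can be sketched in code. This is a minimal illustration under stated assumptions, not the published routine or the MoPeT program: the test statistic is assumed to be the absolute Pearson correlation between plot means and sample scores on one ordination axis, and the data, the function names (pearson, plot_means, modified_permutation_test) and the axis scores are all hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable has zero variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def plot_means(plots, values):
    """Unweighted mean indicator value for each plot."""
    return [mean(values[s] for s in species) for species in plots]


def modified_permutation_test(plots, values, axis_scores, n_perm=999, seed=1):
    """Shuffle indicator values among species, then recompute the plot means.

    The null distribution therefore keeps the imprint of species composition
    and only removes the ecological meaning of the species values.
    """
    rng = random.Random(seed)
    species = sorted(values)
    observed = abs(pearson(plot_means(plots, values), axis_scores))
    hits = 0
    for _ in range(n_perm):
        shuffled = [values[s] for s in species]
        rng.shuffle(shuffled)
        null_values = dict(zip(species, shuffled))
        null_stat = abs(pearson(plot_means(plots, null_values), axis_scores))
        if null_stat >= observed:
            hits += 1
    # The observed value counts as one member of the null distribution.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: species indicator values, plot compositions, axis scores.
values = {"sp_a": 2, "sp_b": 3, "sp_c": 5, "sp_d": 6, "sp_e": 7, "sp_f": 8}
plots = [
    ["sp_a", "sp_b", "sp_c"],
    ["sp_a", "sp_c", "sp_d"],
    ["sp_b", "sp_d", "sp_e"],
    ["sp_c", "sp_e", "sp_f"],
    ["sp_d", "sp_e", "sp_f"],
]
axis_scores = [-1.2, -0.5, 0.1, 0.7, 1.1]

observed, p_value = modified_permutation_test(plots, values, axis_scores, n_perm=199)
print(observed, p_value)
```

The difference from a standard permutation test is what gets shuffled. A standard test would permute the plot means among plots, while here the species values are permuted before the means are calculated, so the similarity between plots that share species is still present in every permutation.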

I wrote R code and later also a simple point-and-click program (MoPeT), running in R, that calculates modified permutation tests, which are otherwise not easy to do. I presented the results at IAVS 2009 in Crete and later published a paper in JVS together with André Schaffers, who helped me a lot in sorting out the ideas and writing the manuscript. Still, I am not sure whether anybody will ever really use this routine. Mean Ellenberg values are great for description, but it’s perhaps better to keep them away from more sophisticated statistical treatments.

The story actually has an unexpected follow-up, although I had hoped I wouldn’t touch Ellenberg values any more. As a parody of life, I am now working on a paper describing a statistical way to justify the use of mean Ellenberg indicator values as explanatory variables in constrained ordination, and even to do things like partitioning the variation among different Ellenberg values, or between Ellenberg values and measured variables. I presented this topic at the EVS workshop in Vienna this spring, where it went through without any feedback; I guess the audience was even a little disgusted by such an overly technical talk. I really don’t feel like convincing anybody to use mean Ellenberg values as explanatory variables in constrained ordination, but I can’t help being quite fascinated by the idea that something like this is actually possible.

Gap Light Analyser for 64 bit Windows

UPDATE on November 15, 2016: Gap Light Analyser is now also available for 64-bit Windows machines; the installation files can be downloaded from the original GLA website. Thanks to David Wojtowicz of the University of Illinois for updating the original installer and sharing the news! As David mentions in the comment below, he has become the unofficial maintainer of this update.

On the last geobotany field excursion, we ran into a problem with Gap Light Analyser, the freeware program for analysis of hemispherical photographs of tree canopy taken by a camera with a fish-eye lens. It can’t be installed on the 64-bit version of Windows 7, although there was no problem with the 32-bit version. In the end, one student still had an old 32-bit machine, so we managed to analyse the photographs, but I was wondering how to solve the problem in the future. I wrote to Gordon Frazer, the author of GLA, asking for a workaround, and he recommended installing Windows Virtual PC with Windows XP and running the GLA installation on it. I tried it, and it seems to work fine, although it’s a bit heavyweight, because the installation takes some time (the whole process took me one hour). You need to visit the Microsoft website (http://www.microsoft.com/windows/virtual-pc/default.aspx), choose your system (I assume Windows 7 Professional here; I haven’t tried other versions) and download the installation packages for Virtual PC and Windows XP (free if you have a genuine Windows system, which is checked online). Install them following the instructions on the website, and finally you can run Virtual PC and install GLA on it. It’s a bit slower than an ordinary PC, but it works. Unfortunately, it’s not an ‘easy’ solution, in the sense that you can’t do this somewhere in the field without a good internet connection (the installation files are over 400 MB). But it seems there is no other easy (and free) solution. I checked two other freeware programs for analysis of canopy images, namely hemIMAGE and Winphot*, but both have the same issue as GLA: they can’t run on a 64-bit machine (at least I have not succeeded in installing them). There is also commercial software (e.g., HemiView), but I didn’t purchase it, so I have no idea whether it works (I guess it should).

* There is a paper in Waldökologie by Alvaro Promis and colleagues, which compares these two packages with HemiView and GLA – that’s how I found them…

Species response curves in JUICE and R

Today I spent some time trying to fix the Species response curves function in JUICE. The script for calculating species response curves is written in R and has had problems from the very beginning, but as my interest in using species response curves decreased, so did my will to fix the bugs. To make it short: I haven’t managed to fix it, and there are several reasons for this.

One is the package gravy, which is used for calculating HOF curves. Jari Oksanen wrote this package almost six years ago (at least, the newest version is from 2004) and hasn’t updated it since, and in an email two years ago he said he is not going to maintain it any more. This wouldn’t be such a big problem, but since R version 2.11 the package no longer runs, which makes it impossible to run the script for HOF curves in newer versions of R. I tried to recompile gravy for newer versions of R, but didn’t succeed (mainly because the source code for the gravy.dll dynamic library is not publicly available, and I didn’t want to bother Oksanen about that).

Anyway, gravy didn’t really perform too well. The algorithm often results in ‘sharp shapes’ (which I documented in detail here). I am not really able to fix this, as it goes beyond my mathematical skills. However, it seems that Florian Jansen has worked through this topic and found a remedy. He reported it two years ago at IAVS in Stellenbosch (the presentation can be found here), and he has also written the R package vegdata.dev, which contains an updated version of the HOF functions. The package is rather new and under development (the latest update I have seen is from August 2010), so it may still need some time until it is reliable. But this would be the way to go. The packages can be downloaded from Florian’s website (the basic package is vegdata, and the package vegdata.dev contains additional functions that are under development, such as those HOF curves).

So, the current situation is this: the species response curves function in JUICE works fine only under R version 2.8 or lower, and this is probably not going to change any time soon. In the future, I hope to port it from gravy to the vegdata package. For those seriously thinking about playing with species response curves, I would recommend using R directly, not the wrapper from JUICE. If you are not experienced in R, try CanoDraw for Windows, which offers GAM and GLM models for species response curves (however, it doesn’t offer HOF). That’s it!

UPDATE (September 2012): There is an alternative that works right now. I rewrote the original R package for JUICE into a JUICE-R function, which can be run with data from JUICE but is less problematic than the previous solution. It can be found on the JUICE-R website in the section Available R scripts as Species response curves. It is, however, based on the same algorithm as the previous function in JUICE, so the problem with HOF models remains and I would not really recommend using them…