Sometime in 2008, I made a surprising finding. I was doing a routine analysis: unconstrained ordination of community composition data (I think it was DCA) and fitting mean Ellenberg indicator values (1) onto the ordination diagram for ease of interpretation (using the function envfit from vegan). What you get is an ordination diagram where points are samples or species, and vectors are mean Ellenberg indicator values pointing in one direction or another, showing which way nutrients or pH or light increase. To decide which of the six mean Ellenbergs to display, I tested the regression of mean Ellenbergs on the ordination axes using permutation tests and included only those which were significant (often most of them are, mean Ellenbergs are so good!). To my surprise, when I looked at the ordination, it just did not make ecological sense. I knew my data, I knew where to expect more nutrients, higher pH and more light, but the arrows pointed in different directions, although they were almost all significant.
I rechecked everything and found the mistake: in Excel, where I sorted Ellenberg values for the calculation, I had sorted the species in a different order than in the species composition matrix. I had actually calculated mean Ellenberg values from the values of different species than those occurring in the plots. But how come I still got such nicely significant results? I tried several times and found that indeed, even if I calculate mean Ellenbergs from permuted or randomly generated species Ellenberg indicator values, I still have a high chance of getting a significant relationship with ordination axes, although the direction of the vectors makes no sense. I did more experiments: I correlated mean Ellenbergs calculated from randomized indicator values with species richness, and compared them between groups of samples (clusters) from numerical classification. The same problem repeated: the chance of getting a significant relationship was pretty high, even though the species indicator values were just fake.
I presented this at the IAVS meeting in Crete in spring 2009 and wrote the first draft of a paper for Journal of Vegetation Science in summer 2010. It was a rather short Forum paper, but reviewers did not like the structure, so I reshaped it into a standard research article, and after three rounds of reviews (and with one of the reviewers, André Schaffers, becoming the new co-author, since he greatly helped me to make the text much more readable), it was finally published in autumn 2011 as the “too good to be true” paper (Zelený & Schaffers 2012). I was happy like a monkey; this was the first paper where I got my own idea published, and it still remains my paper with the most citations. In the paper, we suggested that one should be aware of the problem of inflated Type I error rate, and either not test the relationship of mean Ellenberg values with ordination/species richness/cluster analysis results at all, or use what we called the “modified permutation test”, which compares the observed relationship with those of mean Ellenberg values calculated from randomized species indicator values. I made a simple R application to do this test (MoPeT, https://davidzeleny.net/wiki/doku.php/eiv:software) and wrote a blog post about it (https://davidzeleny.net/blog/2012/09/23/ellenberg-values-too-good-to-be-true/). The story could be finished.
But it wasn’t. By playing around, I found that the same problem occurs even if I relate the mean Ellenbergs from randomised species indicator values to environmental variables, like soil pH. This started to be more serious and perplexing at the same time. Why is this happening? When relating mean Ellenbergs to ordination, one can argue that the problem is that two things which are not entirely independent are tested against each other: mean Ellenbergs are calculated from indicator values and species composition, and ordination/richness/clusters are calculated from the same species composition matrix (I call this the “similarity issue” in the 2012 JVS paper). But when relating mean Ellenbergs to environmental variables, there is no such thing; the environment is measured completely independently from species composition (Brad Hawkins, whom I met later and on whose paper related to community weighted mean I became a co-author, calls them “extrinsic” variables, in contrast to “intrinsic” ones, which are derived from species composition data; Hawkins et al. 2017).
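The effect described in the last few paragraphs can be tried out with a small simulation. What follows is only a rough sketch in plain Python, not the MoPeT or weimea code: the function names, the simulated gradient (30 samples, 40 species, Gaussian species responses with a niche width of 0.15) and the use of absolute Pearson correlation as the test statistic are all assumptions made for illustration. Species get "fake" random indicator values, weighted means are related to a simulated environmental variable, and the relationship is tested twice: by a standard permutation test (shuffling the environmental variable among samples) and by a modified permutation test (shuffling the species values and recalculating the weighted means).

```python
import math
import random


def simulate_community(n_samples, n_species, niche_width, rng):
    """Simulate presence/absence of species along one environmental gradient."""
    env = [rng.random() for _ in range(n_samples)]
    optima = [rng.random() for _ in range(n_species)]
    composition = []
    for e in env:
        # Probability of occurrence follows a Gaussian curve around the optimum.
        row = [1 if rng.random() < math.exp(-((e - o) ** 2) / (2 * niche_width ** 2)) else 0
               for o in optima]
        if sum(row) == 0:
            # Keep every sample non-empty: add the species with the nearest optimum.
            nearest = min(range(n_species), key=lambda j: abs(e - optima[j]))
            row[nearest] = 1
        composition.append(row)
    return env, composition


def weighted_means(composition, values):
    """Weighted mean of species values for each sample (row)."""
    return [sum(a * v for a, v in zip(row, values)) / sum(row) for row in composition]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if one of the variables is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def standard_test(composition, values, env, n_perm, rng):
    """Standard test: permute the environmental variable among samples."""
    cwm = weighted_means(composition, values)
    observed = abs(pearson(cwm, env))
    shuffled = list(env)
    hits = 1  # the observed value counts as one of the permutations
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(cwm, shuffled)) >= observed:
            hits += 1
    return hits / (n_perm + 1)


def modified_test(composition, values, env, n_perm, rng):
    """Modified test: permute species values and recalculate the weighted means."""
    observed = abs(pearson(weighted_means(composition, values), env))
    shuffled = list(values)
    hits = 1
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(weighted_means(composition, shuffled), env)) >= observed:
            hits += 1
    return hits / (n_perm + 1)


def rejection_rates(n_runs=100, n_perm=99, alpha=0.05, seed=42):
    """Share of runs with P <= alpha when species values are pure noise."""
    rng = random.Random(seed)
    rejected = {"standard": 0, "modified": 0}
    for _ in range(n_runs):
        env, comp = simulate_community(30, 40, 0.15, rng)
        fake_values = [rng.gauss(0.0, 1.0) for _ in range(40)]  # unrelated to env
        if standard_test(comp, fake_values, env, n_perm, rng) <= alpha:
            rejected["standard"] += 1
        if modified_test(comp, fake_values, env, n_perm, rng) <= alpha:
            rejected["modified"] += 1
    return {name: count / n_runs for name, count in rejected.items()}


if __name__ == "__main__":
    print(rejection_rates())
```

Since the species values are random, a well-behaved test should reject the null hypothesis in about 5% of the runs. If the simulation behaves as described in the text, the standard test should reject far more often than that, while the modified test should stay close to the nominal level in this particular scenario. The exact numbers depend on the seed and on the assumed niche width, so treat them as an illustration only.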
I wrote a short paper about that and submitted it for the first time in November 2013. I aimed high, because it was clear that this is a serious problem and that many studies using community weighted means (not only Ellenberg values but also traits) are influenced by overly optimistic results. The first submission went to Ecology Letters (rejected in four days based on editorial reading), then to Journal of Ecology (it took the editors five days to reject; they thought the paper was too statistical) and finally to Methods in Ecology and Evolution, where it went out for review. The result was reject-resubmit, with one of the reviewers being Bill Shipley, who gave me quite interesting suggestions about the different types of null hypotheses tested by the CWM approach. I presented the idea at the European Vegetation Survey workshop in Ljubljana (beautiful city!), Slovenia, in May 2014 (https://www.davidzeleny.net/doku.php/evs2014), and then again in July at the International Statistical Ecology Conference in Montpellier, France (https://isec2014.sciencesconf.org/31446). There I met Stéphane Dray, whose R workshop about the ade4 package I took before the conference, and later also Pedro Peres-Neto. After my presentation, the three of us had quite an emotional discussion in the canteen. Pedro and Stéphane showed me their almost finished manuscript about the same thing I was talking about at the conference: that the relationship of community weighted means with environmental variables has an inflated Type I error rate if tested by standard tests.
I shut myself in the hotel for the rest of the meeting and worked hard to prepare the resubmission of the paper rejected in MEE, with the only breaks being the FIFA World Cup match in which Brazil lost to Germany 1:7 (I never watched football before and never after, but this one was a really awesome experience: https://youtu.be/RLZUKqpXYzU). In the end, I was not that fast; I submitted the rebuttal to MEE in October and got rejected again a month later, this time with Cajo ter Braak as one of the reviewers. While Cajo was reviewing the paper, he contacted me to clarify some details and we had a lively email exchange (some of the emails I wrote at night from a tent, camping at a tea plantation close to central Taiwan during a survey trip to a one-hectare plot in a nearby forest). Again, I got useful hints on how to go further, but I started to feel more and more frustrated about the fact that I could not get the message through.
That was also the time I was about to move to Taiwan permanently to start my new life as an Asian, and it took me almost one and a half years to submit the new version, this time to Ecology (April 2016, rejected by the editor) and again to Ecology (August 2016, rejected after reviews). But it became clear that I was late; in July 2016, the paper of Pedro, Stéphane and, newly, also Cajo got published in Ecography (Peres-Neto et al. 2017). At first I thought, let’s forget it and move on, but in the end, I remade the paper into a Synthesis format, where I mostly reviewed what is known and brought in a few new things from the previous versions(2). I submitted it to the Journal of Vegetation Science in April 2017, got reject-resubmit, remade it again and submitted in June 2018, got a minor revision, and, finally, got acceptance on the first of October 2018. Five years from the first version, eight submissions and five different journals(3), yeah!
I was never a fast guy, at least not when it comes to writing papers. However, this one was really an extreme case, and it taught me a lesson. I knew the problem, and I knew that the issue was actually quite important, but I could not get it through and publish it. In fact, later, in desperation, I started to post preprints of each submitted manuscript to bioRxiv (e.g. Zelený 2016), to get a feeling that there is at least something out there.
Looking back, I see the main reasons why it did not work. I was not able to crack the problem, because I do not have this type of skill. I am not a statistician or mathematician, although I often dig into methodological problems, and in this case, it showed up as a handicap. Although I knew about the problem of inflated Type I error rate, my solution (the modified permutation test) did not work properly, which at the beginning I was not able to fully understand. Also, I spent quite a lot of time pushing the analogy between weighted mean and compositional autocorrelation, which, clearly, most people could not understand, although I felt it was so clear (no, it was not, but still, even now I cannot get rid of that idea). In addition, I struggled to find a way to structure a manuscript that does not have a standard “report” structure. I studied several other papers, trying to figure out how the flow of the text should go and to get inspired, but from the reaction of reviewers (complaining about my “hard to follow” text) I could see I was not able to nail it. And, at a few moments, I got a bit too emotional about the whole situation, unable to see things rationally and from a distance. First, this happened when I read the paper of Otto Wildi (2016), who was criticising the 2012 JVS paper. I felt quite frustrated; not because of the critique (some of it justified, some not), but because although the paper contains several obvious misunderstandings, it could still get published(4). Another emotion showed up when the Peres-Neto et al. paper appeared online in Ecography. At first I was not even able to read it; later I did (also because I needed to refer to it), and I found that it is actually a pretty good one, pointing clearly to the whole problem and bringing an elegant solution.
Although my interest has already shifted somewhat away, I still have some plans left with weimea (as I call the weighted mean problem for myself). I hope to finish the R package (yes, weimea), which is already pretty functional (https://github.com/zdealveindy/weimea, also appended to the 2018 JVS paper). And there are actually a couple of things in the previous manuscripts which never made it through to the final published version, although they are important. Hope to be faster this time…
(1) Mean Ellenberg indicator values (mEIV) are a kind of European speciality; Heinz Ellenberg, the German plant ecologist, created a set of species indicator values, in which he assigned to most Central European species an ordinal value according to their ecological preference along the main environmental gradients like nutrients, moisture or soil pH. When you make a vegetation plot with a list of species occurring at a specific location and calculate the mean of indicator values for the species occurring in it, you get a quite meaningful estimate of ecological conditions for that location, as plants interpret it. Ellenberg was not the only one who got this idea, but his system was perhaps the first one that was rather comprehensive, an important criterion for a successful application.
(2) The original acknowledgment, which was in the preprint on bioRxiv but which I later removed, reflected the whole story in short: “My thanks go to Bill Shipley, Cajo ter Braak and several anonymous reviewers for critical comments on previous versions of this manuscript, which motivated me to several times heavily rework it. Thanks also go to Pedro Péres-Neto and Stéphane Dray for (emotional) discussion of differences between my modified permutation test solution and their fourth-corner one during the ISEC 2014 conference in Montpellier.” Acknowledgements are sometimes the only place in the paper recording the background story of the paper, although, of course, seen only from the author’s point of view.
(3) List of the submitted manuscript titles, journals and dates:
- Relationship between community-weighted mean trait values and environmental variables or ecosystem properties is biased due to compositional autocorrelation (Ecology Letters, submitted Nov-2013, rejected Nov-2013);
- Relationship between community-weighted mean trait values and environmental variables or ecosystem properties is biased (Journal of Ecology, submitted Nov-2013, rejected Dec-2013);
- Relationship between community-weighted mean trait values and environmental variables or ecosystem properties is biased (Methods in Ecology and Evolution, submitted Dec-2013, rejected Feb-2014);
- Linking weighted-mean of species attributes with sample attributes: the problem of inflated Type I error rate (Methods in Ecology and Evolution, submitted Oct-2014, rejected Nov-2014);
- Bias in community-weighted mean analysis relating species attributes to sample attributes: justification and remedy (Ecology, submitted Apr-2016, rejected Apr-2016);
- Bias in community-weighted mean analysis relating species attributes to sample attributes: justification and remedy (Ecology, submitted Aug-2016, rejected Oct-2016);
- Bias in community-weighted mean analysis of plant functional traits and species indicator values (Journal of Vegetation Science, submitted Apr-2017, reject-resubmit May-2017);
- Which results of the standard test for community weighted mean approach are too optimistic? (Journal of Vegetation Science, submitted Jun-2018, minor revision Aug-2018, accepted Oct-2018).
(4) One example of the misunderstandings for all: in Zelený & Schaffers (2012) we did not say anything about Ellenberg indicator values being biased, and still, Wildi spent considerable space arguing that they are indeed not (because they are not measured variables), and even put this argument in the title of the paper (Why mean indicator values are not biased). What we claimed was that “results of analyses based on mean Ellenberg indicator values may be biased”, which is something I still stand by, although I would perhaps not use the term “biased” in this context anymore (and would use inflated or overly optimistic instead). At first, I started to write a reply paper, but gave up when I found that I would have to spend too much energy on correcting those misunderstandings, proving that the author perhaps had not read the original paper properly. But not all was negative: Otto was right with the argument that testing the relationship of mEIV with ordination scores/richness/cluster results goes against statistical logic, because by definition these two variables are not independent (both are numerically in some way derived from the same species composition matrix), so the null hypothesis of independence is easy to reject. I touched on this in the 2018 JVS paper, arguing that if we see this as a case of spurious correlation, the modified permutation test (testing the modified null hypothesis) is a valid solution.
- Hawkins B.A., Leroy B., Rodríguez M.Á., Singer A., Vilela B., Villalobos F., Wang X. & Zelený D. (2017): Structural bias in aggregated trait variables driven by species co-occurrences: a pervasive problem in community and assemblage data. Journal of Biogeography, 44: 1199–1211. https://doi.org/10.1111/jbi.12953
- Peres-Neto P.R., Dray S. & ter Braak C.J.F. (2017): Linking trait variation to the environment: critical issues with community-weighted mean correlation resolved by the fourth-corner approach. Ecography, 40: 806–816. https://doi.org/10.1111/ecog.02302
- Wildi O. (2016): Why mean indicator values are not biased. Journal of Vegetation Science, 27: 40–49. https://doi.org/10.1111/jvs.12336
- Zelený D. & Schaffers A.P. (2012): Too good to be true: pitfalls of using mean Ellenberg indicator values in vegetation analyses. Journal of Vegetation Science, 23: 419–431. https://doi.org/10.1111/j.1654-1103.2011.01366.x (free read-only pdf)
- Zelený D. (2016): Bias in community-weighted mean analysis relating species attributes to sample attributes: justification and remedy. bioRxiv. https://www.biorxiv.org/content/early/2016/04/05/046946 (this is the first version, later updated several times)
- Zelený D. (2018): Which results of the standard test for community weighted mean approach are too optimistic? Journal of Vegetation Science, 29: 953–966. https://doi.org/10.1111/jvs.12688 (free read-only pdf)