Tuesday 25 October 2022

Biodiverse now calculates CANAPE for you

The CANAPE protocol is one of the analyses Biodiverse is most commonly used for (see examples amongst the list of publications using Biodiverse).  

The method, or protocol, was originally described in Mishler et al. (2014) and is conceptually simple.  Run an analysis that includes phylogenetic endemism and relative phylogenetic endemism, run those through a randomisation, and then categorise the results based on the significance score of the indices.  This process is described in more detail in previous posts here and here.  
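The categorisation step can be sketched in a few lines of Python. This is a minimal sketch and not the code Biodiverse runs internally. The function name, argument names and thresholds are assumptions based on my reading of Mishler et al. (2014): a one-tailed test at 0.05 on the two PE values, then a two-tailed test at 0.05 on RPE. Each argument is taken to be the proportion of randomisations in which the observed value was the higher one.

```python
def canape_class(p_pe_obs, p_pe_null, p_rpe):
    """Classify one cell from three randomisation proportions (0 to 1).

    p_pe_obs  : proportion for PE on the observed tree (hypothetical name)
    p_pe_null : proportion for PE on the comparison tree (hypothetical name)
    p_rpe     : proportion for relative PE (hypothetical name)
    """
    # Step 1 (one-tailed): is either PE value significantly high?
    if p_pe_obs <= 0.95 and p_pe_null <= 0.95:
        return "not significant"
    # Step 2 (two-tailed): is RPE significantly low or high?
    if p_rpe < 0.025:
        return "neo"
    if p_rpe > 0.975:
        return "palaeo"
    # Significant PE, but RPE is in neither tail.
    return "mixed"


print(canape_class(0.99, 0.50, 0.01))
print(canape_class(0.99, 0.99, 0.50))
```

A cell must pass the first test before the RPE test is consulted, which is why a very low or very high RPE score on its own is not enough to place a cell in the neo or palaeo class.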

The main issue with the approach to date is that the CANAPE classes are determined outside of Biodiverse using systems like a GIS, R code or a spreadsheet.  So while the process is conceptually simple, the actual implementation can get a bit complex. Many users are not entirely sure which indices to pass through their functions, or even which lists to extract them from.  

As of version 4, Biodiverse calculates the CANAPE classes for you.  This occurs automatically whenever a randomisation is run on an analysis that includes the Phylogenetic Endemism and Relative Phylogenetic Endemism type 2 calculations.  (If you want it sooner than version 4 then it is in the development release 3.99_005, which was current at the time of writing.  See the downloads page for links.)

Biodiverse now calculates the CANAPE scores when the requisite indices have been calculated, and a randomisation has been run.  Like many of the posts on this blog, this example uses the Acacia data set from Mishler et al. (2014).

How does Biodiverse store the results? 

The results are stored in a new list where the name is the randomisation output used followed by ">>CANAPE>>".  So for a randomisation called "rand" you would see "rand>>CANAPE>>".  The use of angle brackets might look a bit strange at first but makes the naming consistent with the other randomisation lists and simplifies the underlying code.

The CANAPE classes are stored in an index called CANAPE_CODE, with a numeric code indicating which of the categories a cell falls in.  Currently this code is 0 for not significant, 1 for neo-endemism, 2 for palaeo-endemism and 3 for mixed endemism.
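If you export CANAPE_CODE and want readable labels, the mapping above is easy to apply. The following is a small sketch in Python. The function name and the example values are made up for illustration, and blank or missing values are skipped on the assumption that they represent cells without a result.

```python
from collections import Counter

# Codes as listed in this post; they may change in later versions.
CANAPE_LABELS = {0: "not significant", 1: "neo", 2: "palaeo", 3: "mixed"}


def summarise(codes):
    """Count cells per CANAPE class from an iterable of CANAPE_CODE values."""
    labels = []
    for c in codes:
        # Skip cells with no value (assumed to be exported as blank or None).
        if c is None or str(c).strip() == "":
            continue
        labels.append(CANAPE_LABELS[int(float(c))])
    return Counter(labels)


# Made-up codes, mixing numbers and strings as a CSV reader might return them.
counts = summarise([0, 0, "1", 3, 2, "", 3, None, 0])
print(counts)
```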

Biodiverse also provides individual indices for neo, palaeo and mixed in the event a user only wants to see which cells are in a specific class.  For example one might want to run a cluster analysis using only neo-endemism cells following the process described here.  


The same data as above but highlighting Palaeo-endemism cells in red.  All other cells containing data are in blue.  


A big advantage of generating CANAPE results within Biodiverse is that users can now explore the results using the functionality Biodiverse provides.  As an example, the next screen shot shows an exploration of the contribution of each clade on the tree in relation to the analysis groups (cells) (see more details about that process here and here).  

Each tree branch is coloured by the relative contribution of the clade subtending it to the PE score in the cell being hovered over (black dot in south-western WA).  This allows an understanding of which clade is driving the PE scores, and thus CANAPE, in a cell.  The visualisation process is explained in more detail here.  

Displaying the results in other systems

If you want to use the plots as part of a map then they can be exported to an RGB Geotiff.  Details of how to do this are in another post but the next two screenshots show the start and end.  

What about a different colour scheme?

The colour scheme used is from Mishler et al. (2014) where neo is red (new is hot), palaeo is blue (old is cold) and mixed is purple (which is between blue and red on a colour wheel).  

If you prefer a different colour scheme then you can export the data as you normally would, for example as CSV files or as non-RGB geotiffs, and recreate the plot to your own tastes.  
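As a rough illustration of the CSV route, the Python sketch below reads an exported table and attaches a colour of your choosing to each cell. The column names (Axis_0, Axis_1, CANAPE_CODE), the sample rows and the palette are all assumptions for the example, so check them against your own export. The resulting list can be handed to whatever plotting tool you prefer.

```python
import csv
import io

# Hypothetical palette, keyed by CANAPE_CODE (0 = not significant, 1 = neo,
# 2 = palaeo, 3 = mixed).  Replace with any colours you like.
MY_COLOURS = {0: "#f0f0f0", 1: "#d95f02", 2: "#1b9e77", 3: "#7570b3"}


def coloured_cells(csv_text, x_col="Axis_0", y_col="Axis_1",
                   code_col="CANAPE_CODE"):
    """Return (x, y, colour) tuples for every row that has a CANAPE code."""
    cells = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = row[code_col].strip()
        if raw == "":
            continue  # no CANAPE result for this cell
        code = int(float(raw))
        cells.append((float(row[x_col]), float(row[y_col]), MY_COLOURS[code]))
    return cells


# Made-up rows standing in for an exported file.
sample = """ELEMENT,Axis_0,Axis_1,CANAPE_CODE
125000:-3475000,125000,-3475000,2
175000:-3475000,175000,-3475000,0
225000:-3475000,225000,-3475000,
"""
cells = coloured_cells(sample)
print(cells)
```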

Changing the colours within Biodiverse would be very useful and contributions are always welcome.

What about the Super class?  

The system does not currently generate the Super class.  It can be added if there is demand.  

Do I have to run a new randomisation analysis to see the CANAPE list?  

The CANAPE lists are generated at the end of any sequence of randomisations.  If you already have a randomisation analysis then they can be created by running one additional iteration.  

If you are concerned that your analysis is already at 999 iterations then all you lose is a bit of numeric neatness as there are now 1001 realisations in total instead of 1000 (one original plus all the random ones).  This is unlikely to make any meaningful difference once that many iterations have been run.
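To put a number on that, the Python sketch below works out the largest amount by which a proportion of the form k/n can move when one more iteration is added. It is generic arithmetic rather than a description of how Biodiverse computes its significance scores, and the function name is made up for the example.

```python
def max_shift(n):
    """Largest possible change in a proportion k/n after one more iteration.

    After the extra iteration the proportion is either k/(n+1) or
    (k+1)/(n+1), depending on how the new realisation compares.
    """
    return max(
        max(abs(k / n - k / (n + 1)), abs(k / n - (k + 1) / (n + 1)))
        for k in range(n + 1)
    )


shift = max_shift(999)
print(shift)  # about 0.001, i.e. 1 / (n + 1)
```

A shift of that size only matters for cells sitting right on a significance threshold.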


Shawn Laffan


For more details about Biodiverse, see http://shawnlaffan.github.io/biodiverse/  

To see what else Biodiverse has been used for, see https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList 

You can also join the Biodiverse-users mailing list at https://groups.google.com/group/Biodiverse-users or start a discussion at https://github.com/shawnlaffan/biodiverse/discussions