Friday, 19 June 2026

Trees: visualise results from other spatial outputs

For a long time Biodiverse has allowed the user to visualise additional values on the phylogenetic tree in a spatial analysis tab.  These include the turnover of branches between two neighbour sets and indices related to tree branches, such as the weights used in a phylogenetic endemism calculation (another example is in this post).

This is very useful but, before version 5, it was limited to the set of lists in the spatial output being viewed.  

From version 5 of Biodiverse you can plot list results from any spatial output across all basedatas in your project.  A big advantage of this is that you can run one analysis, for example a randomisation to generate a CANAPE output.  Later you can run a calculation to see what the relative contribution of each clade in the tree is to each analysis window, without having to rerun the whole analysis to see the new list.  This can be in a clone of the basedata so the randomisations won't be out of sync across a basedata's outputs (Biodiverse warns about this).  

This is currently implemented as a menu option below the tree and map plots.  Unfortunately this means it is not as obvious as it could be, and this is something still being worked on.  There have also been minor changes since v5 was released, but only to how the selected list names are shown.

The screenshots below show it in operation for a very simple analysis that uses every cell in the basedata.  This allows every branch in the tree to be coloured, which works better as a demonstration.


The menu is at the lower right of the options below the map and tree plots.  The exact location depends on your screen size.  

Users can select from any list across all spatial outputs across all basedatas in the project.  In this case it is the PE weights in an analysis called Acacia_spatial0 in a basedata called Acacia1 (no, these are not informative names).

And the tree branches are coloured as requested.  

The lists can also be categorical outputs.  These are the results for a Range Weighted Branch Length Differences (RWiBaLD) analysis.  More details for that are in Mishler et al. 2026.

And that's pretty much it for the description.  More of the theory is discussed in the posts linked to above.  


----

Shawn Laffan

19-Jun-2026 


For more details about Biodiverse, see http://shawnlaffan.github.io/biodiverse/  


For a list of some of the analyses Biodiverse has been used for, see https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList 


You can also join the Biodiverse-users mailing list at https://groups.google.com/group/Biodiverse-users or start a discussion at https://github.com/shawnlaffan/biodiverse/discussions 


Biodiverse 5.99_001 development release

A new development release of Biodiverse (version 5.99_001) is now available.  

This is the first development release leading to version 6.  

Versions for Windows and Mac are available and can be accessed via https://github.com/shawnlaffan/biodiverse/wiki/Downloads

Installation instructions are at https://github.com/shawnlaffan/biodiverse/wiki/Installation

This version includes the ability to visualise label and tree branch ranges as polygons, as well as many computational efficiency improvements and GUI updates.  The list of changes is summarised at https://github.com/shawnlaffan/biodiverse/wiki/ReleaseNotes#version-5xx

For the full list of issues and changes leading to the 6.0 release, see https://github.com/shawnlaffan/biodiverse/milestone/23


Much of the documentation has now also been ported to a Quarto book system.  This is much more readable than the wiki system that was previously used.  

A set of links is at https://biogeospatial.github.io/ 


----

Shawn Laffan

19-Jun-2026 


For a list of some of the analyses Biodiverse has been used for, see https://biogeospatial.github.io/biodiverse-publication-list/ 


Tuesday, 4 November 2025

Biodiverse version 5.0 has been released

Biodiverse version 5.0 has now been released.  

Versions for Windows and Mac are available and can be accessed via https://github.com/shawnlaffan/biodiverse/wiki/Downloads

Installation instructions are at https://github.com/shawnlaffan/biodiverse/wiki/Installation

For the full list of issues and changes leading to the 5.0 release, see https://github.com/shawnlaffan/biodiverse/milestone/18

This version includes a complete rebuild of the plotting engine (the maps, trees and matrices), as well as many computational efficiency improvements.  The list of changes is summarised at https://github.com/shawnlaffan/biodiverse/wiki/ReleaseNotes#version-50

Version 5.0 contains 1120 source code commits across 199 files.


----

Shawn Laffan

04-Nov-2025




Tuesday, 5 November 2024

Randomisations: Curveball algorithm now in Biodiverse

Biodiverse supports a range of randomisations to assess significance of analysis results.  Most use cases in the published literature use the rand_structured algorithm, which is explained in this post, but several common algorithms are supported.  

One of the design principles of Biodiverse is to give the user choice.  To that end, the curveball algorithm is available from version 5.  

The publication describing Curveball is Strona et al. (2014).  The name is derived from a baseball card trading pastime popular in North America.  

The curveball algorithm is applied to a data set of items (species, genera, words, or some other set of identifiers).  In the common biodiversity case this is a sites by species matrix, transformed to a list of lists, e.g. a list of site lists, where each site list comprises its species (or vice versa).  These lists can be considered as sets.  At each iteration, two lists (sets of items) are randomly selected.  Any items found in both sets are ignored.  The rest can be swapped between the two sets, with the number swapped limited by the smaller number of unique items in the two sets to ensure after swapping that each set retains the same number of items it started with.  As an example, consider the case where set 1 has ten items, set 2 has eight, and there are six common items found in both lists.  This means two items can be swapped between the two lists.

The general formula for the number of possible swaps at an iteration is (min (|A|,|B|) - |A ∩ B|), where A and B are the two sets being considered, and the pipes || denote the lengths of the sets (the numbers of items they contain).   If one prefers to think in terms of dissimilarity measures where a is the number of shared items, b the number unique to set 1 and c the number unique to set 2, then the formula is (min (b,c)).  Purely as an aside, this is also part of the denominator in Simpson's dissimilarity index.  
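The trade described above can be sketched in a few lines of Python.  This is only an illustration and not the Biodiverse code: the function name curveball_trade is made up for this example, and it assumes the trade is done by pooling the unshared items, shuffling them, and dealing them back so each set keeps its original size.

```python
import random

def curveball_trade(set_a, set_b, rng=random):
    """One illustrative curveball trade between two sets (hypothetical helper).

    Shared items stay where they are.  Unshared items are pooled, shuffled
    and dealt back so that each set keeps its original size.
    """
    shared = set_a & set_b
    only_a = sorted(set_a - set_b)   # b in dissimilarity notation
    only_b = sorted(set_b - set_a)   # c in dissimilarity notation

    # min(b, c), which equals min(|A|, |B|) - |A ∩ B|
    n_swappable = min(len(only_a), len(only_b))

    pool = only_a + only_b
    rng.shuffle(pool)
    new_a = shared | set(pool[:len(only_a)])
    new_b = shared | set(pool[len(only_a):])
    return new_a, new_b, n_swappable


# The worked example from the text: ten items, eight items, six shared.
common = {"s1", "s2", "s3", "s4", "s5", "s6"}
set_1 = common | {"a1", "a2", "a3", "a4"}
set_2 = common | {"b1", "b2"}

new_1, new_2, n = curveball_trade(set_1, set_2, random.Random(1))
print(n)                        # 2, i.e. min(10, 8) - 6
print(len(new_1), len(new_2))   # 10 8, the set sizes are unchanged
```

The sorting is only there to make the result repeatable for a given seed.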

The curveball algorithm is related to the independent swaps algorithm.  The chief advantage of curveball over independent swaps is that, because it swaps as many items as it can at each iteration, it converges on a randomised result much faster.  Curveball also avoids the main pitfall of the independent swaps algorithm where a pair can be selected that cannot be swapped, thus "wasting" an iteration (swap attempt).  

Curveball does, however, have the same issue that independent swaps has in that the user needs to specify the number of iterations over which swaps will be attempted.  Too few and the resulting matrix will not be sufficiently random.  Too many and time will be "wasted".  This is addressed in Biodiverse by optionally tracking which of the original matrix entries have been swapped, and stopping when all have been done (the stop_on_all_swapped parameter).  This has some overhead in the tracking but generally this should be balanced by the time saved by running fewer iterations overall.  For those interested, the default number of swaps is the same as for the independent swaps algorithm, which is twice the number of non-zero matrix entries (twice the sum of the lengths of all lists).
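As a rough sketch of how the iteration limit and the early stopping can fit together, the following Python function runs trades until either the limit is reached or every original entry has been moved at least once.  It is not the Biodiverse implementation.  The function name, the max_iters argument and the way entries are tracked are all assumptions made for the example; only the stop_on_all_swapped name and the default of twice the number of non-zero entries come from the description above.

```python
import random

def curveball_randomise(site_lists, max_iters=None,
                        stop_on_all_swapped=True, seed=None):
    """Illustrative curveball loop over a dict of site -> list of items."""
    rng = random.Random(seed)
    sites = {name: set(items) for name, items in site_lists.items()}
    names = sorted(sites)

    # Default limit: twice the number of non-zero matrix entries
    n_entries = sum(len(items) for items in sites.values())
    if max_iters is None:
        max_iters = 2 * n_entries

    # Original (site, item) entries that have not yet been moved (assumed tracking)
    unswapped = {(name, item) for name, items in sites.items() for item in items}

    iters = 0
    while iters < max_iters:
        if stop_on_all_swapped and not unswapped:
            break
        iters += 1
        name_1, name_2 = rng.sample(names, 2)
        set_1, set_2 = sites[name_1], sites[name_2]
        only_1 = sorted(set_1 - set_2)
        only_2 = sorted(set_2 - set_1)
        if not only_1 or not only_2:
            continue  # nothing can be traded between this pair

        shared = set_1 & set_2
        pool = only_1 + only_2
        rng.shuffle(pool)
        new_1 = shared | set(pool[:len(only_1)])
        new_2 = shared | set(pool[len(only_1):])

        # Any item that left its site counts as swapped
        for item in set_1 - new_1:
            unswapped.discard((name_1, item))
        for item in set_2 - new_2:
            unswapped.discard((name_2, item))
        sites[name_1], sites[name_2] = new_1, new_2

    return sites, iters


data = {
    "site1": ["a", "b", "c"],
    "site2": ["c", "d"],
    "site3": ["a", "d", "e"],
    "site4": ["b"],
}
randomised, n_iters = curveball_randomise(data, seed=42)
print(n_iters <= 18)   # True: the limit is 2 * 9 entries
```

Note that an item found in every site can never be traded, so in that case the loop simply runs to the iteration limit.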

Accessing the curveball algorithm in Biodiverse is the same as for any of the randomisations.  Open the Randomisation tab, select rand_curveball as the randomise function, select the number of randomisation iterations and any other algorithm specific parameters, then press Go (see image below).  The results are in the same format as always (e.g. see here, here and here).

Since it is just another algorithm, all the common options are available (another new change in version 5 is that more options are available across all algorithms in the GUI - see issue 946).  Users can define regions that are randomised separately before reassembly for analysis, including some that are not to be randomised.  One can also add some of the randomised results to the project to inspect them.

In terms of speed, curveball is faster than rand_structured.  This is largely due to there being less book-keeping required.  However, as with independent swaps, curveball can only be applied on a per-cell basis.  It does not extend to spatially structured randomisations like rand_structured does.  One could ensure swap candidates come from within some local neighbourhood, but this is a different model to something like a diffusion process or a random walk.  (Update 20241109: This has been implemented and will be available in V5.)

All that is needed to run the curveball algorithm is to choose rand_curveball as the "Randomise function".  Other parameters are set as usual.


And that's pretty much it for the description.  If you want to read more randomisation related blog posts then check out the posts tagged with the randomisation label.  


----

Shawn Laffan

05-Nov-2024




Monday, 4 November 2024

Plotting indices with divergent colour schemes

Many diversity indices have numerical distributions that are divergent, i.e. they are centred on some value and the interesting bit is the magnitude of the differences away from that value.  A simple example is z-scores, where the data are centred on a value of zero and the values indicate how many standard deviations above or below the expected value the input data are.   These have been plotted using a divergent scheme since version 4.1, as described here.

However, one can also have indices that are simple differences, and also ratios where 1 is the centre of the distribution, and values of 1/2 and 2 are the same magnitude difference from the centre.  The relative phylogenetic diversity and endemism indices are examples of the latter.  

From version 5, Biodiverse plots difference and ratio indices using a divergent colour scheme.    These use the same colour range as the z-scores but plotted along a continuous scale instead of as ordinal classes.  

The colouring happens automatically based on metadata stored with the indices (incidentally, much of the GUI is built using this metadata).  

Colours are also scaled so the most extreme "high" colour is equivalent to the most extreme "low" colour, i.e. if the range of difference values is -5 to 1 then the colours are assigned to the range -5 to 5, and the same for -1 to 5.  This is also accounted for when the data are log scaled or percentile trimmed to de-emphasise extreme values.  
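The symmetric scaling for difference indices amounts to taking the larger absolute distance from the centre and using it on both sides.  A minimal Python sketch follows.  The function name is hypothetical and this is not how Biodiverse is written; it only reproduces the arithmetic described above.

```python
def symmetric_difference_limits(lo, hi, centre=0.0):
    """Return colour limits that are symmetric about the centre value."""
    extent = max(abs(lo - centre), abs(hi - centre))
    return centre - extent, centre + extent


print(symmetric_difference_limits(-5, 1))   # (-5.0, 5.0)
print(symmetric_difference_limits(-1, 5))   # (-5.0, 5.0)
```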

A useful point to note is that the colour schemes can be flipped, so if one prefers blue as extreme positive values then this can be done under the Map menu at the left of the display.  

An example is below to compare the old behaviour with the new.  


Prior to version 5, ratio data were plotted using the same colour scheme as any other data, making it difficult to interpret the relative magnitude of the index values across cells.  These are the Relative Phylogenetic Diversity results for the Acacia data set of Mishler et al. (2014), scaled to emphasise the inner 90% of the distribution (i.e. the upper 5% are assigned the same colour, so too the lower 5%).  This is the interval [0.406, 0.896], which means red cells include ratios <1 which is not ideal.  Compare with the next figure.    




The same data as in the previous figure, but now using a divergent colour scheme.  Biodiverse knows this is a ratio index, so assigns colours accordingly.  Red cells have ratios exceeding 1, blue cells less than 1.  Ratios close to 1 are in yellow.  The colours are assigned to the interval [0.406,2.463], where 2.463=1/0.406.  This means one can be sure red cells have ratios exceeding 1, and there is less chance of misinterpreting the results.  
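For ratio indices the same idea applies on a log scale, since 1/2 and 2 are the same distance from 1.  The Python sketch below reproduces the interval used in the figure.  The function names are made up for this example, and the linear mapping of log ratios to a colour position is an assumption rather than a description of the Biodiverse code.

```python
import math

def symmetric_ratio_limits(lo, hi):
    """Return ratio limits that are symmetric about 1 on a log scale.

    Also returns the log-scale extent for use when positioning values.
    """
    extent = max(abs(math.log(lo)), abs(math.log(hi)))
    return math.exp(-extent), math.exp(extent), extent

def colour_position(ratio, extent):
    """Map a ratio to a position in [0, 1], with a ratio of 1 at 0.5."""
    pos = (math.log(ratio) + extent) / (2 * extent)
    return min(1.0, max(0.0, pos))


low, high, extent = symmetric_ratio_limits(0.406, 0.896)
print(round(low, 3), round(high, 3))   # 0.406 2.463
print(colour_position(1.0, extent))    # 0.5, the centre of the colour scale
```

With these limits a ratio of 0.896 sits a little below the centre of the scale, rather than at the extreme end as it did with the original interval.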





It is not shown here, but the metadata is also stored for tree-based indices so divergent colours are assigned to the tree branches where appropriate.  More details about that process are in this post.  


----

Shawn Laffan

04-Nov-2024




GUI: Polygon overlays (and underlays)

Since its first release, Biodiverse has supported plotting of polygon and polyline feature class data (from shapefiles).  The support has been very basic: users could only plot the outlines of polygons, although the colours could be changed.  

This has worked well overall, but there are times when the linework from the feature data gets in the way of the cells being plotted.  There are also times when it is useful to plot polygons as solid fills instead of just as the outline.  From version 5 of Biodiverse it is possible to do just this.  

The process is relatively simple.  If a polygon overlay is loaded then it is listed twice in the selection window, once for lines and once for solid fill (with no outline).  The default choice is polylines, which is the current behaviour.  Users then have the option of plotting one overlay above or below the cells.



Colours can be assigned in the usual way.  In this next selection window, the polygon data will be displayed below the cells using a grey colour (grey is quite useful as it does not visually dominate when coloured cells are used).  




Polygon data are displayed as a solid grey fill, under the cells.  In this case it makes it more obvious where there are unsampled regions.  (Cell outlines have also been turned off using the map menu).


Other uses for polygon overlays include plotting ocean polygons over terrestrial cells to cover parts of cells that are in the sea (and vice versa for marine data).  


There is no doubt more work to be done, for example plotting more than one layer at a time, but it is a useful improvement.  If more complex plotting is needed then this is when it is best to leverage the power of GIS software.  


----

Shawn Laffan

04-Nov-2024




Thursday, 29 February 2024

Publications using Biodiverse in 2023

2024 is moving quickly, so here is a list of publications from 2023 that used Biodiverse.

If you want to see the full list (211 at the time of writing), then go to https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList




  • Aragón-Parada, J., Carrillo-Reyes, P., Rodríguez, A., Munguía-Lino, G., Salinas-Rodríguez, M. M. and De-Nova, J. A. (2023) Spatial phylogenetics of the flora in the Sierra Madre del Sur, Mexico: Evolutionary puzzles in tropical mountains. Journal of Biogeography, 50, 1679-1691.

  • Copilaș-Ciocianu, D., Sidorov, D., & Šidagytė-Copilas, E. (2023). Global distribution and diversity of alien Ponto-Caspian amphipods. Biological Invasions, 25, 179-195.

  • de Pedro, D., Ceccarelli, F.S., Vandame, R. et al. (2023) Congruence between species richness and phylogenetic diversity in North America for the bee genus Diadasia (Hymenoptera: Apidae). Biodiversity and Conservation, 32, 4445–4459.

  • Dlamini, W.M.D. and Loffler, L. (2023). Tree Species Diversity and Richness Patterns Reveal High Priority Areas for Conservation in Eswatini. In: Dhyani, S., Adhikari, D., Dasgupta, R., Kadaverugu, R. (eds) Ecosystem and Species Habitat Modeling for Conservation and Restoration. Springer, Singapore.

  • Erst, A.S., Baasanmunkh, S., Tsegmed, Z. et al. (2023) Hotspot and conservation gap analysis of endemic vascular plants in the Altai Mountain Country based on a new global conservation assessment. Global Ecology and Conservation, 47, e02647

  • Fernandes, N.B.G., Moraes, A.M. and Milward-de-Azevedo, M.A. (2023) Diversity of the Passiflora L. in the Serra do Mar ecoregion and the relationships with environmental gradients, South and Southeast, Brazil. Acta Botanica Brasilica, 37, e20220314.

  • Flores-Argüelles, A., López-Ferrari, A.R., & Espejo-Serna, A. (2023). Geographic distribution and endemism of Bromeliaceae from the Western Sierra-Coast region of Jalisco, Mexico. Botanical Sciences, 101, 527-543.

  • Flores-Tolentino, M. et al. (2023). Delimitación geográfica y florística de la provincia fisiográfica de la Depresión del Balsas, México, con énfasis en el bosque tropical estacionalmente seco. Revista mexicana de biodiversidad, 94, e944985.

  • Francisco-Gutiérrez, A., Ruiz-Sanchez, E. and Lira-Noriega, A. (2023) Biogeography and conservation assessments of the species of Lamourouxia (Orobanchaceae). Acta Botanica Mexicana, 130, e2213.

  • González-Orozco, C.E. (2023) Unveiling evolutionary cradles and museums of flowering plants in a neotropical biodiversity hotspot. Royal Society Open Science, 10, 230917.

  • González-Orozco, C.E., Diaz-Giraldo, R.A. and Rodriguez-Castañeda, C. (2023) An early warning for better planning of agricultural expansion and biodiversity conservation in the Orinoco high plains of Colombia. Frontiers in Sustainable Food Systems, 7.

  • González-Orozco, C., Osorio-Guarín, J., & Yockteng, R. (2023). Phylogenetic diversity of cacao (Theobroma cacao L.) genotypes in Colombia. Plant Genetic Resources, 20, 203-214.

  • González-Orozco, C.E. & Parra-Quijano, M. (2023) Comparing species and evolutionary diversity metrics to inform conservation. Diversity and Distributions, 29, 224-231.

  • González-Orozco, C. E., Reyes-Herrera, P. H., Sosa, C. C., Torres, R. T., Manrique-Carpintero, N. C., Lasso-Paredes, Z., Cerón-Souza, I. and Yockteng, R. (in press). Wild relatives of potato (Solanum L. sec. Petota) poorly sampled and unprotected in Colombia. Crop Science.

  • Guo, WY., Serra-Diaz, J.M., Eiserhardt, W.L. et al. (2023) Climate change and land use threaten global hotspots of phylogenetic endemism for trees. Nature Communications, 14, 6950.

  • Mardones, D. and Scherson, R.A. (2023) Hotspots within a hotspot: evolutionary measures unveil interesting biogeographic patterns in threatened coastal forests in Chile. Botanical Journal of the Linnean Society, 202, 433–448.

  • McCurry, M.R., Park, T., Coombs, E.J., Hart, L.J., Laffan, S. (2023) Latitudinal gradients in the skull shape and assemblage structure of delphinoid cetaceans. Biological Journal of the Linnean Society, 138, 470-480.

  • Miller, J.T., Prentice, E., Bui, E.N., Knerr, N., Mishler, B.D., Schmidt-Lebuhn, A.N., González-Orozco, C.E., Laffan, S. W. (2023). Banksia (Proteaceae) contains less phylogenetic diversity than expected in Southwestern Australia. Journal of Systematics and Evolution, 61, 957-966.

  • Molina-Paniagua, M.E., Alves de Melo, P.H., Ramírez-Barahona, S., Monro, A.K., Burelo-Ramos, C.M., Gómez-Domínguez, H., et al. (2023) How diverse are the mountain karst forests of Mexico? PLoS ONE 18, e0292352.

  • Nicolau, G.K. and Edwards, S. (2023) Diversity and endemism of Southern African Gekkonids linked with the escarpment has implications for conservation priorities. Diversity, 15, 306.

  • Ortiz-Brunel J.P., Ochoterena H., Moore M.J., Aragón-Parada J., Flores J., Munguía-Lino G., Rodríguez A., Salinas-Rodríguez M.M. and Flores-Olvera H. (2023) Patterns of Richness and Endemism in the Gypsicolous Flora of Mexico. Diversity, 15, 522.

  • Ramírez-Verdugo, P., Tapia, A., Forest, F. and Scherson, R.A. (2023) Evolutionary diversity of the endemic genera of the vascular flora of Chile and its implications for conservation. PLoS ONE 18(7): e0287957. https://doi.org/10.1371/journal.pone.0287957

  • Ruiz-Sanchez, E., Munguía-Lino, G., Pianissola, E.M., Ely, F. and Clark, L.G. (2023) Richness and endemism in Chusquea subg. Swallenochloa (Poaceae), a Neotropical subgenus adapted to temperate conditions. Phytotaxa, 609, 180-194.

  • Villaseñor, J. L., Ortiz, E., & Hernández-Flores, M. M. (2023). The vascular plant species endemic or nearly endemic to Puebla, Mexico. Botanical Sciences, 101, 1207-1221.

  • Wang, C., Zhu, S., Jiang, X., Chen, S., Xiao, Y., Zhao, Y., Yan, Y. and Wen, Y. (2023) Spatio-temporal variation of species richness and phylogenetic diversity patterns for spring ephemeral plants in northern China. Global Ecology and Conservation, 48, e02752.

  • Ye, C. et al. (2023) Geographical distribution and conservation strategy of national key protected wild plants of China. iScience, 26, 107364.

  • Zhang, H., Chen, S.-C., Bonser, S.P., Hitchcock, T., & Moles, A.T. (2023). Factors that shape large-scale gradients in clonality. Journal of Biogeography, 50, 827-837

  • Zhou, R., Ci, X., Hu, J., Zhang, X., Cao, G., Xiao, J., Liu, Z., Li, L., Thornhill, A.H., Conran, J.G. and Li, J. (2023) Transitional areas of vegetation as biodiversity hotspots evidenced by multifaceted biodiversity analysis of a dominant group in Chinese evergreen broad-leaved forests. Ecological Indicators, 147, 110001


Shawn Laffan

29-Feb-2024