tag:blogger.com,1999:blog-76689825908357320652024-03-20T00:10:24.083+11:00Biodiverse analysis softwareShawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.comBlogger71125tag:blogger.com,1999:blog-7668982590835732065.post-85167785328259487872024-02-29T09:19:00.001+11:002024-02-29T09:19:47.786+11:00Publications using Biodiverse in 2023<p>2024 is moving quickly, so here is a list of publications from 2023 that used Biodiverse.</p><p>If you want to see the full list (211 at the time of writing), then go to <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList" target="_blank">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a></p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/" target="_blank">http://shawnlaffan.github.io/biodiverse/</a></p><p><br /></p><p>
</p><ul>
<li>
<p>Aragón-Parada, J., Carrillo-Reyes, P., Rodríguez, A., Munguía-Lino,
G., Salinas-Rodríguez, M. M. and De-Nova, J. A. (2023) Spatial
phylogenetics of the flora in the Sierra Madre del Sur, Mexico:
Evolutionary puzzles in tropical mountains. <a href="https://doi.org/10.1111/jbi.14679" rel="nofollow">Journal of Biogeography, 50, 1679-1691</a>.</p>
</li>
<li>
<p>Copilaș-Ciocianu, D., Sidorov, D., & Šidagytė-Copilas, E. (2023).
Global distribution and diversity of alien Ponto-Caspian amphipods. <a href="https://doi.org/10.1007/s10530-022-02908-1" rel="nofollow">Biological Invasions, 25, 179-195</a>.</p>
</li>
<li>
<p>de Pedro, D., Ceccarelli, F.S., Vandame, R. et al. (2023) Congruence
between species richness and phylogenetic diversity in North America for
the bee genus Diadasia (Hymenoptera: Apidae). <a href="https://doi.org/10.1007/s10531-023-02706-8" rel="nofollow">Biodiversity and Conservation, 32, 4445–4459</a>.</p>
</li>
<li>
<p>Dlamini, W.M.D. and Loffler, L. (2023). Tree Species Diversity and
Richness Patterns Reveal High Priority Areas for Conservation in
Eswatini. In: Dhyani, S., Adhikari, D., Dasgupta, R., Kadaverugu, R.
(eds) <a href="https://doi.org/10.1007/978-981-99-0131-9_8" rel="nofollow">Ecosystem and Species Habitat Modeling for Conservation and Restoration</a>. Springer, Singapore.</p>
</li>
<li>
<p>Erst, A.S., Baasanmunkh, S., Tsegmed, Z. et al. (2023) Hotspot and
conservation gap analysis of endemic vascular plants in the Altai
Mountain Country based on a new global conservation assessment. <a href="https://doi.org/10.1016/j.gecco.2023.e02647" rel="nofollow">Global Ecology and Conservation, 47, e02647</a>.</p>
</li>
<li>
<p>Fernandes, N.B.G., Moraes, A.M. and Milward-de-Azevedo, M.A. (2023)
Diversity of the Passiflora L. in the Serra do Mar ecoregion and the
relationships with environmental gradients, South and Southeast, Brazil.
<a href="https://doi.org/10.1590/1677-941X-ABB-2022-0314" rel="nofollow">Acta Botanica Brasilica, 37, e20220314</a>.</p>
</li>
<li>
<p>Flores-Argüelles, A., López-Ferrari, A.R., & Espejo-Serna, A.
(2023). Geographic distribution and endemism of Bromeliaceae from the
Western Sierra-Coast region of Jalisco, Mexico. <a href="https://doi.org/10.17129/botsci.3169" rel="nofollow">Botanical Sciences, 101, 527-543</a>.</p>
</li>
<li>
<p>Flores-Tolentino, M. et al. (2023). Delimitación geográfica y
florística de la provincia fisiográfica de la Depresión del Balsas,
México, con énfasis en el bosque tropical estacionalmente seco. <a href="https://doi.org/10.22201/ib.20078706e.2023.94.4985" rel="nofollow">Revista mexicana de biodiversidad, 94, e944985</a>.</p>
</li>
<li>
<p>Francisco-Gutiérrez, A., Ruiz-Sanchez, E. and Lira-Noriega,
A. (2023) Biogeography and conservation assessments of the species of <em>Lamourouxia</em> (Orobanchaceae). <a href="https://doi.org/10.21829/abm130.2023.2213" rel="nofollow">Acta Botanica Mexicana 130: e2213</a>.</p>
</li>
<li>
<p>González-Orozco, C.E. (2023) Unveiling evolutionary cradles and
museums of flowering plants in a neotropical biodiversity hotspot. <a href="http://doi.org/10.1098/rsos.230917" rel="nofollow">Royal Society Open Science, 10, 230917</a>.</p>
</li>
<li>
<p>González-Orozco, C.E., Diaz-Giraldo, R.A. and Rodriguez-Castañeda, C.
(2023) An early warning for better planning of agricultural expansion
and biodiversity conservation in the Orinoco high plains of Colombia. <a href="https://doi.org/10.3389/fsufs.2023.1192054" rel="nofollow">Frontiers in Sustainable Food Systems, 7</a>.</p>
</li>
<li>
<p>González-Orozco, C., Osorio-Guarín, J., & Yockteng, R. (2023).
Phylogenetic diversity of cacao (Theobroma cacao L.) genotypes in
Colombia. <a href="https://doi.org/10.1017/S1479262123000047" rel="nofollow">Plant Genetic Resources, 20, 203-214</a>.</p>
</li>
<li>
<p>González-Orozco, C.E. & Parra-Quijano, M. (2023) Comparing
species and evolutionary diversity metrics to inform conservation. <a href="https://doi.org/10.1111/ddi.13660" rel="nofollow">Diversity and Distributions, 29, 224-231</a>.</p>
</li>
<li>
<p>González-Orozco, C. E., Reyes-Herrera, P. H., Sosa, C. C., Torres, R.
T., Manrique-Carpintero, N. C., Lasso-Paredes, Z., Cerón-Souza, I. and
Yockteng, R. (in press). Wild relatives of potato (<em>Solanum</em> L. sec. <em>Petota</em>) poorly sampled and unprotected in Colombia. <a href="https://doi.org/10.1002/csc2.21143" rel="nofollow">Crop Science</a>.</p>
</li>
<li>
<p>Guo, WY., Serra-Diaz, J.M., Eiserhardt, W.L. et al. (2023) Climate
change and land use threaten global hotspots of phylogenetic endemism
for trees. <a href="https://doi.org/10.1038/s41467-023-42671-y" rel="nofollow">Nature Communications, 14, 6950</a>.</p>
</li>
<li>
<p>Mardones, D. and Scherson, R.A. (2023) Hotspots within a hotspot:
evolutionary measures unveil interesting biogeographic patterns in
threatened coastal forests in Chile. <a href="https://doi.org/10.1093/botlinnean/boad002" rel="nofollow">Botanical Journal of the Linnean Society, 202, 433–448</a>.</p>
</li>
<li>
<p>McCurry, M.R., Park, T., Coombs, E.J., Hart, L.J., Laffan, S. (2023)
Latitudinal gradients in the skull shape and assemblage structure of
delphinoid cetaceans. <a href="https://doi.org/10.1093/biolinnean/blac128" rel="nofollow">Biological Journal of the Linnean Society, 138, 470-480</a>.</p>
</li>
<li>
<p>Miller, J.T., Prentice, E., Bui, E.N., Knerr, N., Mishler, B.D.,
Schmidt-Lebuhn, A.N., González-Orozco, C.E., Laffan, S. W. (2023). <em>Banksia</em> (Proteaceae) contains less phylogenetic diversity than expected in Southwestern Australia. <a href="https://doi.org/10.1111/jse.13019" rel="nofollow">Journal of Systematics and Evolution, 61, 957-966</a>.</p>
</li>
<li>
<p>Molina-Paniagua, M.E., Alves de Melo, P.H., Ramírez-Barahona, S.,
Monro, A.K., Burelo-Ramos, C.M., Gómez-Domínguez, H., et al. (2023) How
diverse are the mountain karst forests of Mexico? <a href="https://doi.org/10.1371/journal.pone.0292352" rel="nofollow">PLoS ONE 18, e0292352</a>.</p>
</li>
<li>
<p>Nicolau, G.K. and Edwards, S. (2023) Diversity and endemism of
Southern African Gekkonids linked with the escarpment has implications
for conservation priorities. <a href="https://doi.org/10.3390/d15020306" rel="nofollow">Diversity, 15, 306</a>.</p>
</li>
<li>
<p>Ortiz-Brunel J.P., Ochoterena H., Moore M.J., Aragón-Parada J.,
Flores J., Munguía-Lino G., Rodríguez A., Salinas-Rodríguez M.M. and
Flores-Olvera H. (2023) Patterns of Richness and Endemism in the
Gypsicolous Flora of Mexico. <a href="https://doi.org/10.3390/d15040522" rel="nofollow">Diversity, 15, 522</a>.</p>
</li>
<li>
<p>Ramírez-Verdugo, P., Tapia, A., Forest, F. and Scherson, R.A. (2023)
Evolutionary diversity of the endemic genera of the vascular flora of
Chile and its implications for conservation. PLoS ONE 18(7): e0287957. <a href="https://doi.org/10.1371/journal.pone.0287957" rel="nofollow">https://doi.org/10.1371/journal.pone.0287957</a></p>
</li>
<li>
<p>Ruiz-Sanchez, E., Munguía-Lino, G., Pianissola, E.M., Ely, F. and Clark, L.G. (2023) Richness and endemism in <em>Chusquea</em> subg. <em>Swallenochloa</em> (Poaceae), a Neotropical subgenus adapted to temperate conditions. <a href="https://doi.org/10.11646/phytotaxa.609.3.2" rel="nofollow">Phytotaxa, 609, 180-194</a>.</p>
</li>
<li>
<p>Villaseñor, J. L., Ortiz, E., & Hernández-Flores, M. M. (2023).
The vascular plant species endemic or nearly endemic to Puebla, Mexico. <a href="https://doi.org/10.17129/botsci.3299" rel="nofollow">Botanical Sciences, 101, 1207-1221</a>.</p>
</li>
<li>
<p>Wang, C., Zhu, S., Jiang, X., Chen, S., Xiao, Y., Zhao, Y., Yan, Y.
and Wen, Y. (2023) Spatio-temporal variation of species richness and
phylogenetic diversity patterns for spring ephemeral plants in northern
China. <a href="https://doi.org/10.1016/j.gecco.2023.e02752" rel="nofollow">Global Ecology and Conservation, 48, e02752</a>.</p>
</li>
<li>
<p>Ye, C. et al. (2023) Geographical distribution and conservation strategy of national key protected wild plants of China. <a href="https://doi.org/10.1016/j.isci.2023.107364" rel="nofollow">iScience, 26, 107364</a>.</p>
</li>
<li>
<p>Zhang, H., Chen, S.-C., Bonser, S.P., Hitchcock, T., & Moles,
A.T. (2023). Factors that shape large-scale gradients in clonality. <a href="https://doi.org/10.1111/jbi.14577" rel="nofollow">Journal of Biogeography, 50, 827-837</a>.</p>
</li>
<li>
<p>Zhou, R., Ci, X., Hu, J., Zhang, X., Cao, G., Xiao, J., Liu, Z., Li,
L., Thornhill, A.H., Conran, J.G. and Li, J. (2023) Transitional areas
of vegetation as biodiversity hotspots evidenced by multifaceted
biodiversity analysis of a dominant group in Chinese evergreen
broad-leaved forests. <a href="https://doi.org/10.1016/j.ecolind.2023.110001" rel="nofollow">Ecological Indicators, 147, 110001</a>.</p>
</li></ul><p><br /></p><p>Shawn Laffan</p><p>29-Feb-2024</p><p><br /></p><p><br /></p>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-24910505428891202832024-02-03T16:14:00.000+11:002024-02-03T16:19:46.796+11:00Map side menu: The tree plot controls are now a separate submenu, and some new features<p>From version 5 of Biodiverse the tree plot controls in the left side menu are now their own submenu. This greatly simplifies the interface.</p><p>The displayed tree can now be exported, including the colours used when plotting the tree. Previously the colours were not stored so this was not possible. To export the colours corresponding to a specific cell, right click on that cell to fix the colouring in place. This stops any further updates until another cell is clicked on. The interface itself is unchanged, including the options to <a href="https://biodiverse-analysis-software.blogspot.com/2017/04/biodiverse-now-exports-tree-branch.html" target="_blank">export the colours</a> and an <a href="https://biodiverse-analysis-software.blogspot.com/2019/05/reproduce-spatial-plots-with-same.html" target="_blank">RGB geotiff of the spatial plot</a>. </p><p>In addition, there are several new plotting options that allow one to plot using equal and range weighted branch lengths. 
</p><p> </p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiI1rNdkf8a30KoSEqrQ13yE2nArclLdBS1CjRePdZZEDdLXNxIoOR-xCNmW3jkDVa2Am_GH3uuXIRR-hJejmdKkUMxN4LS92vae6D_lXo0-8oNOeRBAwZdsbDtfavvbrtCn91Fn3RGSesj0N5vUDzUddouLEdGk6R2PsHbYn1divYbzCs2CGsv4KRAVT2L/s1920/image2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1020" data-original-width="1920" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiI1rNdkf8a30KoSEqrQ13yE2nArclLdBS1CjRePdZZEDdLXNxIoOR-xCNmW3jkDVa2Am_GH3uuXIRR-hJejmdKkUMxN4LS92vae6D_lXo0-8oNOeRBAwZdsbDtfavvbrtCn91Fn3RGSesj0N5vUDzUddouLEdGk6R2PsHbYn1divYbzCs2CGsv4KRAVT2L/w640-h340/image2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">It is now possible to plot the tree using normal branch lengths, depth and also equal branch length and range weighted. 
</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjd8UUHDG_MvALwoB35FsS08kgtt2Q9lIhyphenhyphenvTv7pOoRF8al8UtbSTTIEn5vEobtbmqAxv_6havLVLI0Pi_yZD_5qRCYidkbneGpk_kWZzUBcDRe8YZsdzjWBwlDHXVsj1fQbXt7UdNpJLVUt8yHw3PxDScB6MSS4AwqRZDDk785g9ONl4uBNmkaUrVX-gIn/s1920/image3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1020" data-original-width="1920" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjd8UUHDG_MvALwoB35FsS08kgtt2Q9lIhyphenhyphenvTv7pOoRF8al8UtbSTTIEn5vEobtbmqAxv_6havLVLI0Pi_yZD_5qRCYidkbneGpk_kWZzUBcDRe8YZsdzjWBwlDHXVsj1fQbXt7UdNpJLVUt8yHw3PxDScB6MSS4AwqRZDDk785g9ONl4uBNmkaUrVX-gIn/w640-h340/image3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The equal branch length tree is the alternate tree in the CANAPE protocol</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirpcOm7bgrWSS_yJVeiZlS82J6qT-w2wANM5_t6TjO_30FL3dOkFCX0XBAtNYLFi8x708zuVMO5En7bkDbxu0Ofmgzr9dpHARhxW242wZqWZL52Ymzqk7DzISQeh3kdr3-ow-jImrpLPYki4iBqYgdIu9gtoNdGFVJWBqlr6A4UgRVWn53wYftuwZws8qp/s1920/image4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1020" data-original-width="1920" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirpcOm7bgrWSS_yJVeiZlS82J6qT-w2wANM5_t6TjO_30FL3dOkFCX0XBAtNYLFi8x708zuVMO5En7bkDbxu0Ofmgzr9dpHARhxW242wZqWZL52Ymzqk7DzISQeh3kdr3-ow-jImrpLPYki4iBqYgdIu9gtoNdGFVJWBqlr6A4UgRVWn53wYftuwZws8qp/w640-h340/image4.png" width="640" 
/></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The range weighted tree can be used to understand how PE works. </td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiE9MXfNZoa3CizlCEMRioZ7iGmb8PBgqb2lw2C5a-r99cGe9pHxuT9mhyphenhyphenZWMUpacVUh0taQfatmA9INXPRThbv4h78jg7MIyApTZnTRLhx2VN0EMWfGLLZ6H4D9n5rQfKJ3oNa7CxxjS0OP-AUAYxC9Fhfnu4cQ6okFwp5-1BKYxjDPKJ5IK45H8aj72PV/s1920/image5.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1020" data-original-width="1920" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiE9MXfNZoa3CizlCEMRioZ7iGmb8PBgqb2lw2C5a-r99cGe9pHxuT9mhyphenhyphenZWMUpacVUh0taQfatmA9INXPRThbv4h78jg7MIyApTZnTRLhx2VN0EMWfGLLZ6H4D9n5rQfKJ3oNa7CxxjS0OP-AUAYxC9Fhfnu4cQ6okFwp5-1BKYxjDPKJ5IK45H8aj72PV/w640-h340/image5.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The range weighted equal branch length tree can be used to understand the RPE index used in CANAPE. 
</td></tr></tbody></table><br /><div><div>----</div><div><p>Shawn Laffan</p><p>03-Feb-2024</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div></div><div><br /></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-30504563774545487862024-02-03T15:24:00.011+11:002024-02-03T16:19:44.859+11:00Tree panels: colour the tree using any list from spatial outputs across the project <p>It has long been possible to colour the tree branches in the spatial tab. However, it was only possible to use a list from the spatial analysis being plotted. </p><p>From version 5 the interface has been changed to enable selection of any list from spatial outputs across the project. Where before the system had a simple drop down list, it is now a menu with submenus for each basedata and then each of its spatial outputs. 
</p><p></p><div class="separator" style="clear: both; text-align: center;"><br /></div><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7UjJG0snywOUrle3RajoZmmob6EHLB6Zl8Xsa2RehFx-Q19K-hz0u3LZF75M-RH8DZZxuYxIEmX9OOdxQxmXH0g0K7cl1e7dEBPYNUZndNHFaUAdoVSGxPv9WCfeqyzlEjwLNRE6nuZaDVp8k8JioYP0aj7dKr5ldn03aRVX6n6_sumnCAvlIR0BANl9Y/s1920/image2.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1020" data-original-width="1920" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7UjJG0snywOUrle3RajoZmmob6EHLB6Zl8Xsa2RehFx-Q19K-hz0u3LZF75M-RH8DZZxuYxIEmX9OOdxQxmXH0g0K7cl1e7dEBPYNUZndNHFaUAdoVSGxPv9WCfeqyzlEjwLNRE6nuZaDVp8k8JioYP0aj7dKr5ldn03aRVX6n6_sumnCAvlIR0BANl9Y/w640-h340/image2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The tree colour list selection is now a menu that allows users to choose any list across all spatial outputs in the project </td></tr></tbody></table><br /><p>As with most widgets in the GUI, the menu entries are described in the tooltip. That text is duplicated below. </p><p>The first (default) option shows the paths connecting the labels in the neighbour sets used for the analysis. When there is one such set all branches are coloured blue. When there are two such sets blue denotes branches only in the first set, red denotes those only in the second set, and black denotes those in both. From these one can see the turnover of branches between the groups (cells) in each neighbour set.</p><p>The next set of menu options are list indices in the spatial output that belongs to this tab. The remainder are lists across other spatial outputs in the project, organised by their basedata objects. These are in the same order as in the Outputs tab. 
Basedatas and outputs with no list indices are not shown.</p><p>If a branch is not in the list then it is highlighted using a default colour (usually black). If the selected output has no labels that are also on the tree then no highlighting is done (all branches remain black).</p><p>Right clicking on a group (cell) fixes the highlighting in place, stopping changes to the branch colouring as the mouse is hovered over other groups. This allows the tree to be exported with the current colouring (<a href="https://biodiverse-analysis-software.blogspot.com/2024/02/map-side-menu-tree-plot-controls-are.html" target="_blank">another new option in version 5</a>).</p><div><div>----</div><div><p>Shawn Laffan</p><p>03-Feb-2024</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div></div><div><br /></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-72029974421581627282024-02-03T14:59:00.002+11:002024-02-03T16:19:43.051+11:00Trimming basedatas has been generalisedIt has long been possible to trim the basedata labels to keep only those that match either the selected tree or selected matrix. <div><br /></div><div>From Version 5 (actually 4.99_002 if you like development versions) it is possible to trim using a different basedata. 
The interface has also been generalised in the process. <div><br /></div><div>There's not much to it, so here are some screenshots to demonstrate the process. </div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCMYN5Zivz1rN2Bss7IL0FJYjAOyOeEvY4kgF4o1QWQ7tJW1MUm-NjrwwcmAJEo70IWypJGKm8Eo1qlk5yKoQmb5u9YK4_fz_diBIu3vH6dckQZdkfRTVFBjSLQejxGQfCxe5oIk_9r3_LJ4P5g4DoI1D_pijyb4jvHDaMnTXwgzn1NZS5JWFNfAuAr2pd/s320/menu_cropped.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="320" data-original-width="279" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCMYN5Zivz1rN2Bss7IL0FJYjAOyOeEvY4kgF4o1QWQ7tJW1MUm-NjrwwcmAJEo70IWypJGKm8Eo1qlk5yKoQmb5u9YK4_fz_diBIu3vH6dckQZdkfRTVFBjSLQejxGQfCxe5oIk_9r3_LJ4P5g4DoI1D_pijyb4jvHDaMnTXwgzn1NZS5JWFNfAuAr2pd/w349-h400/menu_cropped.png" width="349" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Generalised trimming is accessed from the basedata menu</td></tr></tbody></table><br /><div><br /><div><br /></div><br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjedRJPfLfoJeNJc_3kpxY1TA4et5qdQIlpN53zM9_2cdxZvqqK7QMnHO2lbfZypIjXC1vJh2TeJMV58NQ93Euz3VAamZMPPKez7E9v4xu7O9fMr1M2hoCdAMPuromk7JJ0LpR3l9_7gT_xHyjQK5kN8iPZWJCRyVVPgfXg8964w0DQIoYJKUB5nXBqeIPI/s436/select.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="201" data-original-width="436" height="148" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjedRJPfLfoJeNJc_3kpxY1TA4et5qdQIlpN53zM9_2cdxZvqqK7QMnHO2lbfZypIjXC1vJh2TeJMV58NQ93Euz3VAamZMPPKez7E9v4xu7O9fMr1M2hoCdAMPuromk7JJ0LpR3l9_7gT_xHyjQK5kN8iPZWJCRyVVPgfXg8964w0DQIoYJKUB5nXBqeIPI/s320/select.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">It has the usual interface where one can specify a new name. "Trimming a clone" ensures it operates on a copy. "Delete matching" allows one to invert the trim, i.e. to keep only the labels that do not match.</td></tr></tbody></table><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLHr2xubETD7WvZ7yVpyRLBNdPw5A0egyAb6FcjSuO5lE2X5pxxDm-D4sk1CO78WzftSPFdPA4MRJIdPALJ31TJzjSbvJejkyNTHBsJ2e2fu8Kf6Vxbq5KYf8d0KvW0mP5tnj8qLFQREShhIiAdXTGF-a9RSmfx8gw0505RIxXJdCRbhgC6E8_CJQPLSY5/s276/select_source.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="276" data-original-width="232" height="276" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLHr2xubETD7WvZ7yVpyRLBNdPw5A0egyAb6FcjSuO5lE2X5pxxDm-D4sk1CO78WzftSPFdPA4MRJIdPALJ31TJzjSbvJejkyNTHBsJ2e2fu8Kf6Vxbq5KYf8d0KvW0mP5tnj8qLFQREShhIiAdXTGF-a9RSmfx8gw0505RIxXJdCRbhgC6E8_CJQPLSY5/s1600/select_source.png" width="232" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Any of the basedatas, trees or matrices in the project can be selected to use as the label source. 
</td></tr></tbody></table><br /><div><br /></div><div><div><div>----</div><div><p>Shawn Laffan</p><p>03-Feb-2024</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div></div><div><br /></div></div></div></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-30370741931844873782023-12-02T14:30:00.001+11:002024-02-03T16:19:41.266+11:00Biodiverse now calculates the CANAPE super classSince version 4.3, Biodiverse has <a href="https://biodiverse-analysis-software.blogspot.com/2022/10/biodiverse-now-calculates-canape-for-you.html" target="_blank">calculated and plotted the CANAPE results</a> when the relevant calculations have been run. <div><br /></div><div>However, it did not calculate the super class when first implemented. Now it does.</div><div><br /></div><div>From version 5, Biodiverse calculates all CANAPE classes when a randomisation is run for an analysis that includes phylogenetic endemism and relative phylogenetic endemism. 
</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgyqgOdbBjka1AlY-kakIVJel21gW_u_fdXinnK-DWVSE_6PXPUSlMyQrIdp7ReB5oUaZUX-CFGry_T3uEVQhwViEszY6zaGATQdgSbH2JrlA1N5AZMkpWBIVMCJzZB6X08CMRNrLhjurbOGM9VMIttfkJXhJhtzUW5hE14PbnVf6Vxjepxen0STRMNZnTY/s1252/canape_super_class.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="790" data-original-width="1252" height="405" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgyqgOdbBjka1AlY-kakIVJel21gW_u_fdXinnK-DWVSE_6PXPUSlMyQrIdp7ReB5oUaZUX-CFGry_T3uEVQhwViEszY6zaGATQdgSbH2JrlA1N5AZMkpWBIVMCJzZB6X08CMRNrLhjurbOGM9VMIttfkJXhJhtzUW5hE14PbnVf6Vxjepxen0STRMNZnTY/w640-h405/canape_super_class.png" width="640" /></a></div><br /><div>Note that the CANAPE classes are only updated after at least one randomisation iteration has been run. If you have an existing randomisation then you can run one more iteration to trigger the calculation. Otherwise you can run a new randomisation with the same settings. This should not take long for most analyses, assuming they are consistent with the sizes of data sets in <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList" target="_blank">existing publications</a>. </div><div><br /></div><div>If you are wondering why it was not plotted in the first place, it was largely because the plotting system needed some re-engineering to allow for additional legend labels. This was done when the <a href="https://biodiverse-analysis-software.blogspot.com/2023/02/plotting-z-score-indices-and.html" target="_blank">z-score</a> and <a href="https://biodiverse-analysis-software.blogspot.com/2023/04/changes-to-randomisation-results-p-rank.html" target="_blank">p-rank</a> plotting was implemented, a little while after the initial CANAPE plotting. 
</div><div><br /></div><div><br /></div><div><div>----</div><div><p>Shawn Laffan</p><p>02-Dec-2023</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div></div><div><br /></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-68817185611148776382023-05-01T12:38:00.003+10:002023-05-01T12:38:11.915+10:00Biodiverse 4.3 has been released<p> </p><h4>Biodiverse version 4.3 has now been released. </h4><p>Versions for Windows, Mac and Linux (Ubuntu) are available and can be accessed via <a href="https://github.com/shawnlaffan/biodiverse/wiki/Downloads">https://github.com/shawnlaffan/biodiverse/wiki/Downloads</a></p><p><br /></p><p>Installation instructions are at <a href="https://github.com/shawnlaffan/biodiverse/wiki/Installation">https://github.com/shawnlaffan/biodiverse/wiki/Installation</a></p><p><br /></p><p>This release contains a small number of bug fixes and improved functionality. 
</p><div><div>For the full list of issues and changes leading to the 4.3 release, see <a href="https://github.com/shawnlaffan/biodiverse/milestone/21" target="_blank">https://github.com/shawnlaffan/biodiverse/milestone/21</a></div><div><br /></div><div>Main changes:</div><div><br /></div><div>GUI:</div></div><blockquote style="border: none; margin: 0 0 0 40px; padding: 0px;"><div><div style="text-align: left;">z-score plotting has been fixed (colours were reversed). <a href="https://github.com/shawnlaffan/biodiverse/issues/857" target="_blank">Issue 857</a>.</div></div></blockquote><div><div>Randomisations</div></div><blockquote style="border: none; margin: 0 0 0 40px; padding: 0px;"><div><div style="text-align: left;">The p-rank calculations now generate ranks for all defined values. The GUI also now colours the values, similar to the z-scores. <a href="https://github.com/shawnlaffan/biodiverse/issues/856" target="_blank">Issue 856</a>. <a href="https://biodiverse-analysis-software.blogspot.com/2023/04/changes-to-randomisation-results-p-rank.html" target="_blank">More details in the blog post</a>.</div></div></blockquote><div><div>Spatial conditions</div></div><blockquote style="border: none; margin: 0 0 0 40px; padding: 0px;"><div><div style="text-align: left;">The sp_points_in_same_poly_shape condition is now faster when any points do not intersect any polygons. 
See commit <a href="https://github.com/shawnlaffan/biodiverse/commit/3ca2703a9943bcc2ed5ff331cca6b1f6ea447fbe" rel="" target="_blank">3ca2703</a>.</div></div></blockquote><div><br /></div><div><br /></div><div><br /></div><div>----</div><div><div><p>Shawn Laffan</p><p>01-May-2023</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div><p><br /></p></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-33449904211121422222023-04-27T13:35:00.000+10:002023-04-27T13:35:38.389+10:00Changes to randomisation results - the p-rank data <p> </p><p>Randomisations in Biodiverse produce a range of outputs. These are kept in a range of lists, differing by name (<a href="https://github.com/shawnlaffan/biodiverse/wiki/AnalysisTypes#where-do-the-randomisation-results-go-and-what-do-they-mean" target="_blank">see the help system</a>). </p><p>One of the lists that is generated is the p-ranks. This is essentially the same as the P_ values in the main randomisation lists but where the low values account for ties so one can be sure the values represent the relative ranking of the observed value against those generated from the randomised data. 
For example, the significance of a low value should account for any ties.</p><p>The p-ranks were implemented a few versions ago and are <a href="https://biodiverse-analysis-software.blogspot.com/2016/08/easier-to-use-randomisation-results.html" target="_blank">detailed in this blog post</a>. Due to how the plotting was set up at the time, only values in the outer 10% of the distribution were retained. This helped identify which groups contained significant results without a major update to the display system, but in the end was probably confusing. Now that the <a href="https://biodiverse-analysis-software.blogspot.com/2023/02/plotting-z-score-indices-and.html" target="_blank">z-score plotting </a>has been implemented, the system has the infrastructure to handle the full range of values. </p><h3 style="text-align: left;">So what has changed? </h3><p>Two things: the calculation of values and how they are plotted. </p><p>Note that the set of cells that can be regarded as significant using the standard alpha threshold of 0.05 for high or low values is unchanged. All that has changed is the number of cells with defined values and how they are displayed in the GUI. </p><h3 style="text-align: left;">The calculation </h3><p>Put simply, all values are now retained. Any "P_" value of 0.5 or less is adjusted to account for the number of ties. Expressed as pseudocode it is:</p><p><span style="font-family: courier;">if P_index > 0.5</span></p><p><span style="font-family: courier;"> p_rank = P_index </span></p><p><span style="font-family: courier;">else </span></p><p><span style="font-family: courier;"> p_rank = ((C_index + T_index) / Q_index) </span></p><p>where "index" is whichever <a href="https://biodiverse-analysis-software.blogspot.com/2023/02/plotting-z-score-indices-and.html" target="_blank">index </a>is being compared at the time. 
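</p><p>As a rough illustration, the pseudocode above can be written as a small Python function. This is only a sketch and not the Biodiverse source: the function name <span style="font-family: courier;">p_rank</span> and its argument names are hypothetical, and it assumes that the C_, T_ and Q_ lists hold, respectively, the number of randomisations the observed value exceeded, the number of ties, and the number of comparisons, as described in the help system.</p>

```python
from typing import Optional


def p_rank(p_index: Optional[float], c_index: int, t_index: int,
           q_index: int) -> Optional[float]:
    """Sketch of the p-rank rule described in the post (hypothetical helper).

    p_index: the P_ value for the index (None if undefined)
    c_index: assumed count of randomisations the observed value exceeded (C_)
    t_index: assumed count of ties with the observed value (T_)
    q_index: assumed number of comparisons made (Q_)
    """
    if p_index is None:
        # Undefined values stay undefined.
        return None
    if p_index > 0.5:
        # High values are kept as they are.
        return p_index
    # Low values are adjusted so that ties count towards the rank.
    return (c_index + t_index) / q_index


if __name__ == "__main__":
    print(p_rank(0.01, 10, 5, 1000))
    print(p_rank(0.99, 990, 5, 1000))
```

<p>Under these assumptions, an observed value that exceeds 10 of 1000 randomised values and ties with 5 more gets a p-rank of 0.015 rather than 0.010. 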
</p><p>This makes post-hoc calculation of compound indices like CANAPE easier (although remember that Biodiverse <a href="https://biodiverse-analysis-software.blogspot.com/2022/10/biodiverse-now-calculates-canape-for-you.html" target="_blank">now does that for you</a>). </p><h3 style="text-align: left;">The display</h3><div>The addition of the z-score plotting means that the infrastructure for the plotting is in place, so it was not too difficult to re-use it to display percentile classes instead. This is applied to the p-rank lists by default. </div><div><br /></div><div>Compare the two plots below and consider which is easier to work with. </div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbcJd9WJ07McDyS-9R9l2nu_CVgxPj1w0MIzG2E9HX2NYukHuDyECEkaHjT9D3I0HtpuScXzW6Udg8ntAT3IN-CI73tszLvLrshXIDQv7HQmo-Kd7gkfS9OckBQ_4eJreJpBIjShQhign2KoANbweMoz86vnKpG8vKpXXJdXhwGVU6eZQMFbw0BuF3Kw/s1306/image2.png" style="margin-left: auto; margin-right: auto; text-align: center;"><img border="0" data-original-height="776" data-original-width="1306" height="380" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbcJd9WJ07McDyS-9R9l2nu_CVgxPj1w0MIzG2E9HX2NYukHuDyECEkaHjT9D3I0HtpuScXzW6Udg8ntAT3IN-CI73tszLvLrshXIDQv7HQmo-Kd7gkfS9OckBQ_4eJreJpBIjShQhign2KoANbweMoz86vnKpG8vKpXXJdXhwGVU6eZQMFbw0BuF3Kw/w640-h380/image2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The p-rank plotting in Biodiverse version 4.2 and earlier works, but it is difficult to see which cells are in specific percentile bands. For example, which of these cells are in the outer 5%? 
</td></tr></tbody></table><div><br /></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjY-G67l1nTl88MKmHH20CvAnJP3wxh5pwazodap6AdNyS9N_ciRCy4P4IxNSNjrII6k-ranygu4ha_5XcsR48Td0iC0Bx-0y5IfcRjWH6LTSj32OKx-1zji4NXI0Tvfo3ZASN0wuZSS4wYv_ATUTAuAheze2FytjDsBPnSCVmf4gP43cWMM7hLfUKXzw/s1000/image3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="601" data-original-width="1000" height="384" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjY-G67l1nTl88MKmHH20CvAnJP3wxh5pwazodap6AdNyS9N_ciRCy4P4IxNSNjrII6k-ranygu4ha_5XcsR48Td0iC0Bx-0y5IfcRjWH6LTSj32OKx-1zji4NXI0Tvfo3ZASN0wuZSS4wYv_ATUTAuAheze2FytjDsBPnSCVmf4gP43cWMM7hLfUKXzw/w640-h384/image3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Indices in the p-rank lists are now plotted as percentile classes. Compare with the plot above. </td></tr></tbody></table><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div>As with other plots, the coloured cells can be <a href="https://biodiverse-analysis-software.blogspot.com/2019/05/reproduce-spatial-plots-with-same.html" target="_blank">exported as RGB geotiffs </a>to display in a GIS or other plotting system. 
<br /><p><br /></p><p>----</p><p>Shawn Laffan</p><p>27-Apr-2023</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/" target="_blank">http://shawnlaffan.github.io/biodiverse/</a> </p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList" target="_blank">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users" target="_blank">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions" target="_blank">https://github.com/shawnlaffan/biodiverse/discussions</a> </p><p><br /></p>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-32158432248834976422023-04-02T21:17:00.000+10:002023-04-02T21:17:11.035+10:00Publications using Biodiverse in 2022<p>2023 is now in full swing, so here is a list of publications from 2022 that used Biodiverse.</p><p>If you want to see the full list (183 at the time of writing), then go to <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList" target="_blank">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a></p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/" target="_blank">http://shawnlaffan.github.io/biodiverse/</a></p><p><br /></p><ul>
<li>
<p>Amaral, D.T., Bonatelli, I.A.S., Romeiro-Brito, M., Moraes, E.M. and
Franco, F.F. (2022) Spatial patterns of evolutionary diversity in
Cactaceae show low ecological representation within protected areas. <a href="https://doi.org/10.1016/j.biocon.2022.109677" rel="nofollow">Biological Conservation, 273, 109677</a>.</p>
</li>
<li>
<p>Ávila-González, H., González-Gallegos, J.G., Munguía-Lino, G. &
Castro-Castro, A. (2022) The genus Sisyrinchium (Iridaceae) in Sierra
Madre Occidental, Mexico: A new species, richness and distribution. <a href="https://doi.org/10.1600/036364422X16512564801641" rel="nofollow">Systematic Botany, 47, 319-334</a>.</p>
</li>
<li>
<p>Carter, B. E., Misiewicz, T. M. & Mishler, B. D. (2022). Spatial
phylogenetic patterns in the North American moss flora are shaped by
history and climate. <a href="https://doi.org/10.1111/jbi.14385" rel="nofollow">Journal of Biogeography, 49, 1327-1338</a>.</p>
</li>
<li>
<p>Chen, K., Khine, P.K., Yang, Z. and Schneider, H. (2022) Historical
plant records enlighten the conservation efforts of ferns and
Lycophytes’ diversity in tropical China. <a href="https://doi.org/10.1016/j.jnc.2022.126197" rel="nofollow">Journal for Nature Conservation, 68, 126197</a>.</p>
</li>
<li>
<p>Contreras-Medina, R., García-Martínez, A. I., Ramírez-Martínez, J.
C., Espinosa, D., Balam-Narváez, R., and Luna-Vega, I. (2021).
Biogeographic analysis of ferns and lycophytes in Oaxaca: A Mexican
beta-diverse area. <a href="https://doi.org/10.17129/botsci.2844" rel="nofollow">Botanical Sciences, 100, 204-222</a>.</p>
</li>
<li>
<p>Fernandes, N.B.G., de Menezes Yazbeck, G. & Milward-de-Azevedo,
M.A. (2022) Taxonomic diversity of Passifloraceae sensu stricto along
altitudinal gradient and on Serra dos Órgãos mountain slopes in
southeastern Brazil. <a href="https://doi.org/10.1590/2175-7860202273057" rel="nofollow">Rodriguésia, 73, e00702021</a>.</p>
</li>
<li>
<p>Gosper C.R., Percy-Bower J.M., Byrne M., Llorens T.M. & Yates
C.J. (2022) Distribution, Biogeography and Characteristics of the
Threatened and Data-Deficient Flora in the Southwest Australian
Floristic Region. <a href="https://doi.org/10.3390/d14060493" rel="nofollow">Diversity, 14, 493</a>.</p>
</li>
<li>
<p>Griffiths, D. (2022). Do the drivers and levels of isolation in fish
faunas differ across Atlantic and Pacific drainages in the Americas? <a href="https://doi.org/10.1111/jbi.14358" rel="nofollow">Journal of Biogeography, 49, 930-941</a>.</p>
</li>
<li>
<p>Gutiérrez-Rodríguez, B.E., Guevara, R., Angulo, D.F. et al. (2022)
Ecological niches, endemism and conservation of the species in <em>Selenicereus</em> (Hylocereeae, Cactaceae). <a href="https://doi.org/10.1007/s40415-022-00818-z" rel="nofollow">Brazilian Journal of Botany, 45, 1149–1160</a>.</p>
</li>
<li>
<p>Gutiérrez-Rodríguez, B.E., Vásquez-Cruz, M. and Sosa, V. (2022)
Phylogenetic endemism of the orchids of Megamexico reveals complementary
areas for conservation. <a href="https://doi.org/10.1016/j.pld.2022.03.004" rel="nofollow">Plant Diversity, 44, 351-359</a>.</p>
</li>
<li>
<p>Kong, H., Condamine, F.L., Yang, L., Harris, A.J., Feng, C., Wen, F.
and Kang, M. (2022) Phylogenomic and macroevolutionary evidence for
an explosive radiation of a plant genus in the Miocene. <a href="https://doi.org/10.1093/sysbio/syab068" rel="nofollow">Systematic Biology, 71, 589–609</a>.</p>
</li>
<li>
<p>Moreira-Muñoz, A., Palchetti, V.A., Morales-Fierro, V., Duval, V.S.,
Allesch-Villalobos, R., & González-Orozco, C.E. (2022) Diversity and
Conservation Gap Analysis of the Solanaceae of Southern South America.
<a href="https://doi.org/10.3389/fpls.2022.854372" rel="nofollow">Frontiers in Plant Science, 13</a>.</p>
</li>
<li>
<p>Murillo-Pérez, G., Rodríguez, A., Sánchez-Carbajal, D., Ruiz-Sanchez,
E., Carrillo-Reyes, P., Munguía-Lino, G. (2022) Spatial distribution of
species richness and endemism of Solanum (Solanaceae) in Mexico. <a href="https://doi.org/10.11646/phytotaxa.558.2.1" rel="nofollow">Phytotaxa, 558, 147–177</a>.</p>
</li>
<li>
<p>Olivares-Juárez, M.I., Burgos-Hernández, M. and Santiago-Alvarádo, M.
(2022) Patterns of Species Richness and Distribution of the Genus <em>Laelia</em> s.l. vs. <em>Laelia</em> s.s. (Laeliinae: Epidendroideae: Orchidaceae) in Mexico: Taxonomic Contribution and Conservation Implications. <a href="https://doi.org/10.3390/plants11202742" rel="nofollow">Plants, 11:2742</a>.</p>
</li>
<li>
<p>Paz, A., Silva, A.S. & Carnaval, A. (2022) A framework for
near-real time monitoring of diversity patterns based on indirect remote
sensing, with an application in the Brazilian Atlantic rainforest. <a href="https://doi.org/10.7717/peerj.13534" rel="nofollow">PeerJ, 10:e13534</a>.</p>
</li>
<li>
<p>Rivera-Martínez, R., Ramírez-Morillo, I.M., De-Nova, J.A.,
Carnevali, G., Pinzón, J.P., Romero-Soler, K.J. & Raigoza, N. (2022)
Spatial phylogenetics in Hechtioideae (Bromeliaceae) reveals recent
diversification and dispersal. <a href="https://doi.org/10.17129/botsci.2975" rel="nofollow">Botanical Sciences, 100, 692-709</a>.</p>
</li>
<li>
<p>Silva, D.C., Oliveira, H.F.M., Zangrandi, P.L. and Domingos, F.M.C.B.
(2022) Flying Over Amazonian Waters: The Role of Rivers on the
Distribution and Endemism Patterns of Neotropical Bats. <a href="https://doi.org/10.3389/fevo.2022.774083" rel="nofollow">Frontiers in Ecology and Evolution, 10:774083</a>.</p>
</li>
<li>
<p>Wang, Q., Huang, J., Zang, R., Li, Z. and El-Kassaby, Y. A. (2022).
Centres of neo- and paleo-endemism for Chinese woody flora and their
environmental features. <a href="https://doi.org/10.1016/j.biocon.2022.109817" rel="nofollow">Biological Conservation, 276, 109817</a>.</p>
</li>
<li>
<p>Yang, X., Qin, F., Xue, T., Xia, C., Gadagkar, S. R., & Yu, S.
(2022). Insights into plant biodiversity conservation in large river
valleys in China: A spatial analysis of species and phylogenetic
diversity. <a href="https://doi.org/10.1002/ece3.8940" rel="nofollow">Ecology and Evolution, 12, e8940</a>.</p>
</li>
<li>
<p>Yang, X., Zhang, W., Qin, F., et al. (2022). Biodiversity priority areas and conservation strategies for seed plants in China. <a href="https://doi.org/10.3389/fpls.2022.962609" rel="nofollow">Frontiers in Plant Science, 13, 962609</a>.</p>
</li>
<li>
<p>Zhang, W., Bussmann, R.W., Li, J., Liu, B., Xue, T., Yang, X., Qin,
F., Liu, H. and Yu, S. (2022) Biodiversity hotspots and conservation
efficiency of a large drainage basin: Distribution patterns of species
richness and conservation gaps analysis in the Yangtze River Basin,
China. <a href="https://doi.org/10.1111/csp2.12653" rel="nofollow">Conservation Science and Practice, 4, e12653</a>.</p>
</li>
<li>
<p>Zhang, Y., Qian, L., Chen, X., Sun, L., Sun, H. and Chen, J. (2022)
Diversity patterns of cushion plants on the Qinghai-Tibet Plateau: a
basic study for future conservation efforts on alpine ecosystems. <a href="https://doi.org/10.1016/j.pld.2021.09.001" rel="nofollow">Plant Diversity, 44, 231-242</a>.</p>
</li>
<li>
<p>Zhang, X.X., Ye, J.F., Laffan, S.W., Mishler, B.D., Thornhill, A.H.,
Lu, L.M. et al. (2022) Spatial phylogenetics of the Chinese angiosperm
flora provides insights into endemism and conservation. <a href="https://doi.org/10.1111/jipb.13189" rel="nofollow">Journal of Integrative Plant Biology, 64, 105-117</a>.</p>
</li>
<li>
<p>Zhao, R., Xu, S., Song, P., Zhou, X., Zhang, Y. and Yuan, Y. (2022)
Distribution patterns of medicinal plant diversity and their
conservation in the Qinghai-Tibet Plateau. <a href="https://doi.org/10.17520/biods.2021385" rel="nofollow">Biodiversity Science, 30, 21385</a>.</p>
</li></ul><p><br /></p><p>Shawn Laffan</p><p>02-Apr-2023</p><p><br /></p><p><br /></p><p> </p>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-44062354568595536032023-03-29T12:06:00.003+11:002023-03-29T12:06:53.231+11:00Biodiverse version 4.2 has been released<p> </p><h4>Biodiverse version 4.2 has now been released. </h4><p>Versions for Windows, Mac and Linux (Ubuntu) are available and can be accessed via <a href="https://github.com/shawnlaffan/biodiverse/wiki/Downloads">https://github.com/shawnlaffan/biodiverse/wiki/Downloads</a></p><p><br /></p><p>Installation instructions are at <a href="https://github.com/shawnlaffan/biodiverse/wiki/Installation">https://github.com/shawnlaffan/biodiverse/wiki/Installation</a></p><p><br /></p><p>This release contains a small number of bug fixes and improved
functionality. For the full list of issues and changes leading to the
4.2 release, see <a href="https://github.com/shawnlaffan/biodiverse/milestone/20">https://github.com/shawnlaffan/biodiverse/milestone/20</a></p><p>Main changes:</p><p>
</p><ul>
<li>GUI
<ul>
<li>Branch highlighting in the View Labels tab works again. This was broken in version 4.1. <a href="https://github.com/shawnlaffan/biodiverse/issues/850">Issue #850</a>.</li>
</ul>
</li>
<li>Data imports
<ul>
<li>Raster imports now include the band labels if defined in multiband files. <a href="https://github.com/shawnlaffan/biodiverse/issues/852">Issue #852</a>.</li>
<li>Importing a raster now works when the nodata value is NaN. <a href="https://github.com/shawnlaffan/biodiverse/issues/851">Issue #851</a>.</li>
</ul>
</li></ul><div><br /></div><div><br /></div><div>----</div><div><div><p>Shawn Laffan</p><p>29-Mar-2023</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div><p><br /></p></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-29371843365069706422023-02-07T15:26:00.000+11:002023-02-07T15:26:00.968+11:00Biodiverse version 4.1 has been released<p> </p><h4>We are pleased to announce the release of Biodiverse version 4.1. 
</h4><p>Versions for Windows, Mac and Linux (Ubuntu) are available and can be accessed via <a href="https://github.com/shawnlaffan/biodiverse/wiki/Downloads">https://github.com/shawnlaffan/biodiverse/wiki/Downloads</a></p><p><br /></p><p>Installation instructions are at <a href="https://github.com/shawnlaffan/biodiverse/wiki/Installation">https://github.com/shawnlaffan/biodiverse/wiki/Installation</a></p><p><br /></p><p>Version 4.1 represents five issues closed across 96 source code commits.</p><p>Highlights of the changes since version 4.0 are at <a href="https://github.com/shawnlaffan/biodiverse/wiki/ReleaseNotes#version-41">https://github.com/shawnlaffan/biodiverse/wiki/ReleaseNotes#version-41</a>, and the related blog posts can be accessed via <a href="https://biodiverse-analysis-software.blogspot.com/search/label/Version41">https://biodiverse-analysis-software.blogspot.com/search/label/Version41</a></p><p>A more detailed listing of the closed issues is at <a href="https://github.com/shawnlaffan/biodiverse/milestone/19?closed=1">https://github.com/shawnlaffan/biodiverse/milestone/19?closed=1</a></p><div><br /></div><div>The main user-visible change is that z-score indices are now plotted with a divergent colour scale based on z-score significance thresholds. More details are in <a href="https://biodiverse-analysis-software.blogspot.com/2023/02/plotting-z-score-indices-and.html" target="_blank">this blog post</a>. 
</div><div><br /></div><div><br /></div><div>----</div><div><div><p>Shawn Laffan</p><p>07-Feb-2023</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div><p><br /></p></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-29955077681112337972023-02-07T15:23:00.003+11:002023-04-02T18:16:56.446+10:00Plotting z-score indices and randomisation results<p>From version 4.1, Biodiverse will plot indices it knows are z-scores using a divergent colour scheme, with values classified into intervals (adapted from the <a href="https://desktop.arcgis.com/en/arcmap/latest/tools/spatial-statistics-toolbox/what-is-a-z-score-what-is-a-p-value.htm" target="_blank">ArcGIS implementation</a>). 
This makes it much easier to see which locations are potentially significant given the expected values.</p><p>This process applies to indices like the <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#phylogenetic-and-nearest-taxon-distances-unweighted" target="_blank">Net Relatedness Index and Net Taxon Index</a>, all of the Gi* indices such as for <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#group-property-gi-statistics" target="_blank">group properties</a> and <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#label-property-gi-statistics" target="_blank">label properties</a> (<a href="https://biodiverse-analysis-software.blogspot.com/2018/08/analysing-trait-data.html" target="_blank">more on such analyses here</a>), as well as the <a href="https://biodiverse-analysis-software.blogspot.com/2020/11/randomisations-now-also-generate-z.html" target="_blank">z-scores generated by randomisation analyses</a>. It also applies to branches of a cluster dendrogram when indices have been calculated for each node/branch. </p><p>You can <a href="https://biodiverse-analysis-software.blogspot.com/2019/05/reproduce-spatial-plots-with-same.html" target="_blank">export the coloured images to geotiff</a> in the same way as for any data set.</p><p>There is not much more to it than that, so here are some images of what it looks like for a spatial analysis using the Acacia data set of <a href="https://doi.org/10.1038/ncomms5473" target="_blank">Mishler et al. (2014)</a>. 
</p><p><br /></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFIZfLhcspABLuGSbVJ8oK9w1x6d2sO7mEaDF32L3Z1uCmvkjpxQa0gwLKQ557n7fi1IbAis6DwS1jsh5lZ6bweZ31b7N2thEPWLmppBH1qKxPaKlWx91uI1kLY8DFnsE9216UrFWODl-ll_AGsf_f24HaHsEiKGyB13_QdObYue6mQRG8JEzEklDk7w/s1132/image1.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="632" data-original-width="1132" height="358" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFIZfLhcspABLuGSbVJ8oK9w1x6d2sO7mEaDF32L3Z1uCmvkjpxQa0gwLKQ557n7fi1IbAis6DwS1jsh5lZ6bweZ31b7N2thEPWLmppBH1qKxPaKlWx91uI1kLY8DFnsE9216UrFWODl-ll_AGsf_f24HaHsEiKGyB13_QdObYue6mQRG8JEzEklDk7w/w640-h358/image1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The Net Relatedness Index</td></tr></tbody></table><p></p><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><br /><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQlKYXXfvnDUJf8lDMf20G2keCTmg2zg3XV6MbmyH662HHZ_072hWRhVfKWEwX-slLehgPuoKyIJYBWeL9wS1N3NvYMeOQ5dONVYQ-F8mvQwkJLR8-MtL5pX34Q7WNwAenXfNVgoJ11YKFnY84CJSgWRHG_hmWRh7Cai1jzXT2d45YWUZv9nFc85Bb3g/s1132/image2.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="632" data-original-width="1132" height="358" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQlKYXXfvnDUJf8lDMf20G2keCTmg2zg3XV6MbmyH662HHZ_072hWRhVfKWEwX-slLehgPuoKyIJYBWeL9wS1N3NvYMeOQ5dONVYQ-F8mvQwkJLR8-MtL5pX34Q7WNwAenXfNVgoJ11YKFnY84CJSgWRHG_hmWRh7Cai1jzXT2d45YWUZv9nFc85Bb3g/w640-h358/image2.png" width="640" 
/></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Z-scores for Phylogenetic Diversity after a spatial randomisation process</td></tr></tbody></table><br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlBO_nqSJ0DMvlTV_d6FWYN4MJp3ffd9At9I0lD73Bv9XiYJPzxhFsE4MR78NkCUDH8aLbI9MqCONJXXwqc-geSs3Yzl0LpSInknMMpt9EO0aLPVHoimCzntJuTm3luXrZoMZzqbKtzEx_Iiv1h0h3_RIyx5vW7KKKk98nlb5Zxe8Bl-PUzDmC25E1xg/s1132/image3.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="632" data-original-width="1132" height="358" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlBO_nqSJ0DMvlTV_d6FWYN4MJp3ffd9At9I0lD73Bv9XiYJPzxhFsE4MR78NkCUDH8aLbI9MqCONJXXwqc-geSs3Yzl0LpSInknMMpt9EO0aLPVHoimCzntJuTm3luXrZoMZzqbKtzEx_Iiv1h0h3_RIyx5vW7KKKk98nlb5Zxe8Bl-PUzDmC25E1xg/w640-h358/image3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Net Relatedness Index calculated for the groups (cells) under each branch of a cluster analysis. 
Coloured cells are associated with the dendrogram branches that intersect the blue slider bar.</td></tr></tbody></table><br /><br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJF6ddsiP3FuprMzuKLQMGTzHbnOh9bzW3k6dGjAtT6IclM6ywNVTifJG81ua2xXHq_BMTyWlwN_2fKACxfz7dQsl0xesi6hiRuuYzFfLZBzQrHlZYHcdVmK70We37lDsq6_tou9ZkECbvER493Wqbd8pijv0UN3QcDektAE-d-21RwK9c4TkA7sTYSA/s1430/image4.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="745" data-original-width="1430" height="334" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJF6ddsiP3FuprMzuKLQMGTzHbnOh9bzW3k6dGjAtT6IclM6ywNVTifJG81ua2xXHq_BMTyWlwN_2fKACxfz7dQsl0xesi6hiRuuYzFfLZBzQrHlZYHcdVmK70We37lDsq6_tou9ZkECbvER493Wqbd8pijv0UN3QcDektAE-d-21RwK9c4TkA7sTYSA/w640-h334/image4.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The spatial distribution of PD significance (left) with branches occurring in a cell in south-west Western Australia (black dot) coloured by clade score significance against the same randomisation process.</td></tr></tbody></table><br /><p><br /></p><div>----</div><div><div><p>Shawn Laffan</p><p>07-Feb-2023</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a 
href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div><p><br /></p></div><p><br /></p><div class="separator" style="clear: both; text-align: center;"><br /></div><br />Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-80338201386691869202022-11-26T09:24:00.002+11:002022-11-26T09:24:40.708+11:00Biodiverse version 4.0 has been released<h4 style="text-align: left;">We are pleased to announce the release of Biodiverse version 4.0. </h4><p>Versions for Windows, Mac and Linux (Ubuntu) are available and can be accessed via <a href="https://github.com/shawnlaffan/biodiverse/wiki/Downloads">https://github.com/shawnlaffan/biodiverse/wiki/Downloads</a></p><p><br /></p><p>Installation instructions are at <a href="https://github.com/shawnlaffan/biodiverse/wiki/Installation">https://github.com/shawnlaffan/biodiverse/wiki/Installation</a></p><p><br /></p><p>Version 4.0 represents 52 issues closed across 752 source code commits. 
260 files have been changed.</p><p>Highlights of the changes since version 3.1 are at <a href="https://github.com/shawnlaffan/biodiverse/wiki/ReleaseNotes#version-40">https://github.com/shawnlaffan/biodiverse/wiki/ReleaseNotes#version-40</a>, and the related blog posts can be accessed via <a href="https://biodiverse-analysis-software.blogspot.com/search/label/Version4">https://biodiverse-analysis-software.blogspot.com/search/label/Version4</a></p><p>A more detailed listing of the closed issues is at <a href="https://github.com/shawnlaffan/biodiverse/milestone/17?closed=1">https://github.com/shawnlaffan/biodiverse/milestone/17?closed=1</a></p><div><br /></div><p><br /></p><div>-----</div><div><br /></div><div><div><p>Shawn Laffan</p><p>26-Nov-2022</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>For a list of some of the analyses Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div><p><br /></p><p><br /></p></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-26295083298315603752022-11-25T15:19:00.001+11:002022-11-25T15:20:06.545+11:00Export cluster groups to shapefile<p>Biodiverse Version 4 allows users to export their cluster analyses using the same grouping process as is used to colour the branches. </p><p>This can be convenient to reconstruct the clusters in a GIS or other graphics system. 
</p><p>One issue is that only the cluster polygons (or points) are exported. If you want to attach data from the clusters then you can export them to delimited text using the Table Grouped method (with the same grouping parameters) and use a database join to attach them to the shapefile. The main reason for this is that shapefiles have a limit of 10 characters for field names, and many indices in Biodiverse exceed this (as well as sometimes containing characters other than letters, numbers and the underscore). </p><p><span style="text-align: center;">Another point to be aware of is that each group (cell) is a separate polygon, so use a dissolve to merge them if you want to remove the internal boundaries.</span></p><p><span style="text-align: center;"><br /></span></p><p>Pictures are better than words so here are some screenshots. </p><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibOBpO2OtCXKBTXN_faxKh9txANz-s-Tg-N9tOSqoqEqEaX1xi_Fx6NJ3QobCJoGRk8pfHK1xp4ttKNCOl-MQLCJuOPt5Yhvpllv8lmtXDMzHZi6uDFTeJkHKbIjkHnfjdlx-l20v2bZCjDYQ1H5qDp1-OWdX_AWr5M-uHZvkS3iCKFk1RTtjof0_S7Q/s1002/image1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="632" data-original-width="1002" height="404" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibOBpO2OtCXKBTXN_faxKh9txANz-s-Tg-N9tOSqoqEqEaX1xi_Fx6NJ3QobCJoGRk8pfHK1xp4ttKNCOl-MQLCJuOPt5Yhvpllv8lmtXDMzHZi6uDFTeJkHKbIjkHnfjdlx-l20v2bZCjDYQ1H5qDp1-OWdX_AWr5M-uHZvkS3iCKFk1RTtjof0_S7Q/w640-h404/image1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">An example cluster analysis, in this case with six clusters coloured. 
</td></tr></tbody></table><br /><p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVD4Ez9jKRk7zOZD2oMHT60ym-dFjJecpuCaMyFwfbA9cVIg8w6wkdTOpGhpi2XyC7V8hmwtG9tDhHeG5MdLRhNb7SUlxBz01T4kVIUlw54rFc1Y6L6s1RCg1eLeYDEJ6yeQcB-ESrCxDlhedPvH6prelKYpEGKUUYH7V2W_S4XR16HSFtJ-7axhBwJQ/s1002/image2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="632" data-original-width="1002" height="404" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVD4Ez9jKRk7zOZD2oMHT60ym-dFjJecpuCaMyFwfbA9cVIg8w6wkdTOpGhpi2XyC7V8hmwtG9tDhHeG5MdLRhNb7SUlxBz01T4kVIUlw54rFc1Y6L6s1RCg1eLeYDEJ6yeQcB-ESrCxDlhedPvH6prelKYpEGKUUYH7V2W_S4XR16HSFtJ-7axhBwJQ/w640-h404/image2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The export option is in the usual place. It can also be accessed through the outputs tab. 
</td></tr></tbody></table><br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwv_uDiH5Qgd-lgoJcuBcQDqmebmBt2UlI-2R3y-Fh27YjXbQS6BQZy3XapenWQpqTp23SHyy1LyzHjt-iSZ9zt8XroswztzmJtSXW-gntIBATBw_Gb1ke3LfcbOpljnFPHp8sfjIWW6bhwW0EpOu7jlK83cGhPcRSjoVcRm2Q6e_IH2vK8kPrzYDG2A/s768/image3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="313" data-original-width="768" height="261" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwv_uDiH5Qgd-lgoJcuBcQDqmebmBt2UlI-2R3y-Fh27YjXbQS6BQZy3XapenWQpqTp23SHyy1LyzHjt-iSZ9zt8XroswztzmJtSXW-gntIBATBw_Gb1ke3LfcbOpljnFPHp8sfjIWW6bhwW0EpOu7jlK83cGhPcRSjoVcRm2Q6e_IH2vK8kPrzYDG2A/w640-h261/image3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">In this case the export is set to use six clusters to match the display, but you can choose whatever you like. Other options include selecting by depth or by distance from the root (by length or depth). 
</td></tr></tbody></table><br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><br /><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuMwkRfo7lQ3GWFzs9vh07gnAfBZuDg4nVqzW7a1QaseRJD81YgMgDBgCPK4yn97lo-ugqkOntARFuWtpQB1eiWBSt7x00VCXSHoHlXJSOzXG6YjObTFp-X5RlRdywfC5d3vJKiZQA0wWBdz3SDBK6vp4kY5AyymeIh2IWetdF-XoGpVULTauG-1cIiQ/s607/image4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="607" data-original-width="360" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuMwkRfo7lQ3GWFzs9vh07gnAfBZuDg4nVqzW7a1QaseRJD81YgMgDBgCPK4yn97lo-ugqkOntARFuWtpQB1eiWBSt7x00VCXSHoHlXJSOzXG6YjObTFp-X5RlRdywfC5d3vJKiZQA0wWBdz3SDBK6vp4kY5AyymeIh2IWetdF-XoGpVULTauG-1cIiQ/w380-h640/image4.png" width="380" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">And here we have a plot of the clusters. The colours differ but the clusters themselves are the same (and one can always update the colours). </td></tr></tbody></table><br /><div style="text-align: left;"><span style="text-decoration-line: line-through;"><br /></span></div><div style="text-align: left;"><span style="text-decoration-line: line-through;"><br /></span></div><div style="text-align: left;">If you want to use the grouped clusters in a spatial condition then it is easier to do so directly - <a href="https://biodiverse-analysis-software.blogspot.com/2022/05/use-clusters-in-spatial-conditions.html" target="_blank">see more details here</a>. 
</div><div style="text-align: left;"><br /></div><div style="text-align: left;">If you just want to replicate the display then it is better to export the spatial data to an RGB geotiff and the tree to nexus with the colours embedded - <a href="https://biodiverse-analysis-software.blogspot.com/2018/08/cluster-analyses-export-coloured.html" target="_blank">see geotiff details here</a> and the <a href="https://biodiverse-analysis-software.blogspot.com/2017/04/biodiverse-now-exports-tree-branch.html" target="_blank">tree details here</a>. </div></div><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: left;"><br /></div></div><p></p><p><span style="text-align: center;"><br /> </span>--------</p><div><br /></div><div><p>Shawn Laffan</p><p>25-Nov-2022</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div><p><br /></p><p><br /></p>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-88455516497962922052022-11-25T14:20:00.001+11:002022-11-25T14:20:56.097+11:00Trees: Merge single-child branches with their children<p>When Biodiverse is used to trim a tree to a subset of branches, for example to match the selected BaseData object, any branch with no remaining descendants is removed from 
the tree. All other branches are retained. </p><p>What this means is that some internal branches (nodes) can be left with only one child branch (node). These are referred to as single-child nodes, or knuckles. Retaining such nodes can be useful if some of the structure of the original tree needs to be kept, for example to indicate that there is phylogenetic data but that it has been removed from the tree. The counter to this is that most phylogenetic trees are samples and so are likely to be missing many branches anyway.</p><p>In the spirit of letting the user decide, Biodiverse version 4 supports the merger of internal branches with their children if they have only one child. </p><p>Names are important, and like many systems any node can be named in Biodiverse. In fact, all nodes have names but internal nodes default to a number with three trailing underscores (so "1___", "35___" etc). This is needed for many of the branch and clade level indices such as the <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#pe-clade-contributions" target="_blank">phylogenetic endemism clade contributions</a> and <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#pd-clade-loss" target="_blank">PD clade loss</a>, which report their results by node name. </p><p>The general rule when merging is that the name of the merged node is whichever node had a non-default name to begin with. If both have non-default names then a child that is a terminal wins. Otherwise the parent name is used. </p><p>The process is best demonstrated using images. 
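</p><p>For those who also like to see the logic written down, here is a minimal Python sketch of the naming rule before we get to the screenshots. It is not the Biodiverse code (which is written in Perl). The <span style="font-family: courier;">Node</span> class, the function names and the summing of the two branch lengths are all assumptions made for illustration, as is treating every node (including the root) the same way.</p>

```python
import re
from dataclasses import dataclass, field

# Default internal node names are a number plus three underscores, e.g. "1___".
DEFAULT_NAME = re.compile(r"^\d+___$")


@dataclass
class Node:
    """Hypothetical tree node, used only for this illustration."""
    name: str
    length: float = 0.0
    children: list["Node"] = field(default_factory=list)

    @property
    def is_terminal(self) -> bool:
        return not self.children


def has_default_name(node: Node) -> bool:
    return bool(DEFAULT_NAME.match(node.name))


def merged_name(parent: Node, child: Node) -> str:
    """Apply the naming rule described in the text above."""
    parent_named = not has_default_name(parent)
    child_named = not has_default_name(child)
    if child_named and not parent_named:
        return child.name            # only the child has a non-default name
    if child_named and parent_named and child.is_terminal:
        return child.name            # both named: a terminal child wins
    return parent.name               # otherwise the parent name is used


def merge_knuckles(node: Node) -> Node:
    """Merge each single-child node with its child, working down the tree."""
    while len(node.children) == 1:
        child = node.children[0]
        node.name = merged_name(node, child)
        node.length += child.length  # assumption: the merged branch keeps the total length
        node.children = child.children
    for child in node.children:
        merge_knuckles(child)
    return node


if __name__ == "__main__":
    tree = Node("1___", 0.0, [
        Node("2___", 1.0, [Node("A", 1.0)]),   # knuckle above a named tip
        Node("B", 2.0),
    ])
    merge_knuckles(tree)
    print([(c.name, c.length) for c in tree.children])
```

<p>The loop keeps absorbing children until a node has either no children or more than one, so chains of several knuckles in a row collapse into a single branch. 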
</p><p><br /></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><img data-original-height="593" data-original-width="940" height="405" src="https://blogger.googleusercontent.com/img/a/AVvXsEjREPU5BK0ONuM4pA8wTLoR7BhhPVwOClR8MvW0-hGGolHo5rl9LSmjuS5KZ3jKrZmnVsnY8xKTvSJo8_ai2DKbYChw46p32DR1GarKLvXdLB6C8A23bI5OFGDMiRz2f3Na_BAMcdMyXxYkzJPXegu6Gn60uWmeCJy_UwiT5_0e0wukFtRfW7ZnfxANmA=w640-h405" style="margin-left: auto; margin-right: auto;" width="640" /></td></tr><tr><td class="tr-caption" style="text-align: center;">An example tree plotted using depth instead of length to show the individual branches. The black branches are not in the basedata. <br /><br /></td></tr></tbody></table><p></p><div class="separator" style="clear: both; text-align: center;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgZRTkLKwXGAGlGy69uV7cSoyH25ui-ck3vSn2Dgrah-bHMD_X9-CbC9epLzzjnvAoaei58C9FiPRIGGzhKmk_SSbFzqUdVgi1BBB8kKXFbfUGcy2VduX2C9QzAdfDF72HceORIV23F-M2ymaCNMgEYaif5z74uL1l6ptiWGm5ztddCF7VBpFMeHgXQ0g" style="margin-left: auto; margin-right: auto;"><img data-original-height="216" data-original-width="350" height="123" src="https://blogger.googleusercontent.com/img/a/AVvXsEgZRTkLKwXGAGlGy69uV7cSoyH25ui-ck3vSn2Dgrah-bHMD_X9-CbC9epLzzjnvAoaei58C9FiPRIGGzhKmk_SSbFzqUdVgi1BBB8kKXFbfUGcy2VduX2C9QzAdfDF72HceORIV23F-M2ymaCNMgEYaif5z74uL1l6ptiWGm5ztddCF7VBpFMeHgXQ0g=w200-h123" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The tree trimming interface includes the option to merge single child nodes. In this case it is not selected. 
</td></tr></tbody></table></div><p></p><p><br /></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgisazXiJjMnwCGUsn8XSv1dBz-rfXQ8JaS3OyHBESRwI8go-5lU36JerD7VkJjgMNEycShI4tCzzsImLfPC6dnY2_qroBwcx-I9E_1m2sC3KPyfq_VCMiL89M-TQmn0wLYiRmFMOltrtaIzahsO713TdiL6ZF6d7A1i6OrbS1AFNDhHMqthzy_i2BSXw" style="margin-left: auto; margin-right: auto;"><img data-original-height="593" data-original-width="940" height="405" src="https://blogger.googleusercontent.com/img/a/AVvXsEgisazXiJjMnwCGUsn8XSv1dBz-rfXQ8JaS3OyHBESRwI8go-5lU36JerD7VkJjgMNEycShI4tCzzsImLfPC6dnY2_qroBwcx-I9E_1m2sC3KPyfq_VCMiL89M-TQmn0wLYiRmFMOltrtaIzahsO713TdiL6ZF6d7A1i6OrbS1AFNDhHMqthzy_i2BSXw=w640-h405" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The black branches from the previous screenshot have been deleted but one can see several branches that appear twice as long as the others. 
These are actually pairs of branches.</td></tr></tbody></table><br /><br /><p></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhBcwmDqwnRlucljFKZ7R7dyNQ8v5VCDbvoFQqiUsdHK9tWkzXCtSEJcWts_Rl6jSJQ5yl5DOjFLig_1tX4yo8rHNwVsISMqGEdmVfaYyYFrGXP2MNE_vcbwyEtjasagci9bcl4GC0HJUzcmOy5izh7pGEjejKKuEtHsmIwmP8mr_KVyd_w7inpdI59Bg" style="margin-left: auto; margin-right: auto;"><img data-original-height="216" data-original-width="350" height="123" src="https://blogger.googleusercontent.com/img/a/AVvXsEhBcwmDqwnRlucljFKZ7R7dyNQ8v5VCDbvoFQqiUsdHK9tWkzXCtSEJcWts_Rl6jSJQ5yl5DOjFLig_1tX4yo8rHNwVsISMqGEdmVfaYyYFrGXP2MNE_vcbwyEtjasagci9bcl4GC0HJUzcmOy5izh7pGEjejKKuEtHsmIwmP8mr_KVyd_w7inpdI59Bg=w200-h123" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Repeating the process above but this time merging the single child (knuckle) nodes. 
</td></tr></tbody></table><br /><br /><p></p><p><br /></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhWPgi9YUPoDspDEb7jCtP2GPZhY9HjGwEbj8n6kQ8zQCBH_m5SuwoYPQ6TuIjI46CT0UKOSecYdUtOPkUp27iswNmxPND28t-m9wyVAuDoWyMzymK5FU4ekTr0CPRgaeYUgHi-L8Ve4vfoeli46M3NjvraWp7sTMPSeh8s_l75YGE3k6mggxtr22Utfw" style="margin-left: auto; margin-right: auto;"><img data-original-height="593" data-original-width="940" height="404" src="https://blogger.googleusercontent.com/img/a/AVvXsEhWPgi9YUPoDspDEb7jCtP2GPZhY9HjGwEbj8n6kQ8zQCBH_m5SuwoYPQ6TuIjI46CT0UKOSecYdUtOPkUp27iswNmxPND28t-m9wyVAuDoWyMzymK5FU4ekTr0CPRgaeYUgHi-L8Ve4vfoeli46M3NjvraWp7sTMPSeh8s_l75YGE3k6mggxtr22Utfw=w640-h404" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">In this case all the branches are the same length because all single child branches have been merged with their children. </td></tr></tbody></table><br /><br /><p></p><p>The examples above all use the tree trimming process, but if you have a tree that already has knuckles or forget to merge them then you can also merge the nodes directly from the tree menu. 
</p><p><br /></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhd6pZHdUqIQN5xEdJOLlvALsgMjWe5wcxsppo5cR6wAxuihFRnJzSwJIhqjv4767Nr27HoWNAjdbDiLeD6rcCRlqD2MLZ_jBkBoHsx_7nsD_zGvi4S8PR6Pp0gOAweihHsO9YvnOW0csUI_CuHMq_I2Mb_bo9EXhxLyVM9EMH-RDwJYvPSYEVVplTNZQ" style="margin-left: auto; margin-right: auto;"><img data-original-height="593" data-original-width="940" height="404" src="https://blogger.googleusercontent.com/img/a/AVvXsEhd6pZHdUqIQN5xEdJOLlvALsgMjWe5wcxsppo5cR6wAxuihFRnJzSwJIhqjv4767Nr27HoWNAjdbDiLeD6rcCRlqD2MLZ_jBkBoHsx_7nsD_zGvi4S8PR6Pp0gOAweihHsO9YvnOW0csUI_CuHMq_I2Mb_bo9EXhxLyVM9EMH-RDwJYvPSYEVVplTNZQ=w640-h404" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Direct access to the merging process. </td></tr></tbody></table><br /><p></p><div>--------</div><div><br /></div><div><p>Shawn Laffan</p><p>25-Nov-2022</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p></div><p><br /></p>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-6107830665459495812022-10-25T16:41:00.000+11:002022-10-25T16:41:08.342+11:00Biodiverse 
now calculates CANAPE for you<p>The CANAPE protocol is one of the analyses Biodiverse is most commonly used for (see examples amongst the list of <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList" target="_blank">publications using Biodiverse</a>). </p><p>The method, or protocol, was originally described in <a href="https://doi.org/10.1038/ncomms5473" target="_blank">Mishler et al. (2014)</a> and is conceptually simple. Run an analysis that includes phylogenetic endemism and relative phylogenetic endemism, run those through a randomisation, and then categorise the results based on the significance score of the indices. This process is described in more detail in previous posts <a href="https://biodiverse-analysis-software.blogspot.com/2014/11/canape-categorical-analysis-of-palaeo.html" target="_blank">here</a> and <a href="https://biodiverse-analysis-software.blogspot.com/2014/11/do-it-yourself-canape.html" target="_blank">here</a>. </p><p>The main issue with the approach to date is that the CANAPE classes are determined outside of Biodiverse using systems like a GIS, R code or a spreadsheet. So while the process is conceptually simple, the actual implementation can all get a bit complex. Many users are not entirely sure which indices to pass through their functions, or even which lists to extract them from. </p><p>As of Version 4 Biodiverse now calculates it for you. This occurs automatically whenever an analysis has included the <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#phylogenetic-endemism" target="_blank">Phylogenetic Endemism</a> and <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#relative-phylogenetic-endemism-type-2" target="_blank">Relative Phylogenetic Endemism type 2</a> calculations. (If you want it sooner than version 4 then it is in the development release 3.99_005, which was current at the time of writing. 
<a href="https://github.com/shawnlaffan/biodiverse/wiki/Downloads" target="_blank">See the downloads page for links</a>).</p><p><br /></p><p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjserlI2QgablidfgtITTSepMMwRMZmizs9Ad8_3UtRaIF-ABtAyzBUiuNLo7j5Imkeq70lg-h4_USpAL02pzSvqqCAoidEHvfM9sxBf0gawDc004x5RtAfmKOzydu9Er83V8E92_pSajtRWdVOSjtMJLn0LOaNLfDJXun_Led7h9NH9FlY7U_qQ1oOSw" style="margin-left: auto; margin-right: auto;"><img data-original-height="1030" data-original-width="1920" height="344" src="https://blogger.googleusercontent.com/img/a/AVvXsEjserlI2QgablidfgtITTSepMMwRMZmizs9Ad8_3UtRaIF-ABtAyzBUiuNLo7j5Imkeq70lg-h4_USpAL02pzSvqqCAoidEHvfM9sxBf0gawDc004x5RtAfmKOzydu9Er83V8E92_pSajtRWdVOSjtMJLn0LOaNLfDJXun_Led7h9NH9FlY7U_qQ1oOSw=w640-h344" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Biodiverse now calculates the CANAPE scores when the requisite indices have been calculated, and a randomisation has been run. Like many of the posts on this blog, this example uses the Acacia data set from <a href="https://doi.org/10.1038/ncomms5473" target="_blank">Mishler et al. (2014)</a>.</td></tr></tbody></table><br /></p><p></p><h3 style="text-align: left;">How does Biodiverse store the results? </h3><p></p><p>The results are stored in a new list where the name is the randomisation output used followed by "<span style="font-family: courier;">>>CANAPE>></span>". So for a randomisation called "<span style="font-family: courier;">rand</span>" you would see "<span style="font-family: courier;">rand>>CANAPE>></span>". 
The use of angle brackets might look a bit strange at first but makes the naming consistent with the other randomisation lists and simplifies the underlying code.</p><p>The CANAPE classes are stored in an index called CANAPE_CODE, with a numeric code indicating which of the categories a cell falls in. Currently this code is 0 for not significant, 1 for neo-endemism, 2 for palaeo-endemism and 3 for mixed endemism.</p><p>Biodiverse also provides individual indices for neo, palaeo and mixed in the event a user only wants to see which cells are in a specific class. For example one might want to run a cluster analysis using only neo-endemism cells following the process described <a href="https://biodiverse-analysis-software.blogspot.com/2016/04/more-canape-how-to-restrict-your.html" target="_blank">here</a>. </p><p> </p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhVung7uLdSHJ-_smi3ZmPirC8VQ5cvu_qUyr0PBPuSmOwKP9pzj7JAqNbpyFjdOMCvSVOlHKM3g_65CEdQdZd1qM6Bti8Guda44hchqmQqgpB5raMPDV8HQ2CTAv_pGsWL_lueRUu_Ii2qzLedqppbxTXWKOqfLYqEcgUGP6ibq1EEOFbYHHvBHS60ww" style="margin-left: auto; margin-right: auto;"><img data-original-height="1030" data-original-width="1920" height="344" src="https://blogger.googleusercontent.com/img/a/AVvXsEhVung7uLdSHJ-_smi3ZmPirC8VQ5cvu_qUyr0PBPuSmOwKP9pzj7JAqNbpyFjdOMCvSVOlHKM3g_65CEdQdZd1qM6Bti8Guda44hchqmQqgpB5raMPDV8HQ2CTAv_pGsWL_lueRUu_Ii2qzLedqppbxTXWKOqfLYqEcgUGP6ibq1EEOFbYHHvBHS60ww=w640-h344" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The same data as above but highlighting Palaeo-endemism cells in red. All other cells containing data are in blue. 
</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><p style="text-align: left;"><br /></p></div><p></p><h3 style="text-align: left;">Visualisation</h3><p>A big advantage of generating CANAPE results within Biodiverse is that users can now explore the results using the functionality Biodiverse provides. As an example, the next screen shot shows an exploration of the contribution of each clade on the tree in relation to the analysis groups (cells) (see more details about that process <a href="https://biodiverse-analysis-software.blogspot.com/2017/09/visualise-spatial-analysis-results-on.html" target="_blank">here</a> and <a href="http://biodiverse-analysis-software.blogspot.com.au/2016/01/more-on-tree-visualisations-in.html" target="_blank">here</a>). </p><p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjcji7MqHb8zgFRzs-czzLreSi_z7r7rSngR_3tRUiNtGMTSNsbTO3mOY6cQSm2ngk957NvZPr3c2ieCHMdagC_-bSZB736efYgJUPCmTGY5iFbYXtxfqAiBoaiJXv-NasYHr1jw2KOz3ImOoGKUCufpUQfWEMGcA3ix8RjyBLuEXR23lpucsipIQOzDA" style="margin-left: auto; margin-right: auto;"><img data-original-height="504" data-original-width="940" height="344" src="https://blogger.googleusercontent.com/img/a/AVvXsEjcji7MqHb8zgFRzs-czzLreSi_z7r7rSngR_3tRUiNtGMTSNsbTO3mOY6cQSm2ngk957NvZPr3c2ieCHMdagC_-bSZB736efYgJUPCmTGY5iFbYXtxfqAiBoaiJXv-NasYHr1jw2KOz3ImOoGKUCufpUQfWEMGcA3ix8RjyBLuEXR23lpucsipIQOzDA=w640-h344" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Each tree branch is coloured by the relative contribution of the clade subtending it to the PE score in the cell being hovered over (black dot in south-western WA). This allows an understanding of which clade is driving the PE scores, and thus CANAPE, in a cell. 
The visualisation process is explained in more detail <a href="https://biodiverse-analysis-software.blogspot.com/2017/09/visualise-spatial-analysis-results-on.html" target="_blank">here</a>. </td></tr></tbody></table></p><h4 style="text-align: left;"><br /></h4><p></p><div class="separator" style="clear: both; text-align: center;"><h3 style="text-align: left;">Displaying the results in other systems</h3><p style="text-align: left;">If you then want to use the plots as part of a map then they can be exported to an RGB Geotiff. <a href="https://biodiverse-analysis-software.blogspot.com/2019/05/reproduce-spatial-plots-with-same.html" target="_blank">Details of how to do this are in another post</a> but the next two screenshots show the start and end. </p><p><br style="text-align: left;" /></p></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEg2HuRdCZY4exQiiNhrSb_XcLnFWfA9q0yfcSOdxrpZKKzXS8JK4DrNiMYpeivt_ksTWP5triJQ7hjO_A7bso6gFWSg2IBBUyWjTDtGtxTAGKpQlF3plfhg7ChmJNMXYd-0hOBmpWQo8sViMMw06C3zLR0FjAPWCXKyWydAvLmijCkTgaqrkLVYLhcSkw" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="1030" data-original-width="1920" height="344" src="https://blogger.googleusercontent.com/img/a/AVvXsEg2HuRdCZY4exQiiNhrSb_XcLnFWfA9q0yfcSOdxrpZKKzXS8JK4DrNiMYpeivt_ksTWP5triJQ7hjO_A7bso6gFWSg2IBBUyWjTDtGtxTAGKpQlF3plfhg7ChmJNMXYd-0hOBmpWQo8sViMMw06C3zLR0FjAPWCXKyWydAvLmijCkTgaqrkLVYLhcSkw=w640-h344" width="640" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/a/AVvXsEiwqVh0Y2QJQoOdPy9zkt2cYCsn5MNaDZDlb8W29194s_AIrrGI50IcokWf7KfqkRWkkenz9OtGBObIAN_mDz2Y4yeNzcejK67ORCCrf3T3_GhkvPjW2cRtF1MHezI8wJofjfdW0kdh4dvPkczuSKt9L-LdFQJHibPT2iiAyNcYXIx2HYWQDo9U96THIw" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="608" data-original-width="908" height="428" src="https://blogger.googleusercontent.com/img/a/AVvXsEiwqVh0Y2QJQoOdPy9zkt2cYCsn5MNaDZDlb8W29194s_AIrrGI50IcokWf7KfqkRWkkenz9OtGBObIAN_mDz2Y4yeNzcejK67ORCCrf3T3_GhkvPjW2cRtF1MHezI8wJofjfdW0kdh4dvPkczuSKt9L-LdFQJHibPT2iiAyNcYXIx2HYWQDo9U96THIw=w640-h428" width="640" /></a></div><br /><br /></div><h3 style="text-align: left;">What about a different colour scheme?</h3><div><br /></div>The colour scheme used is from Mishler et al. (2014) where neo is red (new is hot), palaeo is blue (old is cold) and purple is between blue and red on a colour wheel. <div><br /></div><div>If you prefer a different colour scheme then you can export the data as you normally would, for example as CSV files or as non-RGB geotiffs, and recreate the plot to your own tastes. </div><div><br /></div><div>Changing the colours within Biodiverse would be very useful and contributions are always welcome.<br /><br /><p></p><h3 style="text-align: left;">What about the Super class? </h3><div><br /></div><div>The system does not currently generate the Super class. It can be added if there is demand. </div><div><br /></div><h4 style="text-align: left;">Do I have to run a new randomisation analysis to see the CANAPE list? </h4><div><br /></div><div>The CANAPE lists are generated at the end of any sequence of randomisations. If you already have a randomisation analysis then they can be created by running one additional iteration. 
</div><div><br /></div><div>If you are concerned that your analysis is already at 999 iterations then all you lose is a bit of numeric neatness as there are now 1001 realisations in total instead of 1000 (one original plus all the random ones). This is unlikely to make any meaningful difference once that many iterations have been run.</div><div><br /></div><div>--------</div><div><br /></div><div><p>Shawn Laffan</p><p>25-Oct-2022</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p><div><br /></div></div></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-30037557701784759922022-07-10T21:06:00.000+10:002022-07-10T21:06:33.522+10:00Biodiverse now calculates indices for the variation in phylogenetic distinctness<p>Biodiverse has included calculations of indices from the phylocom system for several versions, specifically the Mean Phylogenetic Distance (MPD) and Mean Nearest Taxon Distance (MNTD). The MPD is the average of the pair-wise distances between tree tips in a sample, where the distances pass through all the shared ancestors below the most recent common ancestor. 
The MNTD is the average distance for each tip to its nearest tip in the sample.</p><p>There are many ways of slicing and dicing a sample, and one of the development principles of Biodiverse is to provide more details rather than less. Consequently there are also indices for the pair-wise root mean squared distance (RMSD), minimum and maximum distances between a sample of tips on a tree. </p><p>The min and max are simply the shortest and longest distances in the pairwise sample, so the distances between the most and least related pairs. The RMSD is the square root of the mean squared distance and is a measure of the variability in a sample. It is analogous to a standard deviation but where the expected value (the mean) is zero, and follows the same formulation as the Root Mean Squared Error except a value of zero in RMSE means no error whereas in RMSD it means a zero distance between tips on the tree. </p><p>However, the RMSD is not the variance and sometimes one is looking to see how a set of pair-wise distances is distributed around the mean. This is where the Variance becomes useful, as first described by <a href="https://dx.doi.org/10.3354/meps216265" target="_blank">Warwick and Clarke (2001)</a>.</p><p>Biodiverse version 4 includes indices for the variance of the pairwise distances. The index names are subject to change before then but for now follow the pattern PMPD1_VARIANCE, PMPD2_VARIANCE and PMPD3_VARIANCE, where the 1, 2 and 3 indicate unweighted (each tip counts equally), locally range weighted (each tip is weighted by the number of groups it occurs in within the neighbourhood) and locally abundance weighted (using the number of samples of each tip in the neighbourhood). These are calculated by default when the relevant MPD and MNTD indices are requested. 
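</p><p>To make the relationship between these summaries concrete, here is a small Python sketch for the unweighted case only. It is an illustration rather than the Biodiverse implementation: the function name, the toy distances and the choice to divide by the number of pairs (a population variance) are all assumptions.</p>

```python
from itertools import combinations
from math import sqrt


def pairwise_summary(tips, dist):
    """Summarise the tip-to-tip path distances for an unweighted sample.

    `dist(a, b)` is assumed to return the path distance between two tips.
    """
    pairs = [dist(a, b) for a, b in combinations(tips, 2)]
    if not pairs:
        return None  # a single tip has no path to another tip, so nothing is defined
    n = len(pairs)
    mean = sum(pairs) / n
    return {
        "mean": mean,                                      # MPD
        "rmsd": sqrt(sum(d * d for d in pairs) / n),       # spread around zero
        "variance": sum((d - mean) ** 2 for d in pairs) / n,  # spread around the mean
        "min": min(pairs),                                 # most closely related pair
        "max": max(pairs),                                 # least closely related pair
    }


# Toy distances (made up): A and B are sisters, C is more distant.
TOY_DISTANCES = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
}


def toy_dist(a, b):
    return TOY_DISTANCES[frozenset((a, b))]


if __name__ == "__main__":
    print(pairwise_summary(["A", "B", "C"], toy_dist))
```

<p>Under these assumptions the squared RMSD equals the variance plus the squared mean, which is one way to see why the RMSD and the variance answer different questions. A sample of two tips has a single pair and so a variance of zero, while a sample of one tip has no pairs at all. 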
</p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_MPoWxseNEqxFkfPe8pp9k_66PrrX-Nj8_wvhOWxFebmPDvmYdIaXREQTmgPhZiASmSl0zxAdd-2mCctAa3OsTqbKtZwip1LPo5YjfSa5630cTppTAEP3YjyUV17JpN4P-hCQ3pSyRkrLWfwgQ_FwQU2Mw7ugRvvXUE7ssibxJ2vAwxisaXqxvVm8cg/s1920/image1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1920" height="344" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_MPoWxseNEqxFkfPe8pp9k_66PrrX-Nj8_wvhOWxFebmPDvmYdIaXREQTmgPhZiASmSl0zxAdd-2mCctAa3OsTqbKtZwip1LPo5YjfSa5630cTppTAEP3YjyUV17JpN4P-hCQ3pSyRkrLWfwgQ_FwQU2Mw7ugRvvXUE7ssibxJ2vAwxisaXqxvVm8cg/w640-h344/image1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The variance indices are calculated with the other MPD and MNTD indices. 
</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-lYdsZmToeZ4ajCHSai7cvyJ62Xlm53m8vt39IBSBkboIhbc5xUrs6GY3EIMSe_mVQmN-xQ28p_p19lkNTBrsW0Hy3JCk5EUghnuE_4h3oQj60pr53BD744B-5KnMJTTc6fiQnXxdy7oWydkxE8lMMSO7vR-mSNm2qq3VyUgzFlnfm18w26eUr7_Gsw/s1920/image2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1920" height="344" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-lYdsZmToeZ4ajCHSai7cvyJ62Xlm53m8vt39IBSBkboIhbc5xUrs6GY3EIMSe_mVQmN-xQ28p_p19lkNTBrsW0Hy3JCk5EUghnuE_4h3oQj60pr53BD744B-5KnMJTTc6fiQnXxdy7oWydkxE8lMMSO7vR-mSNm2qq3VyUgzFlnfm18w26eUr7_Gsw/w640-h344/image2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Plotting is the same as for any index. Some cells are blank because values are undefined when the sample contains only one tip, and therefore no path between tips. Zero variances are where there are only two tips, and thus no variation. 
</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLciNWAg8wFz85mJKCq0YLUHg7sy4j10H1DtMf5DKXaNT5JW_6SpETUxn7l3grUqlDRKPSxUR5_NPC8Gd5C-jpMmeOkAELj6vruCEf2p3_SZi9u7_Q_b2gAam8dIhgCgsOefA1TTGch7d_s_v4wPusrr0h1dy4zKWVwW7HQhTd3p5ddxl-EpZYfVqI5Q/s1920/image3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1920" height="344" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLciNWAg8wFz85mJKCq0YLUHg7sy4j10H1DtMf5DKXaNT5JW_6SpETUxn7l3grUqlDRKPSxUR5_NPC8Gd5C-jpMmeOkAELj6vruCEf2p3_SZi9u7_Q_b2gAam8dIhgCgsOefA1TTGch7d_s_v4wPusrr0h1dy4zKWVwW7HQhTd3p5ddxl-EpZYfVqI5Q/w640-h344/image3.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">This is just a plot of the mean for comparison. </td></tr></tbody></table><br /><p><br /></p><h3 style="text-align: left;">But are the values significant?</h3><p>A common approach to testing significance of the MPD and MNTD indices in the unweighted case is to use a resampling approach. For each sample this generates a distribution of possible values under random resampling of the same number of tips. More details are given in <a href="https://biodiverse-analysis-software.blogspot.com/2021/09/faster-calculation-of-phylocom-indices.html" target="_blank">another blog post</a>. </p><p>The unweighted pairwise variance is also assessed in this way, with the index name using NET_VPD. As with NRI and NTI, this is a z-score so values more extreme than +/-1.96 can be considered significantly higher or lower than expected. 
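</p><p>The resampling idea can be sketched in a few lines of Python. This is a generic z-score from random resamples, not the Biodiverse code: the function names, the default number of resamples, the use of the population standard deviation and the toy "tips on a line" distances are all assumptions for illustration. It needs at least two tips in the sample.</p>

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def pairwise_variance(tips, dist):
    """Variance of the tip-to-tip distances (needs at least two tips)."""
    pairs = [dist(a, b) for a, b in combinations(tips, 2)]
    m = mean(pairs)
    return sum((d - m) ** 2 for d in pairs) / len(pairs)


def resampled_z_score(sample, pool, dist, n_resamples=999, seed=None):
    """Compare the observed variance with random draws of the same number of tips."""
    rng = random.Random(seed)
    observed = pairwise_variance(sample, dist)
    null = [
        pairwise_variance(rng.sample(pool, len(sample)), dist)
        for _ in range(n_resamples)
    ]
    sd = pstdev(null)
    if sd == 0:
        return None  # no spread in the resamples, so the z-score is undefined
    return (observed - mean(null)) / sd


# Toy example: twenty tips placed along a line, distance is the gap between them.
def line_dist(a, b):
    return abs(a - b)


if __name__ == "__main__":
    pool = list(range(20))
    print(resampled_z_score([0, 1, 19], pool, line_dist, seed=1))
```

<p>Fixing the seed makes the sequence of resamples repeatable, which is what allows one set of draws to be shared across several indices. 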
</p><p>The resampling approach uses the same code as for NRI and NTI, so the same sequence of resamples can be used across NRI, NTI and NET_VPD. In Biodiverse version 4, however, this applies only to NTI for non-ultrametric trees, because an exact calculation is used for NRI with any tree and for NTI with ultrametric trees. This exact calculation avoids resampling and is much faster to run. More details and references are in the <a href="https://biodiverse-analysis-software.blogspot.com/2021/09/faster-calculation-of-phylocom-indices.html" target="_blank">same blog post referred to above</a>.</p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyglscobn_yhXcf2pNG2CgVuYcs9IMmTFIHaVVaomJXHP1PdIgMuYgOJmiSyuvMEny2bssBacs_dCIyb34jrLryuHzOf_t1koO83GYXCuM3hdL2pCSwbczrz-dTn806hlnWoYnpaUZ8fikrz_JbZm3qIiq28ZbE_INbbd8h0-_yNQoLHMlUV0E4rqJSA/s1920/image4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1920" height="344" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyglscobn_yhXcf2pNG2CgVuYcs9IMmTFIHaVVaomJXHP1PdIgMuYgOJmiSyuvMEny2bssBacs_dCIyb34jrLryuHzOf_t1koO83GYXCuM3hdL2pCSwbczrz-dTn806hlnWoYnpaUZ8fikrz_JbZm3qIiq28ZbE_INbbd8h0-_yNQoLHMlUV0E4rqJSA/w640-h344/image4.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The NET_VPD indices are also under the PhyloCom set. Users can calculate the NET_VPD as well as the expected values used in its calculation. 
</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFhbovHs93e5jkifY2S3gjq4HgVLTTKJRt14NRElGuxJus0XI3Z551vyYnlPeshrvSQtjuSSjHrBHS-TB4t2QdEqDcHLPFjB152oEoW6MqrpYRc4Ey9cAkeCQmzCQsQK-HiYdSFRtPdZBteioWm2-y5pEkYMGBUgz6fWWwzZmJEaX7peFt4EHT8IdaXQ/s1920/image5.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1920" height="344" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFhbovHs93e5jkifY2S3gjq4HgVLTTKJRt14NRElGuxJus0XI3Z551vyYnlPeshrvSQtjuSSjHrBHS-TB4t2QdEqDcHLPFjB152oEoW6MqrpYRc4Ey9cAkeCQmzCQsQK-HiYdSFRtPdZBteioWm2-y5pEkYMGBUgz6fWWwzZmJEaX7peFt4EHT8IdaXQ/w640-h344/image5.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Values are z-scores. 
At least three tips are needed to calculate the z-score as standard deviations are always zero for two tips and thus the z-score is undefined.<br /></td></tr></tbody></table><br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLoc4M-S2D6h89HuHMOt5-v_etM11vVto4Yp68mRpKVDvWTR8RaiFxBZilCoKqKYeiKvRbXb2bv7tfG8-ruYB0VC2iLtx0jCtcLyXSsCNYL1JsZhJ3StvV9zdGhRHvt3MMgUqRQdCcPG3XIT4juAN6g88-PBFYfEx6t2dkSYIyvwlrqStAeHccyxQLAQ/s657/image6.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="657" data-original-width="488" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLoc4M-S2D6h89HuHMOt5-v_etM11vVto4Yp68mRpKVDvWTR8RaiFxBZilCoKqKYeiKvRbXb2bv7tfG8-ruYB0VC2iLtx0jCtcLyXSsCNYL1JsZhJ3StvV9zdGhRHvt3MMgUqRQdCcPG3XIT4juAN6g88-PBFYfEx6t2dkSYIyvwlrqStAeHccyxQLAQ/w298-h400/image6.png" width="298" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Control clicking on a cell allows users to see the values for all indices that were calculated (within each output list, where SPATIAL_RESULTS is where most go). 
</td></tr></tbody></table><p><br /></p><p><br /></p><p>Shawn Laffan</p><p>10-Jul-2022</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> or start a discussion at <a href="https://github.com/shawnlaffan/biodiverse/discussions">https://github.com/shawnlaffan/biodiverse/discussions</a> </p><div><br /></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-47630401701416613392022-05-02T17:53:00.000+10:002022-05-02T17:53:23.417+10:00Use clusters in spatial conditions<p>Spatial conditions are a core part of Biodiverse</p><p>Most people seem to focus on using single cells for their analysis and trying to find the ideal cell size. This is missing much of the benefit of spatial analyses. You are not constrained to using single cells in isolation. </p><p>You can analyse regions around each focal location (processing group) using geometric shapes like circles. Varying the size of the window gives an understanding of the spatial scale of the patterns (the operational scale). However, there is no need to be geometric - you can use arbitrarily complex spatial conditions based on polygon features, proximity and/or matching text. See for example <a href="https://doi.org/10.1046/j.1365-2699.2003.00875.x" target="_blank">Laffan and Crisp (2003)</a> and <a href="https://doi.org/10.1016/j.scitotenv.2015.04.113" target="_blank">Laity et al. (2015)</a>. 
</p><p>You can also use cluster (and region grower) analyses to define your spatial windows. These allow you to let the data define the regions, with the calculations then applied giving you more understanding of the groupings that have been identified. Care needs to be taken with interpretation due to the risk of circularity, but that's not unusual. And sometimes you just want to understand something about the assemblage that falls under a node (branch). You might also be interested in <a href="https://biodiverse-analysis-software.blogspot.com/2022/05/importing-group-properties-directly.html" target="_blank">the environmental properties associated with a cluster</a>.</p><p>One issue with the cluster approach is that it can be difficult to use the branches in a spatial condition for a different analysis. Consider the case where one wants to <a href="https://biodiverse-analysis-software.blogspot.com/2020/11/spatially-partition-your-randomisations.html" target="_blank">spatially partition a randomisation </a>so labels are kept within their associated clusters (for a given cluster cutoff). You could <a href="https://biodiverse-analysis-software.blogspot.com/2019/08/export-cluster-analyses-to-shapefiles.html" target="_blank">export the clusters to shapefile format</a>, extract the relevant features to a new shapefile, and then <a href="https://github.com/shawnlaffan/biodiverse/wiki/SpatialConditions#sp_points_in_same_poly_shape" target="_blank">use that in a new spatial condition</a>. But that's a lot of work and not easy for people less familiar with geoprocessing and GIS. </p><p>From version 4 you can access the set of groups under a cluster analysis and use that to define spatial conditions (actually it is in the 3.99_003 development version). This can use any of the current cutting methods, so you can slice by distance from the tips, depth, or number of clusters from the root using the <span style="font-family: courier;">sp_points_in_same_cluster</span> condition. 
You can also select individual branches (nodes) by name (<span style="font-family: courier;">sp_point_in_cluster</span>). </p><p>Some snippets are below that can be copied into your spatial conditions windows. No screenshots this time, but I can add a new post if that is needed. </p><p>Note that the cluster analysis being referred to must be in the same basedata. </p><p><br /></p>## <span style="font-family: monospace;">sp_points_in_same_cluster examples<br />
<br />
</span><span style="font-family: monospace;"># Try to use the highest four
clusters from the root.<br />
# Note that the next highest number will be used<br />
# if four is not possible, e.g. there might be five<br />
# siblings below the root. Fewer will be returned<br />
# if the tree has insufficient tips.<br />
sp_points_in_same_cluster (<br />
output => "some_cluster_output",<br />
num_clusters => 4,<br />
)<br />
<br />
# Cut the tree at a distance of 0.25 from the tips<br />
sp_points_in_same_cluster (<br />
output => "some_cluster_output",<br />
target_distance => 0.25,<br />
)<br />
<br />
# Cut the tree at a depth of 3 from the root.<br />
# The root is depth 1.<br />
sp_points_in_same_cluster (<br />
output => "some_cluster_output",<br />
target_distance => 3,<br />
group_by_depth => 1,<br />
)<br />
<br />
# Select four clusters below a specified node <br />
sp_points_in_same_cluster (<br />
output => "some_cluster_output",<br />
num_clusters => 4,<br />
from_node => '118___', # use the node's name<br />
)<br />
<br />
# target_distance is ignored if num_clusters is set</span><div><span style="font-family: monospace;"># so this is the same as the first example<br />sp_points_in_same_cluster (<br />
output => "some_cluster_output",<br />
num_clusters => 4,<br />
target_distance => 0.25,<br />
)<br />
</span><br />
<br />
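<p>The note about the "next highest number" in the first example can be sketched in a few lines of Python. This is only an illustration of one way such a cut could behave, not the Biodiverse implementation. The tree layout, node names, heights and the rule of always splitting the highest remaining node are assumptions made for the example.</p>

```python
# Hypothetical toy tree: each node has a height (distance above the tips)
# and a list of children. Tips have no children.
tree = {
    "root": {"height": 1.0, "children": ["n1", "n2"]},
    "n1": {"height": 0.6, "children": ["g1", "g2"]},
    "n2": {"height": 0.3, "children": ["g3", "g4", "g5"]},
    "g1": {"height": 0.0, "children": []},
    "g2": {"height": 0.0, "children": []},
    "g3": {"height": 0.0, "children": []},
    "g4": {"height": 0.0, "children": []},
    "g5": {"height": 0.0, "children": []},
}


def cut_tree(tree, root, num_clusters):
    """Split nodes from the root down until there are enough clusters."""
    clusters = [root]
    while num_clusters > len(clusters):
        splittable = [n for n in clusters if tree[n]["children"]]
        if not splittable:
            break  # only tips remain, so fewer clusters are returned
        # Replace the highest remaining node with its children.
        node = max(splittable, key=lambda n: tree[n]["height"])
        clusters.remove(node)
        clusters.extend(tree[node]["children"])
    return clusters


two = cut_tree(tree, "root", 2)
four = cut_tree(tree, "root", 4)
many = cut_tree(tree, "root", 10)
```

<p>With this toy tree a request for four clusters returns five, because splitting the three-way node jumps from three clusters to five. A request for ten also returns five, as there are only five tips.</p>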
## <span style="font-family: monospace;">sp_point_in_cluster examples<br />
</span><br />
<span style="font-family: monospace;"># This will select any element that is a
terminal in the cluster output<br />
# It is useful when the cluster analysis was run under<br />
# a definition query to reduce the number of elements clustered,</span></div><div><span style="font-family: monospace;"># and you want the same set of elements.<br />
sp_point_in_cluster (<br />
output => "some_cluster_output",<br />
)<br />
<br />
# Now specify a cluster within the output<br />
sp_point_in_cluster (<br />
output => "some_cluster_output",<br />
from_node => '118___', # use the node's name<br />
)<br />
<br />
# Specify an element to check instead of the current<br />
# processing element.<br />
sp_point_in_cluster (<br />
output => "some_cluster_output",<br />
from_node => '118___', # use the node's name<br />
element => '123:456', # specify an element to check<br />
)<br /></span><p> </p><p><br /></p><p>Shawn Laffan</p><p>02-May-2022</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> </p><div><br /></div><p> </p></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-24978759902834132132022-05-02T13:29:00.002+10:002022-05-02T14:37:02.413+10:00Importing group properties directly from rasters <h2 style="text-align: left;">What environmental conditions relate to my biodiversity patterns? </h2><p>Often one wants to understand which environmental conditions are associated with the taxonomic, phylogenetic and/or trait data. Examples include edaphic and climatic variables, and publications doing so include <a href="https://doi.org/10.1111/j.1466-8238.2006.00250.x" target="_blank">Bickford and Laffan (2006)</a>, <a href="https://doi.org/10.1111/jbi.12153" target="_blank">González-Orozco et al. (2013)</a>, <a href="https://doi.org/10.1111/ddi.12129" target="_blank">González-Orozco et al. (2014a)</a>, <a href="https://doi.org/10.1371/journal.pone.0092558" target="_blank">González-Orozco et al. (2014b)</a>, <a href="http://journal.frontiersin.org/article/10.3389/fgene.2015.00132/" target="_blank">Nagalingum et al. (2015)</a> and <a href="https://doi.org/10.11646/zootaxa.4802.1.4" target="_blank">Bein et al. (2020)</a>.</p><p>Such data are typically obtained as rasters, with spatial resolutions often of the order of hundreds of metres. 
This is in contrast to the resolution typically used for Biodiverse analyses (tens to hundreds of kilometres).</p><p>Up until now this has been something of a complex process. The raster data need to be aggregated to the same resolution as the Biodiverse data, and aligned as part of that process. Some sort of summary statistic needs to be calculated for each cell, usually the mean. Then the data need to be converted to a CSV format with coordinates that exactly match the Basedata group labels so they can be attached as group properties using the import process. The latter can be done by importing the rasters as their own basedatas, running <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#numeric-label-statistics" target="_blank">numeric label statistics</a>, exporting the results to CSV format and then attaching from there. Still not simple, and not easy when there are tens of rasters to process. </p><h2 style="text-align: left;">Now it is much easier</h2><p>This process is greatly simplified in Biodiverse version 4, with early access via the 3.99_003 development release. (Access to releases is via the <a href="https://github.com/shawnlaffan/biodiverse/wiki/Downloads" target="_blank">downloads page</a>). </p><p>A set of rasters can be selected, imported and attached. Biodiverse takes care of all the spatial matching and runs the summary statistics. As a bonus, the imported data can also be attached to the project in the event the user wants to run other analyses on them.</p><p>Currently there is support for the mean, standard deviation, min, max etc. If there is demand for other statistics like the median or inter-quartile range then these can be added.</p><p>Any raster data supported by GDAL can be imported. Development has used geotiffs as they are the most common. The process could probably also be generalised to support other file formats like CSV and shapefile. It depends on demand and developer time. 
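</p><p>The matching and summarising step can be pictured with a small Python sketch. It is not the Biodiverse code. The function name, the toy values, the origin at zero, the colon-separated cell-centre labels and the use of the population standard deviation are all assumptions made for the example.</p>

```python
import math
import statistics
from collections import defaultdict


def aggregate_to_cells(points, cell_size, origin=(0.0, 0.0)):
    """Summarise fine raster cells within coarser square cells.

    points is an iterable of (x, y, value) tuples, where x and y are the
    centroid coordinates of a raster cell. Values of None are treated as
    nodata and skipped.
    """
    bins = defaultdict(list)
    for x, y, value in points:
        if value is None:
            continue
        col = math.floor((x - origin[0]) / cell_size)
        row = math.floor((y - origin[1]) / cell_size)
        bins[(col, row)].append(value)

    stats = {}
    for (col, row), values in bins.items():
        # Label each coarse cell by its centre, joined with a colon.
        cx = origin[0] + (col + 0.5) * cell_size
        cy = origin[1] + (row + 0.5) * cell_size
        label = format(cx, ".10g") + ":" + format(cy, ".10g")
        stats[label] = {
            "mean": statistics.mean(values),
            "sd": statistics.pstdev(values),
            "n": len(values),
        }
    return stats


# Made-up raster centroids (metres) and values.
raster_cells = [
    (10000, 10000, 20.0),
    (30000, 40000, 22.0),
    (60000, 10000, 15.0),
    (70000, 20000, None),
]
summary = aggregate_to_cells(raster_cells, cell_size=50000)
```

<p>Each coarse cell receives the raster cells whose centroids fall inside it, so a coarse raster can contribute very few values to some cells.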
</p><p>The key criteria for the raster data are that they must be in the <a href="https://github.com/shawnlaffan/biodiverse/wiki/FAQ#what-coordinate-system-are-my-data-in" target="_blank">same coordinate system as your basedata</a> and they must represent continuous data (i.e. not be numerical categories). The latter point is important because the group property analyses do not work with nominal/categorical values. If you need to summarise categorical data then use an indicator approach where each class is represented by its own raster, and that raster has values of 1 for where that class occurs, and zero elsewhere.</p><h2 style="text-align: left;">How it works</h2><p>Some screenshots are probably the best means of showing the process. </p><p>In these examples I import two data sets from <a href="https://www.worldclim.org/" target="_blank">WorldClim</a> at a 5 arc minute resolution, the Annual Mean Temperature and Mean Diurnal Range. These are just the first two <a href="https://www.worldclim.org/data/bioclim.html" target="_blank">of the Bioclim layers </a>provided by WorldClim. 
The data have been projected into a Lambert Conic Conformal coordinate system to match the basedata being used (the example data that come with Biodiverse) and have been cropped to the Australian extent.</p><p> </p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiM0_DJ5q7zveTtKUQPkwolmRZD2xv7RhgsTRpk3gscFDgLQkKtVY8h7JaAWet-vwDWTbvwBC9rYL_oWWeDJ8-cQb7ihwDimj36XK0ZBtOZ8EjqOWl_-lp71FtY1IdCPg6Vdp6SXvkwljqQpFtvbqEBRFzyFURyAtU7YQIMu2v5ZwjWArsQrGAhVfg_LA/s822/image1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="612" data-original-width="822" height="475" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiM0_DJ5q7zveTtKUQPkwolmRZD2xv7RhgsTRpk3gscFDgLQkKtVY8h7JaAWet-vwDWTbvwBC9rYL_oWWeDJ8-cQb7ihwDimj36XK0ZBtOZ8EjqOWl_-lp71FtY1IdCPg6Vdp6SXvkwljqQpFtvbqEBRFzyFURyAtU7YQIMu2v5ZwjWArsQrGAhVfg_LA/w640-h475/image1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Annual rainfall from WorldClim2 for Australia, using a Lambert Conic Conformal projection. 
Brown is low, blue is high.</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3VWO6vxAMyedcgRzS1fWNWtFoieHvZomBvy-UO1XlYe2aOKfpKKqFOn_JNtLzss1crp6aZkIu_7AVfEg5jxum9HZ2Q1A4afd6Kgme4VKBTFgcGyvO7ejEdN_mb9sVePSTSadMeDvMJIk4z_KhDblghaH1BwFPQ9k2tOAtyo6lAxuTI-WvHfCai094Sw/s1252/image2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="790" data-original-width="1252" height="404" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3VWO6vxAMyedcgRzS1fWNWtFoieHvZomBvy-UO1XlYe2aOKfpKKqFOn_JNtLzss1crp6aZkIu_7AVfEg5jxum9HZ2Q1A4afd6Kgme4VKBTFgcGyvO7ejEdN_mb9sVePSTSadMeDvMJIk4z_KhDblghaH1BwFPQ9k2tOAtyo6lAxuTI-WvHfCai094Sw/w640-h404/image2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The data are going to be attached to the example data that come with Biodiverse.</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhsE7A7DZA22LPqdZegJrmA-urYyOLX8vxY63IVkSYVwlTaCMALWQCDxPxZ4nJbVKF1ZfbSYfyP7bohfWwNOYSGfBkagbAks7CLqXSLF8G_kI3Su8A6lIhIpOImjandxxsJCr4CcAi2YqT1ci-BasY8oSq-yomvh5ZtLV2JmvzxLqAmNt6TA2qC2ZShBg/s1252/image3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="790" data-original-width="1252" height="404" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhsE7A7DZA22LPqdZegJrmA-urYyOLX8vxY63IVkSYVwlTaCMALWQCDxPxZ4nJbVKF1ZfbSYfyP7bohfWwNOYSGfBkagbAks7CLqXSLF8G_kI3Su8A6lIhIpOImjandxxsJCr4CcAi2YqT1ci-BasY8oSq-yomvh5ZtLV2JmvzxLqAmNt6TA2qC2ZShBg/w640-h404/image3.png" 
width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The process is accessed via the Basedata menu.</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZF9SKZrkdADyWCO9FQmEYXnDnsUfe1PK4kkfr6NCVIDLsyrLsfXOH8dAtuzZky7tvi0UglX7j2uYyO_-a7z9K52__fQ-LRXCd4XqCxr5T_53RC6D9MvlycUfQNIqt_4NITKgPQyqNdQUHTHEsQIwlmml_wl55ERwyJRpkSa9biNNhaGbx4nDooqud9A/s902/image4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="715" data-original-width="902" height="509" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZF9SKZrkdADyWCO9FQmEYXnDnsUfe1PK4kkfr6NCVIDLsyrLsfXOH8dAtuzZky7tvi0UglX7j2uYyO_-a7z9K52__fQ-LRXCd4XqCxr5T_53RC6D9MvlycUfQNIqt_4NITKgPQyqNdQUHTHEsQIwlmml_wl55ERwyJRpkSa9biNNhaGbx4nDooqud9A/w640-h509/image4.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Rasters are selected from a folder at the same time as the options. In this case the mean and standard deviation stats will be attached as properties to the selected basedata, and the intermediate basedatas will be added to the project so they can be visualised and/or analysed further. 
</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvvKO9fWIOQ0rd7d5170j1Q5LMEhMpqSPIuHCzFpEtsXz11BuJVxCDcpM4WuWp2vF-ha74lnETlEd24_7m9GZXFwKqLOrGlxyaqCmgcYKzs3D2jr2fss6G83TPmFe1k8or2v1xAm_AXloeEK8k7dFiwTRB42pcz3Z8683PXRSb7fx6vpkaHQkEdtBjWg/s499/image5.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="240" data-original-width="499" height="193" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvvKO9fWIOQ0rd7d5170j1Q5LMEhMpqSPIuHCzFpEtsXz11BuJVxCDcpM4WuWp2vF-ha74lnETlEd24_7m9GZXFwKqLOrGlxyaqCmgcYKzs3D2jr2fss6G83TPmFe1k8or2v1xAm_AXloeEK8k7dFiwTRB42pcz3Z8683PXRSb7fx6vpkaHQkEdtBjWg/w400-h193/image5.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The process provides some general feedback when it completes (successfully or otherwise). 
</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5bnBBdmEv6xxb7RCRbmI_qJWSu8b4zWLC8U_hHgAv_BENcWKgn2eJOmi3_SlG15oah4JkVPBjEsdfBcotX2VY1PAzdJ4zYyIpsU4fdncnXodqS1T5Ziv_SsvywhSz2SkRNGg7beZKsloajcX1a-4AGP-b-srlQNTMo-4T589veyV7d180TzKYNH69yQ/s1252/image6.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="790" data-original-width="1252" height="404" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5bnBBdmEv6xxb7RCRbmI_qJWSu8b4zWLC8U_hHgAv_BENcWKgn2eJOmi3_SlG15oah4JkVPBjEsdfBcotX2VY1PAzdJ4zYyIpsU4fdncnXodqS1T5Ziv_SsvywhSz2SkRNGg7beZKsloajcX1a-4AGP-b-srlQNTMo-4T589veyV7d180TzKYNH69yQ/w640-h404/image6.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The outputs tab shows the intermediate basedatas have been added. Each contains a spatial analysis that was used to calculate the statistics. 
</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3VWO6vxAMyedcgRzS1fWNWtFoieHvZomBvy-UO1XlYe2aOKfpKKqFOn_JNtLzss1crp6aZkIu_7AVfEg5jxum9HZ2Q1A4afd6Kgme4VKBTFgcGyvO7ejEdN_mb9sVePSTSadMeDvMJIk4z_KhDblghaH1BwFPQ9k2tOAtyo6lAxuTI-WvHfCai094Sw/s1252/image2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="790" data-original-width="1252" height="404" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3VWO6vxAMyedcgRzS1fWNWtFoieHvZomBvy-UO1XlYe2aOKfpKKqFOn_JNtLzss1crp6aZkIu_7AVfEg5jxum9HZ2Q1A4afd6Kgme4VKBTFgcGyvO7ejEdN_mb9sVePSTSadMeDvMJIk4z_KhDblghaH1BwFPQ9k2tOAtyo6lAxuTI-WvHfCai094Sw/w640-h404/image2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The property data cannot be visualised directly (yet). To explore them without using an analysis you need to open the View Labels window for the basedata they were attached to and control click on a cell using your mouse. 
<br /><br /></td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivrktsEy7Qyin0uS5R3mdn72EcKy7Cwuh6HBD3f0ycG1kD1-JMHDW7x-orcUufbqP0-s8o-z9W1tt5o5hx2NVtrSCjPZchV_Hwa5d7QGyybgIJch0YbUkkXNCl3DNTKhtUAHC4lamzmgenrvlfTCh3XIIymcFtvU_iYKe4HKoHV4RIVm05q6kdiS0I5A/s522/image7.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="522" data-original-width="341" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivrktsEy7Qyin0uS5R3mdn72EcKy7Cwuh6HBD3f0ycG1kD1-JMHDW7x-orcUufbqP0-s8o-z9W1tt5o5hx2NVtrSCjPZchV_Hwa5d7QGyybgIJch0YbUkkXNCl3DNTKhtUAHC4lamzmgenrvlfTCh3XIIymcFtvU_iYKe4HKoHV4RIVm05q6kdiS0I5A/w261-h400/image7.png" width="261" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The popup window shows the properties for the cell that was clicked on (you will need to change the list being shown to be Properties).</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHAJEw_kB_K-8426dXccE1u0vKkZHZiPhBycmH2jzk0-hmwhVnJ40OLAvInbThXc_jYldSvNJ_9qgCPXZFQ-R0AN1as_BiYixyx78rBB0t0wf3PiZgQjXiL-RVGI6ZAyl5xR_Gd-zyZ6HNronrNGSf8T3W7fYQzA1Yks-USJsdb_xuTHi46-FAhRx4PA/s1252/image8.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1252" height="526" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHAJEw_kB_K-8426dXccE1u0vKkZHZiPhBycmH2jzk0-hmwhVnJ40OLAvInbThXc_jYldSvNJ_9qgCPXZFQ-R0AN1as_BiYixyx78rBB0t0wf3PiZgQjXiL-RVGI6ZAyl5xR_Gd-zyZ6HNronrNGSf8T3W7fYQzA1Yks-USJsdb_xuTHi46-FAhRx4PA/w640-h526/image8.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The group properties can be analysed in a spatial or cluster analysis. Look for the calculations starting with "Group properties" under the Element Properties set. In this case the analyses will follow those linked to at the very top and calculate summary stats and Gi* hotspot stats for each branch in a cluster tree. </td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUZ6PUlRDkvLolPdqp0IAAYn0kDw6j9Yc1ObSJM8TsYDDP-EbpobIpUMfxbkomiR2YT96K2s8-fpRHx5T1UllPP3XT_cG26VZZwn3__7DhgS5cGxHrUaHXqBYlwBQ14CcoNPZlIPlqixWpLWN9ENO_KgDeZLBUGjt0WHC0Yx0n4GpKTJT6M_LggVFEcg/s1252/image9.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1252" height="526" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUZ6PUlRDkvLolPdqp0IAAYn0kDw6j9Yc1ObSJM8TsYDDP-EbpobIpUMfxbkomiR2YT96K2s8-fpRHx5T1UllPP3XT_cG26VZZwn3__7DhgS5cGxHrUaHXqBYlwBQ14CcoNPZlIPlqixWpLWN9ENO_KgDeZLBUGjt0WHC0Yx0n4GpKTJT6M_LggVFEcg/w640-h526/image9.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">And here is a visualisation of the Gi* hotspot stat for branches cut at 0.4744 from the tips (you can slide the blue line to change this value). 
The interpretation depends on your significance threshold but Gi* scores are z-scores so, for a two-tailed test where values could be high or low, values above 1.96 are hotspots at alpha=0.05, while those below -1.96 are coldspots. </td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMII0jTi5gheqJxj0IQTFA_wFz-vthWOA1tjhNA0ZtYIE7Hr29neHn5yYmpvPI0le43h1zzTZW8TArB8SNT_2Ebc-oGjvO3aXq2P70sQoEiIqSVbVAPzSFRJ5Yk6ORiKzSuArFRSI0nt_vSYYzg5Z-9QX579RtCtMV4xg3TVux00cfLhu5C2HhXGbHoQ/s1252/image10.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1252" height="526" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMII0jTi5gheqJxj0IQTFA_wFz-vthWOA1tjhNA0ZtYIE7Hr29neHn5yYmpvPI0le43h1zzTZW8TArB8SNT_2Ebc-oGjvO3aXq2P70sQoEiIqSVbVAPzSFRJ5Yk6ORiKzSuArFRSI0nt_vSYYzg5Z-9QX579RtCtMV4xg3TVux00cfLhu5C2HhXGbHoQ/w640-h526/image10.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">And here are the same clusters but this time coloured by the mean stat across all groups in the sample. 
(The naming scheme results in lots of "means").</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXm7OAvOqYRe6pomvYcjR1deJJYcBpn16aDVdi0vTrkSIlKdCp8oyhxmOM4sDnaFf30CX2PCDpd89X8DewCAuv93GPYTIlw3J1ZSrY-wozQGWtRt5zE0iZogujJR2k3n6h1kqTOa02S4f7YM7iWbn6Xx3Ad3jS6iKe5D01qT3fbtJ5t7nzapnleAEomA/s1252/image11.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1252" height="526" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXm7OAvOqYRe6pomvYcjR1deJJYcBpn16aDVdi0vTrkSIlKdCp8oyhxmOM4sDnaFf30CX2PCDpd89X8DewCAuv93GPYTIlw3J1ZSrY-wozQGWtRt5zE0iZogujJR2k3n6h1kqTOa02S4f7YM7iWbn6Xx3Ad3jS6iKe5D01qT3fbtJ5t7nzapnleAEomA/w640-h526/image11.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">And here is an example of the imported raster data (diurnal range) that were used to generate the group properties. 
</td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDhY0ROwl9MEUOpEIwC6BxUu1rNb0V0v2EeQ-1RYtOTuXefh22Om8UDlCShN6QcroEpvDdnzPbFW9TU5XGohoyb0hrD83t9qPbVMjGKY9Ujsi00rl8rxueYhQ8thj9n8byVSz67b7IY5x5OQUZBZ3isZZwrV9pFDWh0AAt5HGeZp8UPyR_f5JDPfwy8g/s1252/image12.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1030" data-original-width="1252" height="329" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDhY0ROwl9MEUOpEIwC6BxUu1rNb0V0v2EeQ-1RYtOTuXefh22Om8UDlCShN6QcroEpvDdnzPbFW9TU5XGohoyb0hrD83t9qPbVMjGKY9Ujsi00rl8rxueYhQ8thj9n8byVSz67b7IY5x5OQUZBZ3isZZwrV9pFDWh0AAt5HGeZp8UPyR_f5JDPfwy8g/w400-h329/image12.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">This image demonstrates what can happen when coarse resolution data are used. The 5 arc minute resolution translates to approximately 18 km when projected. The cells in the basedata containing the species observations are 50 km. The system uses raster cell centroid coordinates to allocate their values to a basedata cell and there are clearly alignment offsets here. There are many sources of finer resolution data you can use. 
</td></tr></tbody></table><br /><p><br /></p><p><br /></p><p><br /></p><p>Shawn Laffan</p><p>02-May-2022</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> </p><div><br /></div><p> </p><div class="separator" style="clear: both; text-align: center;"><br /></div><br />Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-41229328439557829792022-03-12T11:24:00.001+11:002022-07-11T19:40:00.112+10:00Publications using Biodiverse in 20212022 is now in full swing, so here is a list of publications from 2021 that used Biodiverse. <br /><br /><br />If you want to see the full list (155 at the time of writing), then go to <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> <br /><br /><br />For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> <br /><br /><br />Shawn Laffan<br />12-Mar-2022<br /><br /> <br /><br />Anguiano-Constante, M.A., Dean, E., Starbuck, T., Rodríguez, A. and Munguía-Lino, G. (2021) Diversity, species richness distribution and centers of endemism of Lycianthes (Capsiceae, Solanaceae) in Mexico. <a href="https://doi.org/10.11646/phytotaxa.514.1.3">Phytotaxa, 514, 39-60</a>. <br /><br />Bharti, D.K., Edgecombe, G.D., Karanth, K.P. and Joshi, J. 
(2021) Spatial patterns of phylogenetic diversity and endemism in the Western Ghats, India: A case study using ancient predatory arthropods. <a href="https://doi.org/10.1002/ece3.8119">Ecology and Evolution, 11, 16499-16513</a>. <br /><br />Camacho, G.P., Loss, A.C., Fisher, B.L., Blaimer, B.B. (2021) Spatial phylogenomics of acrobat ants in Madagascar—Mountains function as cradles for recent diversity and endemism. <a href="https://doi.org/10.1111/jbi.14107">Journal of Biogeography, 48, 1706-1719</a>. <br /><br />Cheikh Albassatneh, M., Escudero, M., Monnet, A‐C., et al. (2021) Spatial patterns of genus‐level phylogenetic endemism in the tree flora of Mediterranean Europe. <a href="https://doi.org/10.1111/ddi.13241">Diversity and Distributions, 27, 913– 928</a>. <br /><br />Earl, C., Belitz, M.W., Laffan, S.W., Barve, V., Barve, N., Soltis, D.E., Allen, J.M., Soltis, P.S., Mishler, B.D., Kawahara, A.Y., & Guralnick, R. (2021) Spatial phylogenetics of butterflies in relation to environmental drivers and angiosperm diversity across North America. <a href="https://doi.org/10.1016/j.isci.2021.102239">iScience, 102239</a>.<br /><br />Flores-Tolentino M., Beltrán-Rodríguez L., Morales-Linares J., et al. (2021) Biogeographic regionalization by spatial and environmental components: Numerical proposal. <a href="https://doi.org/10.1371/journal.pone.0253152">PLoS ONE 16, e0253152</a>. <br /><br />Furtado, S.G. and Menini Neto, L. (2021) What is the role of topographic heterogeneity and climate on the distribution and conservation of vascular epiphytes in the Brazilian Atlantic Forest? <a href="https://doi.org/10.1007/s10531-021-02150-6">Biodiversity and Conservation, 30, 1415–1431</a>. <br /><br />Garcia-Rodriguez, A., Luna-Vega, I., Yáñez-Ordóñez, O., Ramírez-Martínez, J.C., Espinosa, D., and Contreras-Medina, R. (2021). Patrones de Distribución de las Abejas del Bosque Mesófilo de Montaña de la Sierra Madre Oriental, México. 
<a href="https://doi.org/10.3958/059.046.0425">Southwestern Entomologist, 46, 1021-1036</a>. <br /><br />González-Orozco, C.E. (2021) Biogeographical regionalisation of Colombia: a revised area taxonomy. <a href="https://doi.org/10.11646/phytotaxa.484.3.1">Phytotaxa, 484, 3</a>. <br /><br />González-Orozco, C.E. (2021) Regiones biogeográficas del género Cinchona L. (Rubiaceae- Cinchoneae). <a href="https://doi.org/10.47374/novcol.2021.v16.1987">Revista Novedades Colombianas, 16, 135-156</a>. <br /><br />González-Orozco, C. E., Sosa, C. C., Thornhill, A. H., and Laffan, S. W. (2021). Phylogenetic diversity and conservation of crop wild relatives in Colombia. <a href="https://doi.org/10.1111/eva.13295">Evolutionary Applications, 14, 2603-2617</a>. <br /><br />Gosper, C.R., Coates, D.J., Hopper, S.D., Byrne, M., Yates, C.J. (2021) The role of landscape history in the distribution and conservation of threatened flora in the Southwest Australian Floristic Region. <a href="https://doi.org/10.1093/biolinnean/blaa141">Biological Journal of the Linnean Society, 133, 394–410</a>. <br /><br />Hammer, T.A., Renton, M., Mucina, L. and Thiele, K. (2021) Arid Australia as a source of plant diversity: the origin and climatic evolution of Ptilotus (Amaranthaceae). <a href="https://doi.org/10.1071/SB21012">Australian Systematic Botany, 34, 570-586</a>. <br /><br />Hao, T., Elith, J., Guillera-Arroita, G., Lahoz-Monfort, J. J., & May, T. W. (2021). Enhancing repository fungal data for biogeographic analyses. <a href="https://doi.org/10.1016/j.funeco.2021.101097">Fungal Ecology, 53, 101097</a>. <br /><br />Kougioumoutzis, K., Kokkoris, I.P., Panitsa, M., Kallimanis, A., Strid, A., and Dimopoulos, P. (2021) Plant Endemism Centres and Biodiversity Hotspots in Greece. <a href="https://doi.org/10.3390/biology10020072">Biology, 10, 72</a>. <br /><br />Murali, G., Gumbs, R., Meiri, S. and Rull, U. 
(2021) Global determinants and conservation of evolutionary and geographic rarity in land vertebrates. <a href="https://doi.org/10.1126/sciadv.abe5582">Science Advances, 7, eabe5582</a>. <br /><br />Ortiz-Brunel, J.P., Munguía-Lino, G., Castro-Castro, A. and Rodríguez, A. (2021) Biogeographic analysis of the American genus Echeandia (Agavoideae: Asparagaceae). <a href="https://doi.org/10.22201/ib.20078706e.2021.92.373">Revista Mexicana de Biodiversidad 92, e923739</a>. <br /><br />Paz, A., Brown, J.L., Cordeiro, C.L.O., Aguirre‐Santoro, J., Assis, C., Amaro, R.C., Raposo do Amaral, F., Bochorny, T., Bacci, L.F., Caddah, M.K., d’Horta, F., Kaehler, M., Lyra, M., Grohmann, C.H., Reginato, M., Silva‐Brandão, K.L., Freitas, A.V.L., Goldenberg, R., Lohmann, L.G., Michelangeli, F.A., Miyaki, C., Rodrigues, M.T., Silva, T.S. and Carnaval, A.C. (2021) Environmental correlates of taxonomic and phylogenetic diversity in the Atlantic Forest. <a href="https://doi.org/10.1111/jbi.14083">Journal of Biogeography, 48, 1377-1391</a>. <br /><br />Pereira, L.C., Chautems, A. and Menini Neto, L. (2021) Biogeography and Conservation of Gesneriaceae in the Serra da Mantiqueira, Southeastern Region of Brazil. <a href="https://doi.org/10.1007/s40415-020-00671-y">Brazilian Journal of Botany, 44, 239–248</a>. <br /><br />Pinedo-Escatel, J.A., Aragón-Parada, J., Dietrich, C.H., Moya-Raygoza, G., Zahniser, J.N. and Portillo, L. (2021) Biogeographical evaluation and conservation assessment of arboreal leafhoppers in the Mexican Transition Zone biodiversity hotspot. <a href="https://doi.org/10.1111/ddi.13254">Diversity and Distributions, 27, 1051-1065</a>. <br /><br />Suissa, J.S., Sundue, M.A. and Testo, W.L. (2021), Mountains, climate and niche heterogeneity explain global patterns of fern diversity. <a href="https://doi.org/10.1111/jbi.14076">Journal of Biogeography, 48, 1296-1308</a>. <br /><br />Yang, X., Liu, B., Bussman, R.W., Guan, X., et al. 
(2021) Integrated plant diversity hotspots and long-term stable conservation strategies in the unique karst area of southern China under global climate change. <a href="https://doi.org/10.1016/j.foreco.2021.119540">Forest Ecology and Management, 498, 119540</a>. <br /><br />Xu, M.‐Z., Yang, L.‐H., Kong, H.‐H., Wen, F. and Kang, M. (2021) Congruent spatial patterns of species richness and phylogenetic diversity in karst flora: the case study of Primulina (Gesneriaceae). <a href="http://doi.org/10.1111/jse.12558">Journal of Systematics and Evolution, 59, 251-261</a>. <br /><br />Xue, T., Gadagkar, S.H., Albright, T.P., Yang, X., Li, J., Xia, C., Wu, J., and Yu, S. (2021) Prioritizing conservation of biodiversity in an alpine region: Distribution pattern and conservation status of seed plants in the Qinghai-Tibetan Plateau. <a href="https://doi.org/10.1016/j.gecco.2021.e01885">Global Ecology and Conservation, 32, e01885</a>. <br /><br />Zhang, Y., Chen, J. and Sun, H. (2021) Alpine speciation and morphological innovations: revelations from a species-rich genus in the Northern Hemisphere. <a href="https://doi.org/10.1093/aobpla/plab018">AoB PLANTS, 13, 3, plab018</a>. <br /><br />Zhang, Y., Qian, L., Spalink, D., Sun, L., Chen, J. and Sun, H. (2021) Spatial phylogenetics of two topographic extremes of the Hengduan Mountains in southwestern China and its implications for biodiversity conservation. <a href="https://doi.org/10.1016/j.pld.2020.09.001">Plant Diversity, 43, 181-191</a>. <br /><br />Zhu, Z-X, Harris, A.J., Nizamani, M.M., Thornhill, A.H., Scherson, R.A. and Wang, H-F. (2021) Spatial phylogenetics of the native woody plant species in Hainan, China. 
<a href="https://doi.org/10.1002/ece3.7180">Ecology and Evolution, 11, 2100-2109</a>.<div><br /></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-36602607078710510342022-03-12T11:20:00.003+11:002022-03-12T11:20:41.336+11:00Publications using Biodiverse in 2020<p style="text-align: left;">Here is a list of publications from 2020 that used Biodiverse. This is a long overdue post as 2020 is well past.</p><div style="margin-left: 18pt; text-align: left;"><o:p><br /></o:p></div>If you want to see the full list (155 at the time of writing), then go to <a href="https://www.blogger.com/blog/post/edit/7668982590835732065/3299963236512198630#">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> <br /><br /><br />For more details about Biodiverse, see <a href="https://www.blogger.com/blog/post/edit/7668982590835732065/3299963236512198630#">http://shawnlaffan.github.io/biodiverse/</a> <br /><br /><br />Shawn Laffan<br />12-Mar-2022<br /><br /><br />Azevedo, J.A.R., Guedes, T.B., Nogueira, C.d.C., Passos, P., Sawaya, R.J., Prudente, A.L.C., Barbo, F.E., Strüssmann, C., Franco, F.L., Arzamendia, V., Giraudo, A.R., Argôlo, A.J.S., Jansen, M., Zaher, H., Tonini, J.F.R., Faurby, S. & Antonelli, A. (2020) Museums and cradles of diversity are geographically coincident for narrowly distributed Neotropical snakes. <a href="https://doi.org/10.1111/ecog.04815">Ecography, 43, 328-339</a>. <br /><br />Barrera-Robles, P.H., Burgos-Hernández, M., Ruíz-Acevedo, A.D. and Castillo-Campos, G. (2020) The Linaceae family in Mexico: current status and perspectives. <a href="https://doi.org/10.17129/botsci.2550">Botanical Sciences, 98, 560-572</a>. <br /><br />Bein, B., Ebach, M.C., Laffan, S.W., Murphy, D.J. and Cassis, G. 
(2020) Quantifying vertebrate zoogeographical regions of Australia using geospatial turnover in the species composition of mammals, birds, reptiles and terrestrial amphibians. <a href="https://doi.org/10.11646/zootaxa.4802.1.4">Zootaxa, 4802, 61-81</a>. <br /><br />Brown, JL, Paz, A, Reginato, M, et al. (2020) Seeing the forest through many trees: Multi‐taxon patterns of phylogenetic diversity in the Atlantic Forest hotspot. <a href="https://doi.org/10.1111/ddi.13116">Diversity and Distributions, 26, 1160-1176</a>. <br /><br />Dagallier, L.-P.M.J., Janssens, S.B., Dauby, G., Blach-Overgaard, A., Mackinder, B.A., Droissart, V., Svenning, J.-C., Sosef, M.S.M., Stévart, T., Harris, D.J., Sonké, B., Wieringa, J.J., Hardy, O.J. and Couvreur, T.L.P. (2020) Cradles and museums of generic plant diversity across tropical Africa. <a href="https://doi.org/10.1111/nph.16293">New Phytologist, 225, 2196-2213</a>.<br /><br />Dalrymple, R.L., Kemp, D.J., Flores-Moreno, H., Laffan, S.W., White, T.E., Hemmings, F.A. & Moles, A.T. (2020) Macroecological patterns in flower colour are shaped by both biotic and abiotic factors. <a href="https://doi.org/10.1111/nph.16737">New Phytologist, 228, 1972-1985</a>.<br /><br />González-Orozco, C.E., Sánchez Galán, A.A., Ramos P.E. and Yockteng, R (2020) Exploring the diversity and distribution of crop wild relatives of cacao (Theobroma cacao L.) in Colombia. <a href="https://doi.org/10.1007/s10722-020-00960-1">Genetic Resources and Crop Evolution, 67, 2071–2085</a>. <br /><br />Huang, C., Ebach, M.C. and Ahyong, S. (2020) Bioregionalisation of the freshwater zoogeographical areas of mainland China. <a href="http://dx.doi.org/10.11646/zootaxa.4742.2.3">Zootaxa, 4742, 2</a>.<br /><br />Kougioumoutzis, K., Kokkoris, I.P., Panitsa, M., Trigas, P., Strid, A. and Dimopoulos, P. (2020) Spatial Phylogenetics, Biogeographical Patterns and Conservation Implications of the Endemic Flora of Crete (Aegean, Greece) under Climate Change Scenarios. 
<a href="https://doi.org/10.3390/biology9080199">Biology, 9, 199</a>.<br /><br />Mienna, I.M., Speed, J.D.M., Bendiksby, M., Thornhill, A.H., Mishler, B.D., Martin, M.D. (2020) Differential patterns of floristic phylogenetic diversity across a post‐glacial landscape. <a href="https://doi.org/10.1111/jbi.13789">Journal of Biogeography, 47, 915-926</a>.<br /><br />Mishler, B.D., Guralnick, R., Soltis, P.S., Smith, S.A., Soltis, D.E., Barve, N., Allen, J.M. and Laffan, S.W. (2020) Spatial Phylogenetics of the North American Flora. <a href="https://doi.org/10.1111/jse.12590">Journal of Systematics and Evolution, 58, 393-405</a>.<br /><br />Moles, A.T., Laffan, S.W., Keighery, M., Tindall, M.L. and Chen, S. (2020) A hairy situation: Plant species in warm, sunny places are more likely to have pubescent leaves. <a href="https://doi.org/10.1111/jbi.13870">Journal of Biogeography, 47, 1934-1944</a>.<br /><br />Moraes, A.M., Milward-de-Azevedo, M.A., Menini Neto, L. et al. (2020) Distribution patterns of Passiflora L. (Passifloraceae s.s.) in the Serra da Mantiqueira, Southeast Brazil. <a href="https://doi.org/10.1007/s40415-020-00665-w">Brazilian Journal of Botany, 43, 999–1012</a>. <br /><br />Paz, A., Reginato, M., Michelangeli, F.A., Goldenberg, R., Caddah, M.K., Aguirre-Santoro, J., Kaehler, M., Lohmann, L.G. & Carnaval, A. (2020) Predicting Patterns of Plant Diversity and Endemism in the Tropics Using Remote Sensing Data: A Study Case from the Brazilian Atlantic Forest. <a href="https://doi.org/10.1007/978-3-030-33157-3_11">Remote Sensing of Plant Biodiversity (eds J. Cavender-Bares, J.A. Gamon & P.A. Townsend), pp. 255-266. Springer, Cham</a>. <br /><br />Ruiz-Sanchez, E., Munguía-Lino, G., Vargas-Amado, G., Rodríguez, A. (2020) Diversity, endemism and conservation status of native Mexican woody bamboos (Poaceae: Bambusoideae: Bambuseae). 
<a href="https://doi.org/10.1093/botlinnean/boz062">Botanical Journal of the Linnean Society, 192, 281–295</a>.<br /><br />Sosa, V., Vásquez-Cruz, M. and Villarreal-Quintanilla, J.A. (2020) Influence of climate stability on endemism of the vascular plants of the Chihuahuan Desert. <a href="https://doi.org/10.1016/j.jaridenv.2020.104139">Journal of Arid Environments, 177, 104139</a>. <br /><br />Suissa, J.S. and Sundue, M.A. (2020) Diversity Patterns of Neotropical Ferns: Revisiting Tryon’s Centers of Richness and Endemism. <a href="https://doi.org/10.1640/0002-8444-110.4.211">American Fern Journal 110, 211–232</a>. <br /><br />Toro-Núñez, O. and Lira-Noriega, A. (2020) Discordant phylogenetic endemism patterns in a recently diversified Brassicaceae lineage from the Atacama Desert: When choices in phylogenetics and species distribution information matter. <a href="https://doi.org/10.1111/jbi.13846">Journal of Biogeography, 47, 1792-1804</a>.Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-46652961674262055312021-09-24T09:33:00.000+10:002021-09-24T09:33:11.411+10:00Faster calculation of PhyloCom indices: NRI, NTI, MPD and MNTD<p>This post covers a bit more of the internals than we usually do. Hopefully it is useful. </p><p>From version 4, Biodiverse will use faster methods to calculate the indices in the PhyloCom set, with ultrametric trees seeing the greatest speedup. These are the NRI, NTI, MPD and MNTD indices, which, spelled out, are the Net Relatedness Index, the Nearest Taxon Index, Mean Phylogenetic Distance and Mean Nearest Taxon Distance. These were originally implemented in the phylocom software (<a href="https://dx.doi.org/10.1093/bioinformatics/btn358" target="_blank">Webb et al. 2008</a>) and many readers will be familiar with the R package <a href="https://www.rdocumentation.org/packages/picante/versions/1.8.2" target="_blank">picante</a>. 
</p><p>The MPD index is the mean length of the paths along the tree between each pair of tips in a sample. The contribution of a branch is proportional to its length and the number of paths it is part of. </p><p>The MNTD index is the mean of the shortest path between each tip in a sample and the nearest tip that is also in the sample. For a sample of ten tips there are only ten paths, but in the naïve case one needs to evaluate all paths to determine which is the shortest. </p><p>One point to keep in mind is that branches in MPD and MNTD are counted one or more times, specifically once for each path they form part of. This is in contrast with PD where each branch counts only once. The number of paths also increases quadratically with the number of tips in the sample. For example, if there are ten tips then there will be 10*(10-1)/2=45 paths to connect all pairs, and if there are 10,000 tips then there are 49,995,000 paths. To my mind these make PD a better index, but that discussion is for another day. Certainly it is simpler to calculate.</p><p>The NRI and NTI are z-scores of the MPD and MNTD scores, respectively, and indicate if the paths are longer or shorter than expected given a random resampling of tips across the tree. The resampling algorithm can vary, but the simplest is to use the same number of tips as is found in the sample. In other words it matches the richness of the observed sample, so if one has ten tips in an observed sample then each random iteration draws ten tips at random and calculates the MPD and MNTD scores. There are other random sampling algorithms, such as abundance weighted, but Biodiverse only implements the richness approach for NRI and NTI.</p><p>The final NRI z-score is calculated as <span style="font-family: courier;">(observed_MPD - mean_random_MPD) / standard_deviation_random_MPD</span>. 
Interpretation follows the usual z-score distribution, with values more extreme than +/-1.96 being in the outer 5% of the distribution and thus significant at alpha=0.05. The same process applies to NTI, but using MNTD instead of MPD. (One point of difference between Biodiverse and phylocom is that positive NRI and NTI values in Biodiverse correspond with values larger than expected, whereas in phylocom these have negative values).</p><p>It is also worth noting that the random resamplings used for NRI and NTI in Biodiverse do not use its more general randomisation framework. One can use the MPD and MNTD scores with such randomisations to try more complex or spatially constrained schemes. For more on such randomisations <a href="https://biodiverse-analysis-software.blogspot.com/search/label/randomisations" target="_blank">see posts with the randomisation tag</a>.</p><p>A key problem to date with the Biodiverse implementation is that these calculations are very slow, and become substantially slower as the size of the data sets increases (trees become deeper and have more tips). The rest of this post describes some of the ways these have been substantially sped up in Biodiverse version 4. Much of this optimisation work was guided by code profiling with the excellent <a href="https://metacpan.org/dist/Devel-NYTProf" target="_blank">Devel::NYTProf</a>, and also by implementing the algorithms described in Tsirogiannis et al. 
(<a href="https://doi-org.wwwproxy1.library.unsw.edu.au/10.1007/978-3-642-33122-0_3" target="_blank">2012</a>, <a href="https://doi.org/10.1007/978-3-662-44753-6_15" target="_blank">2014</a>, <a href="https://doi.org/10.1111/ecog.01814" target="_blank">2016</a>) and implemented in the <a href="https://github.com/constantinosTsirogiannis/PhyloMeasures" target="_blank">PhyloMeasures </a>package (followed by more code profiling with <a href="https://metacpan.org/dist/Devel-NYTProf" target="_blank">Devel::NYTProf</a>).</p><h3 style="text-align: left;">Find the Last Common Ancestor</h3><p>The search for the last common ancestor (or LCA, also referred to as the Most Recent Common Ancestor and Last Shared Ancestor) between a pair of terminal branches is what takes the most time in the MPD and MNTD calculations. This is a key step in calculating the path connecting two tips. Biodiverse has always cached the result of the path distance between a pair of branches so it only needs to be calculated once. However the process of finding the path took a reasonable amount of time, something that was exacerbated when run under the random resampling process. This has been optimised in several ways.</p><p>For ultrametric trees Biodiverse caches the same path distance between each pair of tips that share the LCA. This pre-warming of the cache obviates the need to repeatedly find the same LCA in later checks. This works because, as noted in <a href="https://doi.org/10.1007/978-3-662-44753-6_15" target="_blank">Tsirogiannis et al. (2014)</a>, the distance from an internal branch to any of its tips is always the same for an ultrametric tree.</p><p>For non-ultrametric trees Biodiverse caches the last common ancestor for each pair of tips to save looking for it next time. 
The distance is not calculated until it is needed, but Biodiverse also caches the cumulative path lengths from each tip to the root so there is no need to repeatedly traverse the tree to get the distance from a tip to the LCA.</p><h3 style="text-align: left;">NRI and NTI</h3><p>Faster calculations of MNTD and MPD are always good, but the real time sink is running the NTI and NRI calculations. Even with faster MPD and MNTD calculations, a calculation that takes 4 seconds for a sample expands to more than an hour when repeated over 999 randomisation iterations. And keep in mind that Biodiverse uses a convergence approach instead of a fixed number of iterations, so more than 2000 iterations is not unusual.</p><h4 style="text-align: left;">Re-use expected values for a given sample size</h4><p>The expected values for a given sample size will not change in any meaningful way across randomisations that have converged on a stable result, so Biodiverse caches these and re-uses them. For example, if the expected values for a sample of 10 have been calculated then they are re-used for every other sample of 10 in the data set. </p><p>This has actually been in Biodiverse since NRI and NTI were first added, but is worth noting. It is an easy thing to implement for other systems.</p><h4 style="text-align: left;">In randomisation analyses</h4><p>The calculation time for NRI and NTI was exacerbated when users ran a randomisation on an analysis that included NRI and NTI. There is really no need to run the NTI and NRI calculations through a randomisation process, as they are based on a random resampling process to begin with (one would be randomising a randomisation). However, this is not always obvious to users. 
If a user follows a philosophy of "push buttons and watch what happens" (as I do) then a long period of time can be spent waiting for the randomisations to finish as the expected values are recomputed.</p><p>The re-use of expected values described above helps here, but these were only cached within the analysis being run. This means they were not available between randomisations, or to other calculations using the same tree.</p><p>Now Biodiverse caches the calculated expected values on the tree and reuses them whenever the tree is used in a subsequent analysis (unless the cache is cleared). This means they are calculated once only regardless of how many analyses need them.</p><p>But the random resampling process was still too slow...</p><h4 style="text-align: left;">Exact estimates of expected values without needing to randomise</h4><p>The next improvement was to implement the exact algorithms described by Tsirogiannis et al. (<a href="https://doi-org.wwwproxy1.library.unsw.edu.au/10.1007/978-3-642-33122-0_3" target="_blank">2012</a> and <a href="https://doi.org/10.1007/978-3-662-44753-6_15" target="_blank">2014</a>). The <a href="https://github.com/constantinosTsirogiannis/PhyloMeasures" target="_blank">PhyloMeasures</a> package implements additional steps not described in these papers, but which could be extracted from the package C code. This is another great example of the benefits of open source code as one can see and understand how an algorithm is implemented in practice. Where code is complex, or otherwise opaque, one can insert debugging statements in the local version to see what values are being passed into and returned from functions, and how they change within a function. Importantly, one can also build tests to check the new implementation matches the old.</p><p>The exact estimates take advantage of the phylogenetic structure of the trees and repetitions within the calculations. 
They are comparatively complex but lead to processing times that are many orders of magnitude faster than the random sampling approaches, to the point that analysis times previously measured in days now take seconds. Given they are exact, they also lead to the exact same answer each time they are run so there is no margin of error, even if this is normally very small for values that have converged under a random resampling process.</p><p>The exact NRI algorithms apply to both ultrametric and non-ultrametric trees so they are applied in all cases. However, the NTI algorithms only apply to ultrametric trees, so analyses with non-ultrametric trees still require the random resampling approach. It might be possible to develop better approaches for non-ultrametric trees given the main aim is to calculate probabilities from possible combinations, but that needs more experience with combinatorics than I have.</p><h4 style="text-align: left;">MNTD and NTI for the full tree</h4><p>One final optimisation is to implement an algorithm to calculate the MNTD for a sample comprising the full set of tips on a tree. This is perhaps something of an edge case as analyses will usually work with subsets, but it did not take long to implement.</p><p>This optimisation also applies to the NTI because there is only one possible realisation of the MNTD if all the tree tips are in the sample, so the expected mean is the same as the MNTD and the standard deviation is zero (we will conveniently ignore the resultant divide by zero error in the z-score calculation in this case). 
I have not checked if the phylocom, picante and PhyloMeasures implementations check for this condition, but it would not be hard to implement if they do not.</p><p>The algorithm to find the shortest path for each target tip is:</p><p></p><ol style="text-align: left;"><li>Get the shortest distance from the target tip to any of its siblings' tips</li><ol><li>Set this as <span style="font-family: courier;">min_dist</span></li><li><span style="font-family: inherit;">Set the processing node as the target tip</span></li></ol><li><span style="font-family: inherit;">Set the processing node as the processing node's parent</span></li><ol><li><span style="font-family: inherit;">Set the ancestral path distance to be the length from the parent to the target tip</span></li><li>Stop if the ancestral path distance exceeds <span style="font-family: courier;">min_dist</span></li></ol><li>Get the shortest distance from the processing node to its siblings' tips</li><ol><li>Add this distance to the ancestral path distance.</li><li>If the sum of these distances is shorter than <span style="font-family: courier;">min_dist</span> then assign that value to <span style="font-family: courier;">min_dist</span>.</li></ol><li>Stop if the processing node is the root node </li><li>Otherwise go to step 2</li></ol><p style="text-align: left;"><span style="font-family: inherit;">Each check should complete in approximately </span><span style="font-family: courier;">O(log(d))</span><span style="font-family: inherit;"> time, where </span><span style="font-family: courier;">d</span><span style="font-family: inherit;"> is the depth of the tree, so it will take </span><span style="font-family: courier;">O(n </span><span style="font-family: courier;">log(d))</span><span style="font-family: inherit;"> for a tree with </span><span style="font-family: courier;">n</span><span style="font-family: inherit;"> tips</span><span style="font-family: inherit;">. 
</span></p><p style="text-align: left;"><span style="font-family: inherit;">The calculation of distances to tips for each branch is the cumulative path length from the tip, as noted above for the LCA calculations. This is cached on the first calculation and then re-used, leading to further speed gains over a </span>naïve<span style="font-family: inherit;"> implementation.</span></p><p></p><h2 style="text-align: left;">More?</h2><p>There are potentially more optimisations (there always are) but these will do for now. </p><p>If you want to see all the changes to the internals then they are tracked under issues <a href="https://github.com/shawnlaffan/biodiverse/issues/786" target="_blank">786</a>, <a href="https://github.com/shawnlaffan/biodiverse/issues/788" target="_blank">788</a>, <a href="https://github.com/shawnlaffan/biodiverse/issues/789" target="_blank">789</a>, <a href="https://github.com/shawnlaffan/biodiverse/issues/793" target="_blank">793</a>, <a href="https://github.com/shawnlaffan/biodiverse/issues/794" target="_blank">794</a> and <a href="https://github.com/shawnlaffan/biodiverse/issues/797" target="_blank">797</a>. Suggestions for other approaches are always welcome and can be raised on the mailing list (see link below) or on the <a href="https://github.com/shawnlaffan/biodiverse/issues/" target="_blank">issue tracker</a>. 
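</p><p>To make the index definitions from earlier in this post concrete, here is a minimal Python sketch of MPD, MNTD and a resampling-based NRI. It is an illustration only, not the Biodiverse (Perl) code, and it deliberately omits the caching and the exact algorithms described above. The toy tree and all names in it (<span style="font-family: courier;">TREE</span>, <span style="font-family: courier;">TIPS</span>, <span style="font-family: courier;">path_to_root</span>, <span style="font-family: courier;">path_distance</span>, <span style="font-family: courier;">mpd</span>, <span style="font-family: courier;">mntd</span>, <span style="font-family: courier;">nri</span>) are hypothetical, and the fixed iteration count is an assumption made for simplicity rather than the convergence approach Biodiverse uses.</p>
<pre>
```python
import itertools
import random
import statistics

# Hypothetical toy tree: maps child to (parent, branch_length).
# The node "root" has no entry. All four tips sit 3.0 from the root.
TREE = {
    "A": ("n1", 1.0), "B": ("n1", 1.0),
    "C": ("n2", 2.0), "n1": ("n2", 1.0),
    "D": ("root", 3.0), "n2": ("root", 1.0),
}
TIPS = ["A", "B", "C", "D"]

def path_to_root(node):
    """Cumulative distance from node to itself and to each of its ancestors."""
    dists, total = {node: 0.0}, 0.0
    while node in TREE:
        parent, length = TREE[node]
        total += length
        dists[parent] = total
        node = parent
    return dists

def path_distance(tip1, tip2):
    """Path length between two tips via their last common ancestor (LCA).

    Among all shared ancestors, the LCA gives the smallest summed distance.
    """
    d1, d2 = path_to_root(tip1), path_to_root(tip2)
    return min(d1[n] + d2[n] for n in d1 if n in d2)

def mpd(sample):
    """Mean of the path lengths over every pair of tips in the sample."""
    pairs = list(itertools.combinations(sample, 2))
    return sum(path_distance(a, b) for a, b in pairs) / len(pairs)

def mntd(sample):
    """Mean distance from each tip to its nearest neighbour in the sample."""
    nearest = [min(path_distance(a, b) for b in sample if b != a) for a in sample]
    return sum(nearest) / len(nearest)

def nri(sample, iterations=999, seed=1):
    """z-score of MPD against richness-matched random draws of tips.

    Positive values mean paths longer than expected, as in Biodiverse.
    """
    rng = random.Random(seed)
    rand = [mpd(rng.sample(TIPS, len(sample))) for _ in range(iterations)]
    return (mpd(sample) - statistics.mean(rand)) / statistics.stdev(rand)
```
</pre>
<p>For this toy tree, <span style="font-family: courier;">mpd(["A", "B", "C"])</span> is 10/3 and <span style="font-family: courier;">mntd(["A", "B", "C"])</span> is 8/3, while <span style="font-family: courier;">nri(["A", "B"])</span> is negative because the two sister tips are closer together than a random pair. The repeated pairwise loops in this naïve version are exactly the work that the caching and exact methods in this post are designed to avoid.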
</p><p><br /></p><p>Shawn Laffan</p><p>24-Sep-2021</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> </p><div><br /></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-74768126914718205452021-09-23T15:12:00.000+10:002021-09-23T15:12:18.531+10:00Label and group property median and percentile statistics are changing<p>Biodiverse supports the analysis of additional data values attached to each label and group. For labels, these could be things like species or population traits such as specific leaf area. For groups these are things like the average phosphorus content in the soil across a group or set of groups. More examples of analysing label traits are in <a href="https://biodiverse-analysis-software.blogspot.com/2018/08/analysing-trait-data.html" target="_blank">this post</a>, and group traits are described in <a href="https://doi.org/10.11646/zootaxa.4802.1.4" target="_blank">Bein et al. (2020)</a>. </p><p>The simplest means of analysing the label and group properties is to calculate summary statistics of the relevant values across the neighbour sets in use. The <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#element-properties" target="_blank">relevant indices for these are under the Element Properties category</a> because labels and groups in Biodiverse are referred to by the more generic term "elements". 
</p><p>However, the implementation of these summary statistics to date has been relatively inefficient, especially for the range weighted statistics. Where a property was assigned a weight of more than 1, the value was repeated that many times in the vector of values used to calculate the statistics. For example, a value of 1 with a weight of 3 and a value of 2 with a weight of 5 became [1,1,1,2,2,2,2,2]. This is not an issue for small data sets, but imagine that repeated for 10,000 unique label values, each of which has weights between 1 and 200, and then across 10,000 groups. That can lead to considerable inefficiency in the calculations. This repetition is actually needless given a weighted statistics approach can be used.</p><p>From version 4, Biodiverse uses a weighted implementation for its statistics. This is slightly less efficient for cases where all weights are equal but there are always trade-offs when writing code for the more general case.</p><p>This new approach will have no impact on the results for statistics like the mean, standard deviation, skewness or kurtosis.</p><p>However, there will be a change to the way the percentiles and the median are calculated. Previously the library used would snap to the lowest value when a percentile was calculated that did not exactly align with the data values. The new approach uses interpolation, with the results being consistent with how percentiles are calculated in R (for an unweighted vector). </p><p>This means that any calculations of the median or percentiles in Biodiverse 4 will likely return higher values for some percentiles. The effect will be greater for smaller samples where the numeric gaps between sequential values are larger, but such sample size effects are hardly unusual in statistics.</p><p>One point that is yet to be dealt with is when the weights are not integers, e.g. where the sample counts used for a Basedata are from Species Distribution Model likelihoods (see the <a href="https://bccvl.org.au" target="_blank">BCCVL</a> if you need an online tool to calculate these). 
In such cases the percentiles cannot use interpolation and will use the centre of mass. Bias correction is also not possible for statistics like standard deviation, skewness and kurtosis in such cases, as the sum of weights is not the same as the number of samples. This issue is for the future, though, as we do not yet support abundance weighted label stats.</p><p>For those interested in the implementation details, the approach uses the <a href="https://metacpan.org/dist/Statistics-Descriptive-PDL" target="_blank">Statistics::Descriptive::PDL</a> package which in turn uses tools provided by the <a href="https://pdl.perl.org" target="_blank">Perl Data Language (PDL)</a>. For those more familiar with R or Python, PDL provides Perl support for fast calculations using matrices and vectors.</p><p><br /></p><p>Shawn Laffan</p><p>23-Sep-2021</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a> </p><p><br /></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a> </p><p><br /></p><p>You can also join the Biodiverse-users mailing list at <a href="https://groups.google.com/group/Biodiverse-users">https://groups.google.com/group/Biodiverse-users</a> </p><div><br /></div>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-41421653562170963642020-12-07T14:42:00.004+11:002020-12-18T12:49:29.306+11:00Biodiverse now includes the independent swaps randomisation algorithm <div style="text-align: left;">Randomisations are one of the key analyses in Biodiverse. They give an assessment of whether an observed result is significantly different from a distribution of randomly generated versions. 
Several recent posts have described how the spatially structured randomisations work, including the (default) <a href="https://biodiverse-analysis-software.blogspot.com/2020/11/randomisations-how-randstructured.html" target="_blank"><span style="font-family: courier;">rand_structured</span> algorithm</a> and its <a href="https://biodiverse-analysis-software.blogspot.com/2020/11/spatially-partition-your-randomisations.html" target="_blank">spatially structured variants</a>. These can also be <a href="https://biodiverse-analysis-software.blogspot.com/2020/11/spatially-partition-your-randomisations.html" target="_blank">spatially constrained</a>. </div><p>These approaches have been available in Biodiverse for some time: <span style="font-family: courier;">rand_structured</span> since the very beginning, spatial constraints since version 1, and the spatially structured variants since version 2. </p><p>One of the general principles when developing Biodiverse is to provide options for users<a href="#ref1">[1]</a>. In that light, version 4 of Biodiverse will include the independent swaps algorithm. </p><h3 style="text-align: left;">How does independent swaps work? </h3><p>A short summary is below, and more details and assessments are in <a href="https://doi.org/10.1890/0012-9658(2000)081[2606:NMAOSC]2.0.CO;2" target="_blank">Gotelli (2000)</a> and <a href="https://doi.org/10.1890/03-0101" target="_blank">Miklos & Podani (2004)</a>. Code for the R implementation is available through the <a href="https://github.com/skembel/picante/blob/master/src/picante.c" target="_blank">Picante package</a>. </p><p></p><ol style="text-align: left;"><li>Pick two different groups at random (call them group1 and group2)</li><li>Pick two different labels at random (call them label1 and label2)</li><li>If label1 is not in group1, or label2 is not in group2, or if label1 is already in group2, or label2 is already in group1 </li><ol><li>Then they cannot be swapped, so switch label1 and label2. 
</li><li>If the switched pairs cannot be swapped using the condition in #3 then go back to step #1</li></ol><li>Swap the labels between groups</li><li>Increment the number of successful swaps by 1</li><li>Start back at step #1</li></ol><p></p><p><br /></p><p>This is a very simple algorithm, and simple has many advantages when it comes to implementation. However, it is also a brute-force approach. The fact that the pairs are selected at random means there is no consideration of "swappability". Consider the case of a site by species matrix where most of the matrix entries are zero, such as where most taxa are highly range restricted. In that case, there is a very high chance that the algorithm will select a pair that cannot be swapped and must retry with a different selection, e.g. because neither label is in either group. This can lead to many wasted CPU cycles. As an unflattering description one might call this a "throw enough mud at a wall and some of it will stick" approach. Keep in mind, though, that for many cases this will work well enough, and the results are most definitely random - good enough is good enough. </p><p>Another issue with the independent swaps approach is that one can never be quite certain how many iterations are needed to fully randomise a data set. <a href="https://doi.org/10.1890/03-0101" target="_blank">Miklos & Podani (2004)</a> suggest twice the number of non-zero entries in the matrix, but I am not aware of whether that has been rigorously assessed. Too few iterations and some of the original structure will be retained. Too many and the analysis might take too long (even for very patient people)<a href="#ref2">[2]</a>.</p><p>There is also no stopping criterion except the target number of successful swaps. It is therefore possible for the algorithm to be trapped in an infinite loop when none of the group/label pairs can be swapped, as the count is only incremented on a successful swap. 
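</p><p>The basic procedure listed above can be sketched in a few lines of Python. This is a minimal illustration and not the Biodiverse (Perl) or Picante code: the function name <span style="font-family: courier;">independent_swaps</span>, its arguments and the dictionary-of-sets data layout are all assumptions made for this example. A cap on the number of attempts is included so the sketch cannot loop forever when no swappable pairs remain.</p>

```python
import random


def independent_swaps(groups, target_swaps, max_attempts, seed=None):
    """Randomise a copy of `groups` (dict: group name -> set of labels).

    Hypothetical helper for illustration only. Returns the randomised
    copy, the number of successful swaps and the number of attempts.
    """
    rng = random.Random(seed)
    data = {g: set(labels) for g, labels in groups.items()}
    group_names = sorted(data)
    label_names = sorted({l for labels in data.values() for l in labels})
    if len(group_names) < 2 or len(label_names) < 2:
        return data, 0, 0  # nothing can be swapped

    swaps = attempts = 0
    while swaps < target_swaps and attempts < max_attempts:
        attempts += 1
        # Steps 1 and 2: two different groups and two different labels
        g1, g2 = rng.sample(group_names, 2)
        l1, l2 = rng.sample(label_names, 2)
        # Step 3: try the pair as drawn, then with the labels switched
        for a, b in ((l1, l2), (l2, l1)):
            swappable = (
                a in data[g1] and b in data[g2]
                and a not in data[g2] and b not in data[g1]
            )
            if swappable:
                # Steps 4 and 5: swap and count it
                data[g1].remove(a)
                data[g1].add(b)
                data[g2].remove(b)
                data[g2].add(a)
                swaps += 1
                break
    return data, swaps, attempts
```

<p>Each swap removes one label from a group and adds another, so the richness of every group and the range of every label are preserved by construction. The attempt cap is an addition for the purposes of the sketch and is not part of the basic algorithm as listed.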
</p><p>In light of the above points, the implementation of independent swaps in Biodiverse has three parameters, two of which lead to early stopping. </p><p></p><ol style="text-align: left;"><li>The default number of swaps is set to be twice the number of non-zero cells (if one thinks of the data as a site by species matrix). This value follows <a href="https://doi.org/10.1890/03-0101" target="_blank">Miklos & Podani</a>. Users can specify any positive integer here. The algorithm will stop when this value is reached. </li><li>The maximum number of iterations defaults to 100 times the number of swaps, but can also be specified as a positive integer. If this number is exceeded then the algorithm stops, regardless of the number of swaps completed. (Note that this value will likely be increased in the final Biodiverse 4 release, see below for why). </li><li>The number of times each label/group pair in the original matrix is swapped out is tracked. The algorithm ends once each label/group pair has been swapped at least once. This is off by default.</li></ol><p></p><p>The last case uses the assumption that, once a pair has been randomly swapped, there is no need to swap it any further as it will not be any more random. If your random number generator is not good then more swaps can actually make things worse (but this is unlikely to be a problem as most systems use good random number generators now). </p><h3 style="text-align: left;">A modified independent swaps algorithm</h3><p><a href="https://biodiverse-analysis-software.blogspot.com/2020/11/randomisations-how-randstructured.html" target="_blank">As noted in a previous post</a>, a huge amount of effort has gone into optimising the randomisations in Biodiverse to make them run faster. Armed with this knowledge, the independent swaps algorithm has also been optimised. The primary aim here has been to reduce the search for swappable label/group pairs, at the cost of extra memory to store extra index lists. 
</p><p>An overview of the algorithm is this:</p><p></p><ol style="text-align: left;"><li>Select a group using a multinomial probability distribution, with the probabilities being the richness scores of each group. Call this <i>group1</i>.</li><li>Select <i>label1</i> from the labels in <i>group1</i> using a uniform random distribution (each label in <i>group1</i> is equiprobable).</li><li>Select a group that does not already contain <i>label1</i>, again using a multinomial probability distribution. Call this <i>group2</i>.</li><li>Select <i>label2</i> from the labels in <i>group2</i> using a uniform random distribution (each label in <i>group2</i> is equiprobable).</li><ol><li>If needed, repeat #4 until <i>label2</i> is not one that is already in <i>group1</i>.</li></ol><li>Swap <i>label1</i> to <i>group2</i>, and <i>label2</i> to <i>group1</i>.</li><li>Update the multinomial probability distributions.</li><li>Increment the swap count.</li><li>Go to step #1.</li></ol><p></p><p>The multinomial distributions are used to replicate the chances of selecting a non-zero matrix entry when selecting entirely at random. The use of the "absence" lists in step #3 also focuses the search to swappable pairs. For example, a very wide ranging label (species) will occur in most groups so there is a high probability that the normal algorithm will select a group1 and group2 that both contain that label.</p><p>There is clearly a memory burden when one considers the "absence" tracking lists. However, memory is cheap these days and the data sets must be relatively enormous for it to make too much difference. Perl, in which Biodiverse is programmed, also has a copy-on-write mechanism to save memory when dealing with large strings (e.g. a copy of a 100MB string can be stored using a link to the original, and only needs to be really copied when it is changed in some way or is written out). 
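</p><p>The selection logic of the modified steps listed above can be sketched in Python. This is an illustration under stated assumptions and not the Biodiverse implementation: the function name and the dictionary-of-sets layout are hypothetical, and the candidate lists are rebuilt on every iteration instead of being cached and updated, so the sketch shows how pairs are chosen but has none of the speed gains. Step #4a is handled by drawing <i>label2</i> directly from the labels of <i>group2</i> that are not in <i>group1</i>, which is equivalent to redrawing until one is found.</p>

```python
import random


def modified_independent_swaps(groups, target_swaps, max_attempts, seed=None):
    """Richness-weighted swap selection on a copy of `groups`.

    `groups` maps group name -> set of labels. Hypothetical helper for
    illustration only. Returns (randomised copy, swaps, attempts).
    """
    rng = random.Random(seed)
    data = {g: set(labels) for g, labels in groups.items()}
    names = sorted(g for g in data if data[g])  # non-empty groups only
    if len(names) < 2:
        return data, 0, 0

    swaps = attempts = 0
    while swaps < target_swaps and attempts < max_attempts:
        attempts += 1
        # Step 1: group1 drawn with probability proportional to richness
        g1 = rng.choices(names, weights=[len(data[g]) for g in names])[0]
        # Step 2: label1 drawn uniformly from group1
        l1 = rng.choice(sorted(data[g1]))
        # Step 3: group2 drawn from the groups that lack label1
        absent = [g for g in names if l1 not in data[g]]
        if not absent:
            continue  # label1 occurs everywhere, so it cannot move
        g2 = rng.choices(absent, weights=[len(data[g]) for g in absent])[0]
        # Step 4: label2 drawn uniformly from group2, excluding labels
        # that are already in group1
        candidates = sorted(data[g2] - data[g1])
        if not candidates:
            continue
        l2 = rng.choice(candidates)
        # Steps 5 to 7: swap and count; richness is unchanged, and the
        # "absence" list is rebuilt on the next pass through the loop
        data[g1].remove(l1)
        data[g1].add(l2)
        data[g2].remove(l2)
        data[g2].add(l1)
        swaps += 1
    return data, swaps, attempts
```

<p>Because <i>group2</i> is only ever drawn from groups that lack <i>label1</i>, the only failed attempts in this sketch are those where <i>label1</i> occurs in every group or where every label in <i>group2</i> is already in <i>group1</i>.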
</p><p>The modified implementation has the same default stopping parameters as the unmodified implementation.</p><h3 style="text-align: left;">How long do they take?</h3><p>So how fast are they? The table below summarises a benchmark run with the <i>Acacia</i> data set from <a href="https://doi.org/10.1038/ncomms5473" target="_blank">Mishler et al. (2014)</a>. This has 506 species (labels) distributed across 3037 cells (groups). </p><p>Both the original and modified Biodiverse implementations are used, as is that from the R Picante library and a variant implemented in pure R (and run using R version 4.0.3). All were run on the same computer. The Biodiverse runs used the GUI, the R runs were under RStudio. Code for the R runs is available at <a href="https://github.com/shawnlaffan/biodiverse/tree/master/etc/experiments/independent_swaps" target="_blank">the Biodiverse GitHub repo</a>. </p><p>Only one to five runs of each combination were used, and values are rounded off. Ideally more would be run, but the results are consistent when re-run and the differences are of sufficient magnitude to use for a blog post.</p><p>There are 29,175 non-zero entries in the site by species matrix, so approximately 98% of the matrix is empty (506 labels by 3037 groups gives 1,536,722 cells). The target number of swaps for the independent swaps runs was set to twice the number of non-empty matrix cells, 58,350, which as noted above is the default in Biodiverse and follows <a href="https://doi.org/10.1890/03-0101" target="_blank">Miklos and Podani (2004)</a>.</p><p>Values in the table are sorted by run time in seconds. The "early stop" column indicates if swapping is stopped once all label/group pairs (matrix cells) have been swapped at least once. "Num swaps" is the number of swaps actually run, "cells swapped" is the count of matrix cells swapped at least once, and is only tracked when early stopping is in use. "Swap attempts" is how many iterations were used overall, including those where a swap was not possible. 
"max attempts" is the maximum number of swap attempts before giving up, with the default being 100 times the target number of swaps. The larger number is 2^31-1, which is the maximum value for a signed 32 bit integer. </p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGbxm5mf0XKpXH1LOjSKqgB5pUgoSSUr5s8Gi_aFaQRnBJbYpKcnTaTSlH7eyzyc6BH4NxyX-n-DYnElwi53YiNPHJcHQKZt7qKIuZjWXnr4504ARPufDjJpSRj0x0-LbTCN4jadCop7PW/s1148/table2.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="411" data-original-width="1148" height="230" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGbxm5mf0XKpXH1LOjSKqgB5pUgoSSUr5s8Gi_aFaQRnBJbYpKcnTaTSlH7eyzyc6BH4NxyX-n-DYnElwi53YiNPHJcHQKZt7qKIuZjWXnr4504ARPufDjJpSRj0x0-LbTCN4jadCop7PW/w640-h230/table2.png" width="640" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><p>The main takeaway from the table is that the <span style="font-family: courier;">rand_structured</span> algorithm is fastest of all those run, and is <i>much</i> faster than any of the unmodified independent swaps runs. This difference will be more pronounced as the data set size increases. That said, the <span style="font-family: courier;">rand_structured</span> algorithm has been through many rounds of optimisation, so the difference might be reduced with further optimisations to the other algorithms. </p><p>The times for the modified independent swaps are aggregates of iterations 2-5, with the value in brackets being the first iteration when many of the lists are set up and cached for later reuse. </p><p>The modified independent swaps algorithm reaches its target with much less "wastage" of iterations, hence it is fastest of the independent swaps. This is further aided by the early stopping condition. 
</p><p>For the unmodified independent swaps algorithm, more than 24,600,000 swap attempts are needed to ensure all matrix cells have been swapped. This is ~843 times the number of non-zero matrix cells. The two fastest unmodified runs reach the maximum number of swap attempts when the number of actual swaps is still very low, at ~6% of the target number of swaps.</p><p>The unbounded independent swaps run in Biodiverse takes nearly seven minutes. If used with 999 randomisation iterations in sequence then the run time will be nearly 4.8 days for the randomisations alone (not including running the various analyses for the randomised basedatas). A faster computer could probably reduce it to 3 days sequential time...</p><p>It should also be noted that the Biodiverse implementations are all in Perl. The Picante implementation is written in C, which is much faster than languages like Perl, R and Python (they themselves are written in C or C++). </p><p>Despite not being written in C, the Biodiverse modified independent swaps algorithm is two to five times faster than the Picante implementation for this data set. The modified algorithms do more work per iteration, but run fewer iterations overall. The Perl implementation could also be reimplemented in C, but the flexibility of the <span style="font-family: courier;">rand_structured</span> approach (see below) means any such efforts will likely be directed there. </p><p>As a final note regarding speeds, the pure R implementation could be made faster. Profiling shows that most of the processing time is spent in the <span style="font-family: courier;">sample.int()</span> function. 
Even so, it would still not be faster than the Picante implementation so the only advantage would be if early stopping were implemented.</p><p><br /></p><h3 style="text-align: left;">Spatially structured independent swaps</h3><p>As a final note, an issue with the independent swaps algorithm is that it is very difficult to add spatial constraints to model diffusion processes or random walks. One can apply spatial constraints to the swapping, i.e. an incidence must be swapped with another within some distance, but that has not been implemented. If there is sufficient demand, and perhaps funding or a code contribution, then it could be added.</p><p>Keep in mind, though, that you can apply spatial constraints so any swapping is done within sub-datasets, for example within regions. This process is described in the <a href="https://biodiverse-analysis-software.blogspot.com/2020/11/spatially-partition-your-randomisations.html" target="_blank">spatially constrained randomisation post</a>. </p><p><br /></p><h3 style="text-align: left;">Footnotes</h3><p><a name="ref1">[1]</a> One of the general principles when developing Biodiverse is to provide options for users. This can sometimes lead to a huge array of options, <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices" target="_blank">for example the indices</a>. However, focusing on that example, many of the indices are provided to allow insights into how others are calculated, such as showing the relative contribution of each <a href="https://github.com/shawnlaffan/biodiverse/wiki/Indices#endemism-central-lists" target="_blank">label to an endemism score</a>. 
</p><p><a name="ref2">[2]</a> Speaking of patient, Biodiverse is written for and by the impatient - impatience is one of the <a href="http://threevirtues.com/" target="_blank">three virtues of programming</a>, after all.</p><p><br /></p><p>Shawn Laffan</p><p>07-Dec-2020</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a></p><p>You can also join the Biodiverse-users mailing list at <a href="http://groups.google.com/group/Biodiverse-users">http://groups.google.com/group/Biodiverse-users</a></p><p><br /></p><p><br /></p>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-36317490366352908112020-11-23T13:21:00.000+11:002020-11-23T13:21:04.764+11:00Spatially partition your randomisations<p>In previous posts I described how the <span style="font-family: courier;">rand_structured</span> algorithm works, and then how one can use spatially structured randomisations to compare observed patterns with stricter null models. </p><p>This post is concerned with how one can spatially partition a randomisation. Some of these details are in a <a href="https://biodiverse-analysis-software.blogspot.com/2015/06/better-control-of-randomisations.html" target="_blank">previous blog post</a>, but this post is a much more detailed description. </p><p>In the standard implementation of a randomisation, in Biodiverse and I suspect more generally also, labels are randomly allocated to any group across the entire data set. This works well for many studies and is very effective. 
However, when one begins to scale analyses to larger extents, for example North America, Asia or globally, then the total pool of labels begins to span many different environments. The randomisations could allocate a polar taxon to the tropics, and a desert taxon to a rainforest. This does not make the randomisation invalid, but it is perhaps not as strict as it could be. </p><p>The effect is perhaps best considered with a scenario using phylogenetic diversity where polar and tropical taxa are distinct clades on the tree. For a randomisation that allocates labels anywhere across the data set, one will commonly end up with a random realisation containing many groups with a mixture of polar and tropical taxa. The PD on the random data set will be higher than on the observed data set in almost all cases because both clades are sampled and thus more of the tree is represented. Note that this does not make the result wrong - it means that, when compared with a random sample from the full pool of taxa, the observed PD is less than expected. That is useful to know, but the next question to ask is "are the patterns in the tropics higher or lower than expected for the tropics?", and the same for the polar taxa. </p><p>One could quite easily remove any groups outside a region of interest and then rerun the analysis. However, then the ranges of the labels will be reduced and any endemism analyses will not be comparable with the larger analyses. </p><p>This can be readily fixed in Biodiverse by specifying a <i>spatial condition to define subsets</i>. Or, if you only want to randomise a subset of your data while holding the rest as-observed, you can use a <i>definition query</i>. The spatial condition approach was used in <a href="https://doi.org/10.1111/jse.12590" target="_blank">Mishler et al. (2020)</a> to (very approximately) partition Canada, the US and Mexico to assess sensitivity of the results for the full data set. 
The definition query was used in <a href="https://doi.org/10.1016/j.isci.2018.12.002" target="_blank">Allen et al. (2019)</a> to only randomise values within Florida, while holding the adjacent regions constant.</p><p>How does it work? Both approaches slice the data into subsets, apply the randomisation algorithm to each subset independently, and then reassemble the randomised subsets into a full randomised basedata that is then used for the comparisons. The difference for the definition query is that the data is divided into two sets, and only the subset that passes the query is randomised.</p><p>The definition query is run first. This means that, if you specify both a definition query and a spatial condition, then only the groups that pass the definition query are considered for randomisations, and those randomisations will be spatially partitioned.</p><p>This approach is general, and works for any of the randomisation algorithms in Biodiverse. Note, however, it will not apply different randomisation algorithms in different subsets. If there is a good reason to do so then we can look at it, but it will make the interface much more complex to implement. </p><p>One other point to note is that, for the structured randomisations, the swapping algorithm is applied within the subsets. 
Each subset is run to completion before they are all aggregated.</p><p><br /></p><p>Some images will work better than text, so here are some clipped screenshots from Biodiverse.</p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEo3SBTTCIlfVDpYtF5fosdZNr5bRI8Kh33Uh3Y1yVz3LKToowJlhnyJqSbw4JtGbpconMkjssXvEyebKtLIU2jv4h0fQfj5gpISwJw12BMDzfnh51K02snOgDsDG5tsW7zSvYqXt3tr1o/s870/A_tenuissima_observed.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="861" data-original-width="870" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEo3SBTTCIlfVDpYtF5fosdZNr5bRI8Kh33Uh3Y1yVz3LKToowJlhnyJqSbw4JtGbpconMkjssXvEyebKtLIU2jv4h0fQfj5gpISwJw12BMDzfnh51K02snOgDsDG5tsW7zSvYqXt3tr1o/s320/A_tenuissima_observed.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The observed distribution of <i>Acacia tenuissima</i> (red cells), with outlines of Australian states and territories in blue. <i>Acacia</i> data are from <a href="https://doi.org/10.1038/ncomms5473" target="_blank">Mishler et al. 2014</a>, and the cell size is 50 km. 
<br /></td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDOQ7aleY7mYuylTIQPgLt02Ml95-oSW1Z4hKjgWtFBXsc7m4fe9AXE1Rp1EsD7q4NKYnyFaOjNUk75Nxn4xuZpSefsAkIn4w-E9wDh7yzc9rDf_hOZEGy87XnF3j5isjHtXmLZ2ld_SrF/s883/A_tenuissima_rand.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="840" data-original-width="883" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDOQ7aleY7mYuylTIQPgLt02Ml95-oSW1Z4hKjgWtFBXsc7m4fe9AXE1Rp1EsD7q4NKYnyFaOjNUk75Nxn4xuZpSefsAkIn4w-E9wDh7yzc9rDf_hOZEGy87XnF3j5isjHtXmLZ2ld_SrF/s320/A_tenuissima_rand.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><i>A. tenuissima</i> incidences in a randomisation constrained such that each incidence is allocated within the polygon that contains it. Note the absence of incidences in NSW, Victoria and Tasmania, and that there are only three in South Australia. 
This is consistent with the observed data.<br /></td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5YWFoaB0eOqeWfz4zOwA8QifOrJ-xJXy5XcaGdXqsUBvHx7W2UNTHlKJRb1pMB_V6SgWSk4tzy_7XQe8WjG5VH4cPowrzSaJL4jpiAHUdpOa1bq01SD_DUPCrhz0oEaWFsJ9NsYCLmb7g/s857/A_tenuissima_rand_defq.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="821" data-original-width="857" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5YWFoaB0eOqeWfz4zOwA8QifOrJ-xJXy5XcaGdXqsUBvHx7W2UNTHlKJRb1pMB_V6SgWSk4tzy_7XQe8WjG5VH4cPowrzSaJL4jpiAHUdpOa1bq01SD_DUPCrhz0oEaWFsJ9NsYCLmb7g/s320/A_tenuissima_rand_defq.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><i>A. tenuissima</i> records randomised using a definition query so that only incidences within Western Australia are randomised. All others are kept unchanged. Compare with the observed distribution above. <br /></td></tr></tbody></table><br /><p><br /></p><p>So how does one use it? It is done as part of the standard interface (this functionality has actually been in Biodiverse since version 1, but the interface was slightly reconfigured in version 2). The user specifies spatial conditions using the <a href="https://github.com/shawnlaffan/biodiverse/wiki/SpatialConditions" target="_blank">same syntax as for the spatial analyses</a>. 
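</p><p>The partitioning logic described earlier in this post (slice the data into subsets, randomise each subset independently, then reassemble) can be sketched in Python. This is an illustration only and does not use Biodiverse's spatial condition syntax: <span style="font-family: courier;">subset_of</span> is a hypothetical function standing in for a spatial condition, <span style="font-family: courier;">def_query</span> is a hypothetical predicate standing in for a definition query, and <span style="font-family: courier;">randomise_subset</span> is a deliberately simple placeholder that is not one of the Biodiverse randomisation algorithms.</p>

```python
import random
from collections import defaultdict


def randomise_subset(subset, rng):
    """Placeholder randomiser (an assumption for this sketch).

    Each label is reallocated to the same number of groups it occupied,
    chosen at random from the groups in this subset only.
    """
    names = sorted(subset)
    counts = defaultdict(int)
    for labels in subset.values():
        for label in labels:
            counts[label] += 1
    result = {g: set() for g in names}
    for label in sorted(counts):
        for g in rng.sample(names, counts[label]):
            result[g].add(label)
    return result


def partitioned_randomisation(groups, subset_of, def_query=None, seed=None):
    """Randomise `groups` (dict: group -> set of labels) within subsets."""
    rng = random.Random(seed)
    held = {}                       # groups kept exactly as observed
    partitions = defaultdict(dict)  # subset key -> groups in that subset
    for g, labels in groups.items():
        # The definition query is applied first
        if def_query is not None and not def_query(g):
            held[g] = set(labels)
        else:
            partitions[subset_of(g)][g] = set(labels)
    # Randomise each subset on its own, then reassemble
    result = dict(held)
    for key in sorted(partitions):
        result.update(randomise_subset(partitions[key], rng))
    return result
```

<p>With group names such as "WA:1" and "NT:2", a <span style="font-family: courier;">subset_of</span> that returns the text before the colon keeps every label within its original region, and a <span style="font-family: courier;">def_query</span> that rejects some groups leaves those groups untouched.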
</p><p><br /></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiRYATlCava8rfoeLHyQ-UD4EfDAhqF7UpyYILwaJ3cPOntsWIJihGwoCBePQYwSFf1iFfh4f88hvRMpiWHHVN9M01d4kZIoNSZeMMt9BJP7W6DZU2UH8JRnLzDgSokgMspQC3aq-VQhPp/s2004/screenshot.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1264" data-original-width="2004" height="405" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiRYATlCava8rfoeLHyQ-UD4EfDAhqF7UpyYILwaJ3cPOntsWIJihGwoCBePQYwSFf1iFfh4f88hvRMpiWHHVN9M01d4kZIoNSZeMMt9BJP7W6DZU2UH8JRnLzDgSokgMspQC3aq-VQhPp/w640-h405/screenshot.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Subsets to randomise independently are defined using a spatial condition (red arrow), while the definition query (blue arrow) is used to randomise a subset of the data.<br /></td></tr></tbody></table><br /><p>The choice of condition is up to the user, and they can specify whatever condition they like (it is their analysis, after all...). Generally speaking, though, it is better to use a condition that generates non-overlapping regions. One that uses a shapefile would fit with many geographic cases. Many conditions will work but might not be very sensible, for example overlapping circles. Each group is allocated to only one subset, so in such overlapping cases the first subset that contains the group "wins" it.</p><p>One thing to watch for, and which is <a href="https://github.com/shawnlaffan/biodiverse/issues/780" target="_blank">fixed for version 4</a>, is that if the spatial condition does not capture all groups then the system will throw an error. 
The workaround for version 3 and earlier is to use a definition query that matches (shadows) the spatial condition, although keep in mind that any groups outside the definition query will not be randomised.</p><p>To sum up, Biodiverse allows a high degree of complexity and fine grained control when using randomisations to assess the significance of observed patterns. This post has described the spatial conditions and definition queries. More features and approaches are described in other posts, <a href="https://biodiverse-analysis-software.blogspot.com/search/label/randomisations" target="_blank">and are grouped under the randomisation tag</a>. </p><p><br /></p><p><br /></p><p>Shawn Laffan</p><p>23-Nov-2020</p><p><br /></p><p>For more details about Biodiverse, see http://shawnlaffan.github.io/biodiverse/</p><p>To see what else Biodiverse has been used for, see https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</p><p>You can also join the Biodiverse-users mailing list at http://groups.google.com/group/Biodiverse-users</p><p><br /></p>Shawn Laffanhttp://www.blogger.com/profile/02109693873146746125noreply@blogger.com0tag:blogger.com,1999:blog-7668982590835732065.post-16985238085720942302020-11-19T12:02:00.000+11:002020-11-19T12:02:19.602+11:00Randomisations - modelling spatial structure<p>In the <a href="https://biodiverse-analysis-software.blogspot.com/2020/11/randomisations-how-randstructured.html" target="_blank">previous post</a>, I described how the <span style="font-family: courier;">rand_structured</span> algorithm works. If you don't feel like reading that post right now, then a short description is that it uses a filling algorithm to randomly allocate the labels in a basedata onto the groups of a new basedata, with the end result being random assemblages in each group, and where each group has the same number of labels as the original basedata by default. 
A swapping algorithm is then used to allocate any labels that could not be allocated using the filling process, usually because there were no unfilled groups that did not already contain these labels.</p><p>The filling algorithm has some nice advantages over other methods that also match the richness patterns, such as the independent swaps algorithm. The main one is that filling algorithms are easily generalised to model spatial processes in their allocation, for example random walks and diffusion processes.</p><p>The videos below show some of these spatially structured randomisations at work, using <i>Acacia tetragonophylla</i> incidences from <a href="https://doi.org/10.1038/ncomms5473" target="_blank">Mishler et al. (2014)</a>. Note that the shade of red is used only to indicate when in the sequence a cell was allocated to, with lighter shades being earlier and darker later.</p><p>The first video shows a random walk model, in which a seed location is chosen for the first allocation. The next location is a randomly selected neighbour of the current location, and the process repeats. If a location has no available neighbours (all are full or already contain this label) then it backtracks until it finds one it can allocate to (backtracking can be from the last allocated, the first, or randomly selected). If there are no possible allocations then it will try a new seed location. One can also set a probability to randomly reseed even if the window has not been fully allocated to. 
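</p><p>The random walk allocation for a single label can be sketched in Python. This is a simplified illustration and not the Biodiverse code: the function name is hypothetical, groups are assumed to be cells on a square grid with four neighbours each, <span style="font-family: courier;">available</span> is assumed to already exclude groups that are full or that contain the label, backtracking is only from the last allocated location, and the optional random reseeding is left out.</p>

```python
import random


def random_walk_allocate(available, n, seed=None):
    """Allocate up to `n` incidences of one label by a random walk.

    `available` is a set of (x, y) cells that can accept the label.
    Returns the allocated cells in the order they were allocated.
    """
    rng = random.Random(seed)
    free = set(available)
    order = []  # every allocation, in sequence
    walk = []   # the current path, used for backtracking
    while len(order) < n and free:
        if not walk:
            # Choose a seed location (or reseed after a dead end)
            cell = rng.choice(sorted(free))
        else:
            x, y = walk[-1]
            neighbours = sorted(
                c for c in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
                if c in free
            )
            if not neighbours:
                walk.pop()  # backtrack from the last allocated location
                continue
            cell = rng.choice(neighbours)
        free.remove(cell)
        order.append(cell)
        walk.append(cell)
    return order
```

<p>On a fully connected set of cells the walk never needs to reseed, so every allocation after the first is adjacent to an earlier one. When the available cells are fragmented the walk backtracks to its start, reseeds, and continues from the new location.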
</p><p><br /></p><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dyg8oNEH3sxRg3WcYsAT4Ks97ZxZikCOJgLx6oufgWBJXdTEF6AIzupqnYgKGnzIKCkPTf0-8ErcEwmAqnW-A' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">The second uses a diffusion process. This also uses a seed location, but considers the neighbours of all of its current locations for the next allocation. As with the random walk, this process is also constrained by the filled and previously allocated groups, and will reseed if necessary or at random. There is no backtracking as it is not needed since there is no "path" being followed. </div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dxqj89T0pJyI0A5auUFkNWSH9uo7OMbBbjIyT5uLqb87MVlXowAxEoMc1RjyKAT1H31b1yKzv8y7AgrAeXsow' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">The final model is a proximity allocation process. A seed location is chosen, and a <a href="https://github.com/shawnlaffan/biodiverse/wiki/SpatialConditions" target="_blank">spatial condition</a> is then used to select a window of neighbours. 
The example here uses a circle with a three-cell radius, but any supported condition could be used, such as predefined regions represented using a shapefile. Once all groups in the window have been allocated to, excluding fully allocated groups, a new seed location is chosen. </div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dyUk_9z6pYxhvGgGFhADPDujz3ADEhvwkq3C7OVwpZOhce_vhyOYVML2B5HUeZHNO-irPevnXAys-6g4l0mIg' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">As with the main <span style="font-family: courier;">rand_structured</span> algorithm, any unallocated incidences are handled using a swapping algorithm (<a href="https://biodiverse-analysis-software.blogspot.com/2020/11/randomisations-how-randstructured.html" target="_blank">see details in the previous post</a>). This ensures richness targets are met, at the cost of some loss of spatial contiguity in the randomly generated distributions. 
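</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">The diffusion and proximity allocation models described above can be sketched in Python as follows. This is a rough illustration rather than the Biodiverse implementation: the function names, the four-neighbour adjacency, the circular window and the <span style="font-family: courier;">available</span> set (cells that are neither full nor already contain the label) are assumptions made for the example, and random reseeding is left out of the diffusion function for brevity.</div>

```python
import math
import random


def rook_neighbours(cell):
    """The four cells sharing an edge with this cell."""
    row, col = cell
    return [(row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)]


def diffusion_allocate(available, n_incidences, rng):
    """Grow outwards by picking any open neighbour of any allocated cell."""
    available = set(available)  # work on a copy
    allocated = []
    while available and len(allocated) != n_incidences:
        frontier = sorted({n for cell in allocated
                           for n in rook_neighbours(cell) if n in available})
        # Nothing allocated yet, or no open neighbours anywhere: pick a seed.
        nxt = rng.choice(frontier or sorted(available))
        available.discard(nxt)
        allocated.append(nxt)
    return allocated


def proximity_allocate(available, n_incidences, radius, rng):
    """Fill a circular window around a seed, then move to a new seed."""
    available = set(available)  # work on a copy
    allocated = []
    while available and len(allocated) != n_incidences:
        seed = rng.choice(sorted(available))
        # The window includes the seed itself (distance 0).
        window = sorted(c for c in available if radius >= math.dist(seed, c))
        rng.shuffle(window)
        remaining = n_incidences - len(allocated)
        for cell in window[:remaining]:
            available.discard(cell)
            allocated.append(cell)
    # Any shortfall is what a swapping step would then have to place.
    return allocated, n_incidences - len(allocated)


# Usage: 12 incidences of one label on an empty 10 x 10 grid.
grid = {(r, c) for r in range(10) for c in range(10)}
print(diffusion_allocate(grid, 12, random.Random(7)))
print(proximity_allocate(grid, 12, 3, random.Random(7)))
```

<div class="separator" style="clear: both; text-align: left;">Neither function needs a backtracking step: the diffusion model looks at the neighbours of everything allocated so far, and the proximity model simply moves to a new seed once its window is exhausted. 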
</div><div class="separator" style="clear: both; text-align: left;"><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeXjpeJAh1h9QdjtjkIu5iGC5yMd05Txpu7k9s7t6IXdWebYSiMgYkGCv5KgvdgazKw9FYC7xyQDQuDFAhyphenhyphenxMueOueymnUBhWYDYqe2QS4u_K3XVpWDNXOYeZKOMBpC24_Y0-97LvTay1Z/s412/Picture1.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="371" data-original-width="412" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeXjpeJAh1h9QdjtjkIu5iGC5yMd05Txpu7k9s7t6IXdWebYSiMgYkGCv5KgvdgazKw9FYC7xyQDQuDFAhyphenhyphenxMueOueymnUBhWYDYqe2QS4u_K3XVpWDNXOYeZKOMBpC24_Y0-97LvTay1Z/w400-h360/Picture1.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The outliers for this proximity allocation randomisation are due to the swapping algorithm used to reach the per-group richness targets.<br /></td></tr></tbody></table><br /><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both;">The above randomisations are all implemented under the <span style="font-family: courier;">rand_spatially_structured</span> randomisation, for which parameters can be set to model each approach. 
However, setting parameters can be complicated and error prone so there are variants called <span style="font-family: courier;">rand_diffusion</span> and <span style="font-family: courier;">rand_random_walk</span> that restrict the set of arguments needed and set some useful defaults, but otherwise are just wrappers around <span style="font-family: courier;">rand_spatially_structured </span>(which itself is implemented in Biodiverse as a particular case of the <span style="font-family: courier;">rand_structured</span> algorithm). There is not currently a wrapper for the proximity allocation, but one could be added if needed. The next few screenshots show the parameter settings. <span style="font-family: courier;"> </span></div><p><br /></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidGk4pEdb4WBnHwGWMpuWFlVRfKJ3jASnKEsji0To-Fij91l96FJ4gfdYNSBGl5LH7Kf6ZtkHXFgDaVTS7DuDKbF8LHK7g7Wqqr8enNNC3pV7NkyaGvVwIgsbi2GbWrbsauW2pkMA8PArZ/s2048/interface1.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1088" data-original-width="2048" height="341" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidGk4pEdb4WBnHwGWMpuWFlVRfKJ3jASnKEsji0To-Fij91l96FJ4gfdYNSBGl5LH7Kf6ZtkHXFgDaVTS7DuDKbF8LHK7g7Wqqr8enNNC3pV7NkyaGvVwIgsbi2GbWrbsauW2pkMA8PArZ/w640-h341/interface1.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The options to control the spatially structured randomisations are marked with the red and blue arrows. The spatial condition (marked by the red arrow) determines which neighbours of the considered group will be allocated to next. 
This example uses a square, <a href="https://github.com/shawnlaffan/biodiverse/wiki/SpatialConditions" target="_blank">but it could be any of the supported conditions</a>. The other parameters (marked with the blue arrow) control reseeding, backtracking, and allocation order. <br /></td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhp3cRTVE3ltCLxA40Qo6vLGAYb2zv1N7LR-bEkn4-iLFy0JGHS-XnexT0PV3AAYSZQ-wMFSvPyU-NtbgGal-W_0tXgq_Ktu0RibytXFYtPhMTdyycKipzdCOY26t_X4d-nnpn9t9AebPAn/s2048/interface_diffusion.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1088" data-original-width="2048" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhp3cRTVE3ltCLxA40Qo6vLGAYb2zv1N7LR-bEkn4-iLFy0JGHS-XnexT0PV3AAYSZQ-wMFSvPyU-NtbgGal-W_0tXgq_Ktu0RibytXFYtPhMTdyycKipzdCOY26t_X4d-nnpn9t9AebPAn/w640-h340/interface_diffusion.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The <span style="font-family: courier;">rand_diffusion </span>parameters are almost identical to those for <span style="font-family: courier;">rand_spatially_structured</span>, as the former is just a special case of the latter and this simplifies setting it up. 
<br /></td></tr></tbody></table><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdUg8eXpsFu3C7d87J9BqEugEd7E_kY4JPMkEbNt0skwk_DNHT2n89S-jbmCdaRdmliBQ6khCw5amQFl10QkC4LMDVmp4FCmswqHThWHQC5hP_99rGeqcX6Earj36BjhQ7Gr7wQsv7iWkz/s2048/interface_random_walk.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1088" data-original-width="2048" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdUg8eXpsFu3C7d87J9BqEugEd7E_kY4JPMkEbNt0skwk_DNHT2n89S-jbmCdaRdmliBQ6khCw5amQFl10QkC4LMDVmp4FCmswqHThWHQC5hP_99rGeqcX6Earj36BjhQ7Gr7wQsv7iWkz/w640-h340/interface_random_walk.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">As with the <span style="font-family: courier;">rand_diffusion</span> parameters, and for the same reason, the parameters for <span style="font-family: courier;">rand_</span><span style="font-family: courier;">random_walk</span> are also almost identical to those for <span style="font-family: courier;">rand_spatially_structured</span>. <br /></td></tr></tbody></table><p>As an aside, if you want a good introduction to the general process of dynamic spatial modelling, then have a look at <a href="https://www.wiley.com/en-au/Spatial+Simulation%3A+Exploring+Pattern+and+Process-p-9781118527078" target="_blank">O'Sullivan and Perry (2013)</a>. There is also a good introduction to the limits of the complete spatial randomness model in <a href="http://dx.doi.org/10.1002/9780470549094" target="_blank">O'Sullivan and Unwin (2010)</a>. </p><p>To sum up, there is more to randomisations than just randomly shuffling the data around. 
Biodiversity patterns are normally spatially structured, so finding patterns that are significantly different from those that are completely random is often not much of a challenge. Adding constraints such as matching species richness patterns is very useful, but one can go further and introduce even more spatial structure, making the tests "harder to pass" (an example is in <a href="https://doi.org/10.1046/j.1365-2699.2003.00875.x" target="_blank">Laffan and Crisp, 2003</a>). As noted by <a href="https://doi.org/10.1046/j.1466-822X.2001.00249.x" target="_blank">Gotelli (2001)</a>, trying multiple hammers is a useful thing, and to quote that paper: "The advantage of null models is that they provide flexibility and
specificity that cannot often be obtained with conventional statistical
analyses". Using multiple approaches might be regarded as a fishing expedition, but it does allow one to generate a distribution of models and thus deeper understanding of the patterns one is analysing. And fish are nutritious.</p><p><br /></p><p>Shawn Laffan</p><p>19-Nov-2020</p><p><br /></p><p>For more details about Biodiverse, see <a href="http://shawnlaffan.github.io/biodiverse/">http://shawnlaffan.github.io/biodiverse/</a></p><p>To see what else Biodiverse has been used for, see <a href="https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList">https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList</a></p><p>You can also join the Biodiverse-users mailing list at <a href="http://groups.google.com/group/Biodiverse-users">http://groups.google.com/group/Biodiverse-users</a></p><div><br /></div>