Thursday, 12 November 2020

Randomisations now also generate z-scores

Randomisations in Biodiverse have always generated rank-relative scores from which significance can be derived. See for example this previous post: https://biodiverse-analysis-software.blogspot.com/2016/08/easier-to-use-randomisation-results.html 

However, sometimes users want to use z-scores to assess a randomisation. Several other tools such as phylocom also calculate z-scores, and often these are the default.

From version 4 of Biodiverse, z-scores are now also be calculated for randomisations. More details about what happens are in the updated help page, but a short summary is given here. 

For each iteration of the randomisation, each analysis is regenerated using the same set of spatial conditions, but using the randomised version of the basedata. The values of each index for each group (cell) are then checked, and running tallies are kept for the sum of randomised index values as well as the sum of squared index values. These sums, along with the count of iterations, allows the calculation of the mean (sum of x / count) and standard deviation using the running variance method. 

Once the randomisations have completed, the z-score for each index value for each group can be calculated in the usual way as (observed - expected / sqrt (variance)). Interpretation is the same as any other z-score, with values outside the interval [-1.96,1.96] being significant for two tailed test with an alpha of 0.05, providing the number of samples is large. Note that the strictly correct value depends on the degrees of freedom, which depends on the number of samples. But why would you run a small number of iterations? 

The z-scores are calculated automatically whenever a randomisation is run.  The results are collated for each group and are accessible through the ">>z_scores>>" lists.  In this case Rand1 is the name of the randomisation, hence the list name is Rand1>>z_scores>>SPATIAL_RESULTS. 



If you want to add z-scores to an existing randomisation then you will need to run at least one extra randomisation iteration to trigger their calculation. If there is demand to do so without the extra iteration then we can add an option for it.  (Consider the relative numeric difference between 999 and 1000 iterations, without worrying about aesthetics such as the number of observed plus random realisations being a multiple of 10).  


Shawn Laffan, 12-Nov-2020



For more details about Biodiverse, see http://shawnlaffan.github.io/biodiverse/ 

For the full list of changes in the 3.99 development series, leading to version 4, see https://github.com/shawnlaffan/biodiverse/wiki/ReleaseNotes#version-399-dev-series 

To see what else Biodiverse has been used for, see https://github.com/shawnlaffan/biodiverse/wiki/PublicationsList 

You can also join the Biodiverse-users mailing list at http://groups.google.com/group/Biodiverse-users

No comments:

Post a comment

Note: only a member of this blog may post a comment.