Biodiverse analysis software: 2015

Copy selected labels to the clipboard

This is just a short post.

As of Biodiverse version 1.0_001, the View Labels tab allows users to copy the selected set of labels to the clipboard. Beyond the usual uses of a clipboard, one can paste the selected set into the new randomisation option that holds some of the labels constant.

The selection menu now has an option to copy the selected set to the clipboard.

Shawn Laffan, 23-Jun-2015

For more details about Biodiverse, see http://purl.org/biodiverse

For the full list of changes in the 1.0 series see https://purl.org/biodiverse/wiki/ReleaseNotes#version-101

To see what else Biodiverse has been used for, see https://purl.org/biodiverse/wiki/PublicationsList

You can also join the Biodiverse-users mailing list at http://groups.google.com/group/Biodiverse-users

Better control of randomisations

Randomisations are used in Biodiverse to assess whether the analysis results are more extreme than expected given some null model. Below is a short summary of the existing randomisations, then a description of some of the new functionality in version 1.0_001 to allow greater control by the user.
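As a rough illustration of the idea (this is not Biodiverse's own code), the comparison against a null model can be sketched as counting how often the observed value is greater than the values obtained from the randomised basedatas. The function name and the example numbers are hypothetical.

```python
def proportion_exceeded(observed, randomised):
    """Fraction of randomised values that the observed value is strictly greater than."""
    if not randomised:
        raise ValueError("need at least one randomised value")
    higher = sum(1 for value in randomised if observed > value)
    return higher / len(randomised)


# Hypothetical index values from five randomisation iterations.
null_values = [0.10, 0.25, 0.15, 0.30, 0.20]
print(proportion_exceeded(0.28, null_values))
```

A proportion close to 1 or 0 suggests the observed value sits in a tail of the null distribution.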

Currently there are three null models that can be used, with more in the planning stages.

In the rand_nochange case, the groups and labels are held constant. This is not a randomisation so far as the basedata is concerned, but it is useful when one wants to randomise the label and/or group properties, or the trees, while holding everything else constant. This allows one to disentangle the effects of the spatial data from those of the trees or properties.
The rand_csr_by_group option takes the contents of each group and randomly assigns them to some other group. If one is working with biotic data then it is analogous to a case where the set of assemblages across the observed groups is held constant, but these assemblages are located randomly.
The rand_structured randomisation allocates labels randomly across the landscape, but keeps the label (e.g. species) richness of each randomly generated group constant within some specified tolerances. If the richness_multiplier and richness_addition parameters are set to 1 and 0 respectively, then the randomised basedatas will have exactly the same richness patterns as the observed. This is the randomisation used in the CANAPE analyses.
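To make the second and third models more concrete, here is a minimal Python sketch. It is not how Biodiverse implements them. The dictionary layout (group name mapped to a set of labels) is a convenience for the example, and the richness formula is an assumption inferred from the statement that a multiplier of 1 and an addition of 0 reproduce the observed richness.

```python
import random


def rand_csr_by_group(basedata, seed=None):
    """Return a new mapping with whole assemblages reassigned to random groups.

    basedata maps a group name to the set of labels found in it.
    A plain shuffle is used, so a group can occasionally keep its own assemblage.
    """
    rng = random.Random(seed)
    groups = sorted(basedata)
    assemblages = [basedata[g] for g in groups]
    rng.shuffle(assemblages)
    return dict(zip(groups, assemblages))


def target_richness(observed_richness, richness_multiplier=1, richness_addition=0):
    """Assumed richness limit for a randomised group (hypothetical formula)."""
    return observed_richness * richness_multiplier + richness_addition


observed = {"g1": {"a", "b"}, "g2": {"c"}, "g3": {"a", "c", "d"}}
print(rand_csr_by_group(observed, seed=1))
print(target_richness(3))
```

The set of assemblages is unchanged by the shuffle; only their locations differ.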

The new functionality, available in version 1.0_001, allows the user to control which subsets of the basedata are randomised.

Keep some locations constant

In the first case, one can hold a subset of the groups (locations) constant across the randomisations. That is, the groups in that subset retain their observed labels, while the labels in the remaining groups are randomly re-assigned using whichever randomisation function is chosen.

Randomise within regions

In the second case, one can specify a spatial condition which forces the randomisations to be applied within subsets. For example, if one specifies a shapefile condition such as sp_point_in_same_poly() then the labels within each polygon will be randomised such that they stay within that polygon. This allows one to, for example, keep any labels (e.g. taxa) within the biome within which they are found while still randomly locating them. If taxa span biomes then the sets within each biome are randomised independently.
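The effect of randomising within regions can be sketched as follows. This is an illustration only: it uses a simple shuffle of whole assemblages as a stand-in for whichever randomisation function is chosen, and the `region_of` lookup (group name to region name) is a hypothetical substitute for a spatial condition such as sp_point_in_same_poly().

```python
import random
from collections import defaultdict


def randomise_within_regions(basedata, region_of, seed=None):
    """Shuffle assemblages among groups, but only between groups in the same region."""
    rng = random.Random(seed)

    # Collect the groups that belong to each region.
    members = defaultdict(list)
    for group in sorted(basedata):
        members[region_of[group]].append(group)

    # Randomise each region independently of the others.
    randomised = {}
    for region in sorted(members):
        groups = members[region]
        assemblages = [basedata[g] for g in groups]
        rng.shuffle(assemblages)
        randomised.update(zip(groups, assemblages))
    return randomised


basedata = {"g1": {"a"}, "g2": {"b"}, "g3": {"c"}, "g4": {"d"}}
region_of = {"g1": "desert", "g2": "desert", "g3": "forest", "g4": "forest"}
print(randomise_within_regions(basedata, region_of, seed=42))
```

Labels that start in the "desert" groups can only end up in "desert" groups, and likewise for "forest".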

This process is best applied using a spatial condition which is non-overlapping, e.g. sp_block or sp_point_in_same_poly(). If one uses a condition such as sp_circle, in which groups fall in multiple neighbour sets, then the system simply assigns each group to the first neighbour set in which it is found (the system processes groups in alphabetical sort order). This is probably not what is wanted in most cases.
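The consequence of overlapping neighbour sets can be seen in a small sketch. It assumes, for illustration, that neighbour sets are stored as a mapping from a centre group to the groups in its neighbourhood, and that "alphabetical" means plain string sorting; neither detail is taken from the Biodiverse code.

```python
def assign_to_first_set(neighbour_sets):
    """Give each group to the first neighbour set that contains it.

    neighbour_sets maps a centre group to the groups within its neighbourhood.
    Centre groups are visited in sorted order.
    """
    owner = {}
    for centre in sorted(neighbour_sets):
        for group in neighbour_sets[centre]:
            # Keep the first assignment; later overlapping sets are ignored.
            owner.setdefault(group, centre)
    return owner


# Three overlapping circular neighbourhoods along a transect.
circles = {"A": ["A", "B"], "B": ["A", "B", "C"], "C": ["B", "C"]}
print(assign_to_first_set(circles))
```

Here group B ends up in the set centred on A rather than its own, and the set centred on C is left empty, which shows why a non-overlapping condition is usually the better choice.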

Keep some labels constant

Finally, one can hold one or more of the labels constant. An example use for this is if you have a large tree and want to keep one clade's distribution constant while randomising everything else.

These are specified as lists, with one label per line. If you have a large number of labels, or simply want to avoid typing species names (a good thing to avoid if you have species ending in aea and/or eae), then you can copy and paste from one of the popups that appear when you control-click on a cell. You can also copy the selected set from the View Labels tab (also new in 1.0_001).
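If you prepare such a list outside Biodiverse, a short script can tidy it first. The sketch below is a hypothetical helper, not part of Biodiverse; trimming whitespace and dropping blank lines and repeats are assumptions about what makes a clean list.

```python
def parse_label_list(text):
    """Split pasted text into labels, one per line, skipping blanks and repeats."""
    labels = []
    for line in text.splitlines():
        label = line.strip()
        if label and label not in labels:
            labels.append(label)
    return labels


pasted = "Genus:sp11\nGenus:sp12\n\n  Genus:sp13  \nGenus:sp11\n"
print("\n".join(parse_label_list(pasted)))
```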

In the example above, labels (species) will be randomised within the biomes in which their group is found. Only those groups (cells) whose y-coordinate is less than 1,650,000 will be randomised. Anything greater than (north of) that will be held constant. Finally, Genus:sp11, Genus:sp12 and Genus:sp13 will also not be randomised. All other labels will be. This example uses the rand_structured randomisation, so the label ranges and group richness scores will be exactly the same across all randomisation iterations.
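The selection rules in this example can be sketched as a partition of the occurrences into those that stay fixed and those that are free to be randomised. This is an illustration under stated assumptions rather than Biodiverse's implementation: the data layout, the function name and the cell names are hypothetical, and a group exactly on the 1,650,000 line is treated as constant.

```python
def split_for_randomisation(basedata, coords, constant_labels, y_limit=1_650_000):
    """Separate occurrences that stay fixed from those free to be randomised.

    basedata maps group -> set of labels; coords maps group -> (x, y).
    """
    fixed, free = {}, {}
    for group, labels in basedata.items():
        if coords[group][1] < y_limit:
            # South of the limit: only the listed labels are held constant.
            fixed[group] = labels & constant_labels
            free[group] = labels - constant_labels
        else:
            # On or north of the limit: the whole group is held constant.
            fixed[group] = set(labels)
            free[group] = set()
    return fixed, free


basedata = {
    "cell_south": {"Genus:sp1", "Genus:sp11"},
    "cell_north": {"Genus:sp2", "Genus:sp12"},
}
coords = {"cell_south": (500_000, 1_600_000), "cell_north": (500_000, 1_700_000)}
constant = {"Genus:sp11", "Genus:sp12", "Genus:sp13"}
fixed, free = split_for_randomisation(basedata, coords, constant)
print(fixed)
print(free)
```

Only the occurrences in `free` would then be passed to the chosen randomisation, within each biome separately.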

Shawn Laffan, 23-Jun-2015

Import your species data from spreadsheets

Biodiverse has always imported data from delimited text files, for example using the Comma Separated Values (CSV) format. Support for rasters and shapefiles (point data only) was added in version 1. However, data are commonly collated and sent around using spreadsheets.

If one has data in a spreadsheet then, in version 1 of Biodiverse and earlier, one has to export the data from the spreadsheet to CSV format. This rapidly becomes annoying when one is updating the spreadsheet, as one repeatedly needs to export the data. One thing we try to avoid in developing Biodiverse is annoyance.

As of version 1.0_001, you can import spreadsheet data. This uses the same process as for text files, except for an additional selection of the data sheet to use within the workbook. A couple of screenshots are below to illustrate the process. Anyone familiar with importing data into Biodiverse will note how similar it is to existing processes.

The formats supported are Microsoft Excel (.xls and .xlsx formats) and LibreOffice (.ods).

As with text files, one can import multiple spreadsheets, and the same columns will be used from each. However, there is the limitation that the same worksheet selection will be used for all selected files. If you have your data across multiple spreadsheets, but the structure is not consistent, then you need to repeat the import process multiple times.

You can choose to import spreadsheet files on the first page of the data import process.

The selection options are the same as for text imports, except the rename and property options are not available. These can be added later using the BaseData menu in the GUI.

Select which sheet is to be used from the spreadsheet. If you have selected multiple spreadsheets then this selection will be applied to all of them.

The rest of the import options are the same as for text imports.

This is the same as the last step in the text import process.

This functionality is in the 1.0_001 development release, which is now available. Please give it a try and report any successes or issues. You can do this by commenting below, or by using the mailing list or the issue tracker.

One known problem is that the process takes a long time for large spreadsheets (300,000 rows). In such cases it is faster to save your data to CSV format and import using the Delimited Text format.

Shawn Laffan, 22-Jun-2015
