Monday 15 December 2014

Label selections in the View Labels tab

The View Labels tab in Biodiverse is where you can interactively visualise the distribution of your species (or other) data in geographic space, on a tree and against a matrix of pairwise values.  Clicking (selecting) map or matrix cells, tree branches or rows in the label list highlights the distribution of these labels across each of the other panels.

This has been in Biodiverse since its first release, and is really useful because you can easily identify outliers, gappy distributions, or simply gain an understanding of how your data are distributed. See here for an overview from an earlier version.

As part of the development work towards version 1.0, the View Labels tab has been enhanced with a number of selection-related features.  Some of these were already available in the 0.99_006 development release, while the rest will be in 0.99_007 (which, if there are no show-stopping bugs, will be the last of the 0.99 series before 1.0 is released).  This follows other enhancements to this tab, such as the export menu, and the pan and zoom tools.

The features are listed here, with details below. 
  1. Selections can now be added to and removed from, rather than being a new set every time. Selections can also be switched (inverted).
  2. Labels can be selected using text matching.
  3. Selected labels can be deleted.
  4. New basedata objects can be created from the selected set, or its complement.  
  5. Selected labels can be exported.
It is worth noting that these operations work on all groups (cells) containing the selected labels.  We don't yet have tools to work on selected labels across a subset of groups, although some combination of label selections and a definition query in the Run Exclusions dialogue could be used here (I need to blog about those updates separately). 

Selections can be added to or removed from, and switched

Previously in Biodiverse, selecting labels in any of the grid, tree or matrix panes would generate a new selection every time.  The only way to add or remove labels from the selection was to control-click on the rows in the label list at the top left. 

Now users can choose from three selection modes, "new", "add_to" and "remove_from".  These work exactly as named.  So, for example, one can select a clade in the tree, change the mode to remove_from, draw a box around a set of cells in the grid and any label in these cells will be removed from the selected set.

The switch selection option simply inverts the selection, so any selected records become unselected while any unselected ones become selected.  This is most useful when you need all but a small number of records to be selected, and it is easier to select the small number first.
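
The three modes and the switch option behave like ordinary set operations.  The Python below is only an illustrative sketch of that logic, not code from Biodiverse; the function names and the example labels are made up for the illustration.

```python
def update_selection(current, picked, mode="new"):
    """Return the selected label set after one selection action.

    Hypothetical helper: mirrors the "new", "add_to" and "remove_from"
    modes described in the text using plain set operations.
    """
    current, picked = set(current), set(picked)
    if mode == "new":
        return picked                 # replace the selection
    if mode == "add_to":
        return current | picked       # union with the existing selection
    if mode == "remove_from":
        return current - picked       # drop the picked labels
    raise ValueError(f"unknown selection mode: {mode!r}")


def switch_selection(current, all_labels):
    """Invert the selection relative to the full set of labels."""
    return set(all_labels) - set(current)


# Made-up labels for the illustration
all_labels = {"Genus:sp1", "Genus:sp2", "Genus:sp3", "Genus:sp4"}
clade = {"Genus:sp1", "Genus:sp2", "Genus:sp3"}  # e.g. a clade picked on the tree
boxed = {"Genus:sp2", "Genus:sp4"}               # e.g. labels in cells boxed on the grid

selected = update_selection(set(), clade, "new")
selected = update_selection(selected, boxed, "remove_from")
print(sorted(selected))                                # ['Genus:sp1', 'Genus:sp3']
print(sorted(switch_selection(selected, all_labels)))  # ['Genus:sp2', 'Genus:sp4']
```

Note that removing a label that is not currently selected (Genus:sp4 here) simply has no effect.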

(These options are another of those feature sets which are already in many other software packages, but it is good to provide a user interface many people are already used to).

Users can now choose from three selection modes, as well as switch the selected set

Selections can use text matching

Biodiverse now also supports the ability to select labels using text matching.  This can use part of the word, or the whole word.  It also uses regular expressions, so you can build matches that are as complex as you need.

As an example, say you have records for species in the genera Acacia, Daviesia and Gastrolobium in a data set, with the genus name included in each label, but you are only interested in the distribution of Gastrolobium.  All you need do is set "Gastrolobium" as the match to select all records containing that name.  You can then see where Gastrolobium records are distributed across the map, tree and matrix.  If needed, you can also then delete or export these records, or make a new basedata (see the next few sections below for details).

The interface allows you to override the current selection mode if need be, but it defaults to whatever the current mode is (new, add_to or remove_from).  Choosing a full match will select only labels that match the text exactly, negating the selection will select any label that does not match, while case insensitive matching will ignore the case (so "cac" will match all of "Cactus", "cacophony" and "ICAC").


The text selection above will select any label containing the text sp1, so for the example data distributed with Biodiverse this will select Genus:sp1, Genus:sp11, and so forth.
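
For readers who think in code, the matching options can be sketched as follows.  This is a rough Python illustration using the standard re module, not the Biodiverse implementation; the function name and its parameters are hypothetical stand-ins for the options in the dialogue.

```python
import re


def match_labels(labels, text, full_match=False, negate=False, ignore_case=False):
    """Return the labels selected by a text match.

    Hypothetical helper: `text` is treated as a regular expression.
    full_match requires the whole label to match, negate selects the
    labels that do NOT match, ignore_case ignores letter case.
    """
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(text, flags)
    matcher = pattern.fullmatch if full_match else pattern.search
    return {label for label in labels if bool(matcher(label)) != negate}


labels = ["Genus:sp1", "Genus:sp11", "Genus:sp2", "Genus:sp21"]

print(sorted(match_labels(labels, "sp1")))                         # ['Genus:sp1', 'Genus:sp11']
print(sorted(match_labels(labels, "Genus:sp1", full_match=True)))  # ['Genus:sp1']
print(sorted(match_labels(labels, "sp1", negate=True)))            # ['Genus:sp2', 'Genus:sp21']

words = ["Cactus", "cacophony", "ICAC"]
print(len(match_labels(words, "cac", ignore_case=True)))           # 3
```

The resulting set would then be combined with the current selection according to the chosen mode (new, add_to or remove_from).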

Selected labels can be deleted

This is a feature people have been requesting for some time, so it is good to finally get it into the system.

If you have a data set with a variety of labels in it then you can select everything you don't want to keep and then delete them.  Simple as that.

There are two deletion approaches.  The default is to also delete any groups which have no remaining labels after the label deletions are completed; this is consistent with the Run Exclusions dialogue (look under the Basedata menu).  The other approach is to keep these groups.  This provides a convenient way of generating empty groups, as one can import a dummy label to create the relevant groups, and then delete the label while retaining the groups.
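
The difference between the two approaches can be sketched in a few lines of Python.  This is an illustration only, under the assumption that a Basedata can be thought of as a mapping from groups to the set of labels they contain; the function name, the flag name and the group names are invented for the example.

```python
def delete_labels(groups, doomed, delete_empty_groups=True):
    """Delete labels from every group that contains them.

    groups: dict mapping a group name to the set of labels it contains
    (a simplified, hypothetical stand-in for a Basedata).
    Returns (remaining_groups, deleted_group_names).
    """
    doomed = set(doomed)
    remaining, deleted_groups = {}, []
    for group, labels in groups.items():
        kept = set(labels) - doomed
        if kept or not delete_empty_groups:
            remaining[group] = kept        # group survives, possibly empty
        else:
            deleted_groups.append(group)   # no labels left, so drop the group
    return remaining, deleted_groups


# Made-up groups: one holds only a dummy label, the other holds real data too
groups = {"1.5:2.5": {"dummy"}, "2.5:2.5": {"dummy", "Genus:sp1"}}

remaining, deleted = delete_labels(groups, {"dummy"})
print(remaining)   # {'2.5:2.5': {'Genus:sp1'}}
print(deleted)     # ['1.5:2.5']

remaining, deleted = delete_labels(groups, {"dummy"}, delete_empty_groups=False)
print(remaining)   # {'1.5:2.5': set(), '2.5:2.5': {'Genus:sp1'}}
print(deleted)     # []
```

The second call corresponds to the dummy label trick described above: the label is gone, but the otherwise empty group is retained.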

Any deleted groups are plotted in light grey to show where they were.  If groups are not deleted then they are not plotted in grey, as they are still part of the Basedata.

The key point to be aware of is that there is currently no undo support, so be careful when you do this.  While you do get a warning message allowing you time to change your mind, it is probably worth working on a copy of your Basedata just to be on the safe side.

The other point is that deletions will not be applied to Basedatas which contain analysis outputs, e.g. Spatial, Cluster or RegionGrower analyses.  The system will throw an error if you try.  At the time of writing it waits until you try to delete the labels before complaining, but future versions might simply make the menu option insensitive (non-clickable).  The reason we don't at the moment is that we need to track additions and deletions of outputs in a Basedata when a View Labels tab is open to make it work smoothly.

One can delete selected labels.  In this case, all labels in Tasmania have been selected.
Groups deleted because all their labels have been deleted are plotted in grey.  This helps keep track of where the deletions have occurred.    In this example Tasmania is now plotted as grey, but so are several other groups which contained only labels found in Tasmania.


New Basedata objects can be created from the selected labels

Sometimes you don't want to delete any labels, as you still need them for analyses.  In this case you can create a new Basedata object from the selected labels.

There is not much to say about this one.  All it does is create a new Basedata object where only the selected labels are used.  The groups in the new basedata will be only those which contain the selected labels by default, but there is the option to retain all groups.  

There is also the option to use only the non-selected labels.  You could achieve the same result by switching the selection before exporting selected records, but this way you can avoid a few button clicks.
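
The options combine as in the following Python sketch.  As before, this is only an illustration that treats a Basedata as a mapping from groups to label sets; the function name, parameter names and cell names are assumptions made for the example, not Biodiverse identifiers.

```python
def subset_basedata(groups, selected, use_complement=False, keep_all_groups=False):
    """Build a new group-to-labels mapping from a label selection.

    use_complement: keep the non-selected labels instead of the selected ones.
    keep_all_groups: retain groups even if they end up with no labels.
    The input mapping is left unchanged.
    """
    selected = set(selected)
    new_groups = {}
    for group, labels in groups.items():
        labels = set(labels)
        kept = labels - selected if use_complement else labels & selected
        if kept or keep_all_groups:
            new_groups[group] = kept
    return new_groups


# Made-up data for the illustration
groups = {
    "cell_1": {"Acacia:a", "Gastrolobium:b"},
    "cell_2": {"Acacia:a"},
    "cell_3": {"Gastrolobium:b", "Daviesia:c"},
}
selected = {"Gastrolobium:b"}

print(sorted(subset_basedata(groups, selected)))                        # ['cell_1', 'cell_3']
print(sorted(subset_basedata(groups, selected, keep_all_groups=True)))  # ['cell_1', 'cell_2', 'cell_3']
print(subset_basedata(groups, selected, use_complement=True)["cell_3"]) # {'Daviesia:c'}
```

Using the complement gives the same labels as switching the selection first and then subsetting on the selected set.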



Selected labels can be exported

In the same way that labels and groups can be exported using the Export menu, the selected labels and the groups which contain them can be directly exported using any of the supported formats.

This is another case of saving button clicks, as one could otherwise create a new Basedata and then export that, but that would become irritating if one needed to do it frequently.  (This is actually what the system does in the background, as it creates a temporary Basedata object, exports it, and then discards it.  Consequently it might not work well for very large Basedatas if system memory is in short supply.)

All the usual export options are available, but they will apply only to the selected set. 

Summary

To sum up, these additions represent a very useful set of features which allow the user finer control over the labels that are selected, visualised, and now exported or cleaned up.

Please give them a try and report any successes or issues.  You can use the comments below, the mailing list or the issue tracker.



Shawn Laffan, 15-Dec-2014




For more details about Biodiverse, see http://purl.org/biodiverse

For the full list of changes in the 0.99 series (leading to version 1) see https://purl.org/biodiverse/wiki/ReleaseNotes#version-099

To see what else Biodiverse has been used for, see https://purl.org/biodiverse/wiki/PublicationsList

You can also join the Biodiverse-users mailing list at http://groups.google.com/group/Biodiverse-users