Wednesday, February 5, 2014

Raising A White Flag, paleotree v2.0 and the Reciprocal Monophyly of Various Hemichordate Groups

Hello all!

We have a lot to talk about today.

Last time, I bemoaned users who were misunderstanding my functions. Well, I have surrendered. timePaleoPhy is now happy to take taxon dates as min-max bounds. In the newest version of paleotree, version 2.0, you can find this functionality in the new argument dateTreatment for timePaleoPhy and other time-scaling functions.
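To make the min-max idea concrete, here is a toy sketch of one way to treat a pair of dates as bounds on a single point date. It is plain Python rather than R so it runs without paleotree, and it is not the package's code: the function name draw_point_dates, the taxon names and the ages are all made up for illustration, and the choice to draw uniformly between the bounds is an assumption of the sketch.

```python
import random

def draw_point_dates(bounds, rng=None):
    """Draw one point date per taxon between its (older, younger) age bounds.

    bounds: dict mapping taxon name -> (older_bound, younger_bound),
            both given as time before present.
    Assumption of this sketch: the date is drawn uniformly between the bounds.
    """
    rng = rng or random.Random()
    dates = {}
    for taxon, (older, younger) in bounds.items():
        if older < younger:
            raise ValueError(f"{taxon}: older bound must be >= younger bound")
        dates[taxon] = rng.uniform(younger, older)
    return dates

# Hypothetical taxa with made-up age bounds
bounds = {
    "taxonA": (105.0, 100.0),
    "taxonB": (98.0, 98.0),   # identical bounds: a precisely known date
    "taxonC": (110.0, 90.0),
}
dates = draw_point_dates(bounds, random.Random(1))
```

Each run with a different seed gives a different set of point dates, which is why this kind of treatment is usually repeated over many trees rather than trusted from a single draw.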

In fact, a lot has changed in paleotree lately. Last night, I uploaded paleotree 2.0 to CRAN, and it should be making its way to your favorite repository as a binary of your choice soon. This, and the previous update, paleotree 1.9, have changed a lot about paleotree!

All these changes are kinda fitting, given that paleotree is now version 2.0, and the turning over of a new version number usually signals a rapid change in the structure of a package. However, don't read too much into that; it's really just a happy accident. Some people have very extreme opinions about version numbering, and I was a little naive when I started programming. So, way back in January of 2012, I just decided to go with a simple system where every public release except the most minor would be given a new increment of 0.1, and the first public release would be 1.0. I'm sure some people think that's a hideous way to do version numbering, but whatever. It ain't your package, dudes.

As always, you can look at paleotree's CHANGELOG to see what's new, but here are the last two entries for your reading pleasure.


Version - 1.9 - 01-02-14
-Changed all functions to individual R script files rather than single master script. Why didn't I do this years ago??!

-All help files converted to an roxygen2 format within function scripts

-At suggestion of Fabricio Villalobos, added option to expandTaxonTree that allows branch lengths to be retained and the added lower-taxa are connected with zero-length branches. Only useful for very specific purposes, please use with caution.

-Changed all lines which checked for class "phylo" of input objects to use is(obj,"phylo") instead of class(obj)=="phylo" per recommendation of Carl Boettiger

-Added new warning line to taxicDivDisc so people who try to pass it matrices with character strings in the matrix will get more helpful warning messages

-Upgraded probAnc and qSProb2comp based on equations in Foote (1996) for all three modes of origination, also fixed some errors in previous versions

-Added new function pqr2Ps which uses Emily King's exact derivation for the joint probability of a clade (a) going extinct but being sampled on an infinite time scale and (b) never going extinct on an infinite time scale

-Removed internal Ps function from cal3; it is now pqr2Ps, which is exported directly to the namespace

-Converted to new way to handle likelihood functions in paleotree, moving to a function-as-an-object system like diversitree.

-Following on last point, added new functions for fitting models of duration frequency data, replacing getSampProbDisc and getSampRateCont

-More models to follow in future versions of paleotree: added new function footeValues which will (eventually) support a release of a function that implements Foote's (2001, 2003, 2005) inverse survivorship models

-Modified diversitree's 'constrain' function to make a paleotree version named constrainParPaleo which is both entirely separate and fulfills needs such as constraining many similarly named parameters to a single value

-See new functions listed in ?modelMethods for manipulating functions in the new model format

-Added some not-exported hidden functions for use by the various model-handling functions

-Added a 'terrible idea' function optimPaleo which simplifies using optim with new parameter bounds functions. This function is entirely for pedagogical reasons and may be removed later.

-Added new function horizonSampRate which uses ML estimator from Solow and Smith (1997) for estimating sampling rate from precise durations in continuous time and number of sampled horizons

-Added new function perCapitaRates which estimates the per-capita origination and extinction rates from discrete interval data, following the methods from Foote (2000)

Version - 2.0 - 02-03-14
-Changed parInit to use uniform distribution to randomly draw initial parameter values between bounds, rather than take mid value between bounds

-Changed how time-scaling functions dealt with node.mins argument; can no longer use node.mins with a dataset that has unshared taxa that are to be dropped

-Altered example for use of node.mins in help files for time-scaling functions; thanks John Clarke of Oxford for the heads up! Also other modifications were made to the help descriptions, clarifying that node.mins can be used to constrain the minimum age for the root node.

-Also on a different issue brought to my attention by John Clarke, added an error message to paleotree when 'equal' is attempted but the edges leading to the root are zero-length (because 'equal' cannot run under this situation!)

-Added new function collapseNodes that collapses specific user-defined nodes, either forward or backward

-Made all lines checking for dichotomous trees check both with is.binary.tree() *and* is.rooted()

-Added new function dateNodes which returns the dates of the internal and tip nodes of a phylogeny on an absolute time-scale, with respect to the $root.time element if one exists

-On a trial basis, I have added new function inverseSurv which attempts to replicate the forward and inverse survivorship modeling applied by Foote (2001, 2003, 2005) and is useable with the newly implemented constrainParPaleo framework implemented in the previous version of paleotree. I am not yet convinced this function is a 100% faithful replicate of the original method.

-fixed error in plotTraitgram where if trait data was entered in same order as tree$tip.label, trait data was not resorted prior to running ace

-'equal' method in timePaleoPhy wasn't returning same result as 'equal' method in DatePhylo from Graeme Lloyd's original code. This turned out to be a result of differences in how we ordered nodes: Graeme ordered them by time or distance-from-the-root (using dist.nodes) and I was using node.depth, which counts number of branching events. This choice shouldn't make a considerable difference on the performance of the algorithm, but does produce some differences in the resulting time-scaled trees. For consistency, I have changed timePaleoPhy to match Graeme's algorithm.

-altered timePaleoPhy and cal3TimePaleoPhy so that the first and second columns of timeData can be interpreted as bounds on a point date, rather than as first and last occurrences, using the argument dateTreatment="minMax"

-related to above change, the argument rand.obs was removed from timePaleoPhy and cal3TimePaleoPhy as no longer necessary, this functionality is now available via dateTreatment="randObs"

-Although it may be strange to report a lack of a change, I still have not added the finite time window approach for durationFreq


Hahah, and as you might guess from that last one, I've still got some more new things and changes for paleotree in store in the next few months. I included it here because I was actually partway through adding this option and had to undo those additions through commenting, since I wanted to push this version to CRAN (the undoing worked, I think, but I can't be certain, so best to add it to the CHANGELOG).
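A quick aside on the 'equal' entry in the 2.0 list above, since the node-ordering point is easier to see than to describe. The toy sketch below is plain Python, not the code in timePaleoPhy or DatePhylo, and the four-node tree and its branch lengths are made up. It orders the same nodes two ways: by summed branch length from the root, and by the number of edges from the root (which is how the entry describes the branching-event count).

```python
# parent -> list of (child, branch_length); a made-up tree, internal nodes only
children = {
    "root": [("n1", 5.0), ("n2", 1.0)],
    "n2": [("n3", 1.0)],
}

def node_measures(children, root="root"):
    """Return two dicts: distance from the root and edge count from the root."""
    dist = {root: 0.0}
    steps = {root: 0}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child, length in children.get(parent, []):
            dist[child] = dist[parent] + length   # summed branch length
            steps[child] = steps[parent] + 1      # number of edges crossed
            stack.append(child)
    return dist, steps

dist, steps = node_measures(children)

# Sort nodes by each measure, breaking ties by name
by_distance = sorted(dist, key=lambda n: (dist[n], n))
by_steps = sorted(steps, key=lambda n: (steps[n], n))

print(by_distance)  # ['root', 'n2', 'n3', 'n1']
print(by_steps)     # ['root', 'n1', 'n2', 'n3']
```

Node n1 sits on a long branch, so it comes last by distance but second by edge count. Any algorithm that visits nodes in order can therefore produce a slightly different tree depending on which ordering it uses.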

As always, let me know what you think of the new paleotree functionality!

So what else?

Well, hemichordate phylogeny has been shaken back and forth a little lately. You might have missed them, so here's a short list:

The Stach article in particular is covered by Cambrian Mammal's blog:

The Cannon et al. paper is really neat: with greater gene and taxon coverage than a few years ago, it looks like the different hemichordate groups really are reciprocally monophyletic and not nested, and maybe even the Rhabdopleura and Cephalodiscus groups are reciprocally monophyletic. Big implications for what the stem deuterostome looked like... and even bigger implications for the stem graptolite.... if you're into that sort of thing. ;)

Oh, and finally, I'm also a brachiopod worker now! I've begun a remote post-doc with Sandy Carlson at UC Davis. I'm looking forward to doing some neat things with her and the rest of her lab!