Twelve data sets were studied. [I] A data set on Mushrooms, the property of the Swiss Federal Institute for Forest, Snow and Landscape Research WSL, managed by Simon Egli. Data sets on [II] Fish and [III] Crustaceans, the property of Pisces Conservation Ltd., managed by Peter A. Henderson (see also). Fish and Crustaceans were enumerated from the same physical samples. We also studied the ‘whole’ samples: [IV] Fish+Crustaceans. We consider this an integration of two (sub)assemblages into a new assemblage (see our Supplement A). [V] A data set on tropical rainforest Trees from the Smithsonian Tropical Research Institute’s Center for Tropical Forest Science, managed by Condit et al. (see also). We also used data sets on four different desert assemblages, [VI] Rodents, [VII] Winter annuals, [VIII] Summer annuals and [IX] Ant colonies, in the Chihuahuan desert near Portal, Arizona. [X] A data set on weed Seedlings managed by the Centre for Ecology and Hydrology. [XI] A data set on Brachiopod fossils obtained from Thomas D. Olszewski, who re-enumerated material that had been deposited at the National Museum of Natural History, Washington DC. The material was sampled from Permian deposits spanning a period of approx. 10 Myr in a mountain range of approx. 40 km. The set of 187 samples was originally presented as four composite assemblages representing four geological formations; for our purposes, we treat the data as a single composite set. [XII] A data set on cow patty Flies. Characteristics of the data sets are given in the supplemental table. All sets except IV and XI were collected and studied previously for a characteristic of SADs as histograms, with data binned into frequency classes. Some additional information on the data sets can be found there.
Most sets have samples that were collected in different years (Mushrooms, Fish, Crustaceans, Rodents, Winter and Summer Annuals, Ants, Flies). Within years, sampling was done in different weeks (Mushrooms), in different months (Fish, Crustaceans), or at different locations (Rodents, Winter and Summer Annuals, Ants, Flies). Thus, samples can be assigned to subsets. In the terminology of set theory, the many samples are objects that form different subsets, which together form the set (Wikipedia headword ‘Set theory’). In the other sets (Trees, Seedlings, Brachiopods) a similar structure can be applied. Within each subset, and within the set as a whole, the samples can be merged by summing abundances per species, forming composite ‘samples’. We studied (i) samples, (ii) composite samples of subsets and (iii) composite samples of sets, representing three scales of integration. Total abundance and species richness values, n and S, of samples and of composite samples of subsets were rank-transformed. For each sample, the ranks of the two parameters were averaged, and the median of these averaged ranks was used to select ‘average’ samples, among the primary samples and among the composite samples of subsets, for the figure.
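The compositing and selection steps described above can be sketched as follows. This is a minimal illustration under assumptions, not the procedure as actually implemented: the representation of a sample as a species-to-abundance mapping, the function names, the use of mean ranks for ties, and the rule "pick the sample whose averaged rank lies closest to the median" are all our own choices for the sketch.

```python
from collections import Counter
from statistics import median

def composite(samples):
    """Merge samples by summing abundances per species."""
    total = Counter()
    for sample in samples:
        total.update(sample)  # adds counts species by species
    return dict(total)

def n_and_s(sample):
    """Total abundance n and species richness S of one sample."""
    counts = [c for c in sample.values() if c > 0]
    return sum(counts), len(counts)

def mean_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pick_average_sample(samples):
    """Index of the sample whose averaged rank over n and S is closest to the median."""
    ns = [n_and_s(s) for s in samples]
    rank_n = mean_ranks([n for n, _ in ns])
    rank_s = mean_ranks([s for _, s in ns])
    averaged = [(a + b) / 2 for a, b in zip(rank_n, rank_s)]
    target = median(averaged)
    return min(range(len(samples)), key=lambda i: abs(averaged[i] - target))

# Hypothetical toy data: three samples from one subset.
subset = [
    {"sp_a": 5, "sp_b": 1},
    {"sp_a": 50, "sp_b": 20, "sp_c": 3},
    {"sp_a": 10, "sp_b": 4, "sp_c": 1},
]
merged = composite(subset)            # composite 'sample' of the subset
chosen = pick_average_sample(subset)  # index of the 'average' sample
```

The same `composite` function can be applied again to the composite samples of all subsets to obtain the composite sample of the whole set, which corresponds to the third scale of integration.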
For the stretched exponential, we followed. The equation is y = (b+a×ln(x))^(1/c), with y for abundance and x for rank (rank 1 assigned to the highest abundance value). It has three parameters: a, b, and c. The function can be rewritten as y^c = b+a×ln(x). For a given c, this form is linear in ln(x), so it can be fitted by simple least squares. First, in an iterative process, the correlation between ln(x) and y^c is maximized by varying c, resulting in the best-fitting value for c. Subsequently, a linear regression of y^c on ln(x) is performed, resulting in fitted values for a and b. For what they call the intuitive interpretation of the three parameters a, b and c, we refer to.
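The two-step fit can be illustrated with a short sketch. It is written under assumptions that are not stated in the text: a plain grid search over c stands in for the iterative process, the squared correlation is maximized (because abundance decreases with rank, the correlation itself is negative), and the function names and the default grid of c values are hypothetical.

```python
import math

def pearson_r(u, v):
    """Pearson correlation of two equal-length sequences (needs non-constant input)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def fit_stretched_exponential(abundances, c_grid=None):
    """Fit y^c = b + a*ln(x); returns (a, b, c). Rank 1 is the highest abundance."""
    y = sorted(abundances, reverse=True)
    ln_x = [math.log(rank) for rank in range(1, len(y) + 1)]
    if c_grid is None:
        # Assumed search range for c: 0.01 to 2.00 in steps of 0.001.
        c_grid = [k / 1000 for k in range(10, 2001)]
    # Step 1: choose c so that ln(x) and y^c are as strongly correlated as possible.
    c = max(c_grid, key=lambda cc: pearson_r(ln_x, [v ** cc for v in y]) ** 2)
    # Step 2: ordinary least-squares regression of y^c on ln(x) gives a and b.
    y_c = [v ** c for v in y]
    n = len(y)
    mx, my = sum(ln_x) / n, sum(y_c) / n
    a = sum((u - mx) * (v - my) for u, v in zip(ln_x, y_c)) / sum(
        (u - mx) ** 2 for u in ln_x
    )
    b = my - a * mx
    return a, b, c

# Synthetic check: abundances generated from known parameters a=-2, b=10, c=0.5.
synthetic = [(10 - 2 * math.log(x)) ** 2 for x in range(1, 51)]
a_hat, b_hat, c_hat = fit_stretched_exponential(synthetic)
```

On the synthetic abundances, the fitted values should lie close to the generating parameters. Real count data will not follow the curve exactly, and a finer or wider grid for c may be needed.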