This paper has an algorithm that alleviates the computational burden of evaluating summations involving thousands or millions of terms, each of which is statistically variable. It is a simple binning strategy that replaces the large (thousand or million-member) population of terms with a much smaller representative (~10 member) weighted population. This binning method typically gives ~500x computational efficiency boost.

Suppose that the the summation is an average of a sample of many points, the goal is to RETAIN (not minimize) the deviation of a sample’s mean and standard deviation (or other important sample property) in comparison to the same properties of the source distribution. The above animation shows an example of an exponential source distribution in gray, along with a finite discrete sample from it in black. The colored plots are the CDFs of the binned representation, shown in red if the number of bins turns out to be larger than the original population (in which case, the original population is small enough to use directly) or in green if the number of bins is smaller than the original sample size (making binning desireable).

The image above shows the effect of binning a small population, giving both red and green results to show that binning is not particularly attractive for small populations. The key observation in that plot is the fact that the binned representation “tracks with” the location of the black sample CDF. The tag “zooming” indicates that the binning strategy was set to capture the upper part of the CDF with greater accuracy than the lower part:

Below is the same plot for the case of a sample being 4000 points, where it is clear that the black sample is closer to the gray source curve, but still visibly different from it. In this case, the binned representation is consistently green (indicating that the number of bins is smaller than the sample). As seen, the green binned representation again nicely retains the “bouncing” (finite sample error) of the sample, which is important in applications of unstable systems (where the nature of the perturbation controls the nature of the unstable response). Think of the following plot as a mild magnitude-3 earthquake (producing barely noticeable vibrations like bumps when riding in an automobile), while the previous image (above) is like a magnitude-9 quake. The resulting damage induced in unstable systems is, of course, very different. That is why it is so important for a system sample approximation to faithfully retain the original sample’s finite-sampling errors!

Below is the abstract from this paper.

ABSTRACT: A computational scheme is developed for sampling-based evaluation of a function whose inputs are statistically variable. After a general abstract framework is developed, it is applied to initialize and evolve the size and orientation of cracks within a finite domain, such as a finite element or similar subdomain. The finite element is presumed to be too large to explicitly track each of the potentially thousands (or even millions) of individual cracks in the domain. Accordingly, a novel binning scheme is developed that maps the crack data to nodes on a reference grid in probability space. The scheme, which is clearly generalizable to applications involving arbitrary numbers of random variables, is illustrated in the scope of planar deformations of a brittle material containing straight cracks. Assuming two random variables describe each crack, the cracks are assigned uniformly random orientations and non-uniformly random sizes. Their data are mapped to a computationally tractable number of nodes on a grid laid out in the unit square of probability space so that Gauss points on the grid may be used to define an equivalent subpopulation of the cracks. This significantly reduces the computational cost of evaluating ensemble effects of large evolving populations of random variables.

Citation data:

Huq, F., L. Graham-Brady, and R. Brannon (2015) An Efficient Binning Scheme with Application to Statistical Crack Mechanics, Int. J. Num. Meth. Engr. Accepted with Minor Revisions April, 2015.

DOI: 10.1002/nme.4959

Pingback: Undergraduate researcher applies binning to study large groups of statistically variable buckling columns | University of Utah CSM Group