Virtual Beach version 3.0 (VB3.0) has been added to the CEAM site. Virtual Beach facilitates the development of statistical models of pathogen indicator levels at recreational beaches. VB3.0 reads input data from a text or Excel file, assists the user in preparing the data for statistical analysis, and provides three analytical techniques for model development: multiple linear regression (MLR), partial least squares regression (PLS), and a gradient boosting machine (GBM). Using an integrated mapping component to determine the geographic orientation of the beach, the software can automatically decompose wind and current speed and direction into along-shore and onshore/offshore components. VB3.0 can produce new variables from sets of variables in the input file (e.g., means, minimums, maximums, differences, sums, products), and it can test an array of transformations on the independent variables to maximize the linearity of their relationship with the response. In the MLR module, models with a high degree of multicollinearity are automatically excluded during the selection process. The PLS and GBM modules use 5-fold cross-validation during model development to avoid overfitting. The prediction module of VB3.0 links directly to the USGS EnDDaT system to automatically retrieve data for beach sites in the Great Lakes region.
Keywords: built environment, ecosystem health, land practices, water, beaches, loading, mapping
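
The decomposition of wind or current speed and direction into along-shore and onshore/offshore components mentioned above can be illustrated with a short sketch. This is not VB3.0's own code. The function name decompose_wind, the parameter seaward_normal_deg (the compass bearing pointing from the shore out over the water), the use of the meteorological "direction the wind blows from" convention, and the sign conventions are all assumptions made for illustration; VB3.0 may define the beach orientation and the signs differently.

```python
import math

def decompose_wind(speed, wind_from_deg, seaward_normal_deg):
    """Split a wind (or current) vector into along-shore and offshore parts.

    Assumed conventions (hypothetical, not taken from VB3.0):
      - wind_from_deg: compass bearing the wind blows FROM, degrees clockwise from north
      - seaward_normal_deg: compass bearing pointing from the shore toward the water
      - offshore > 0 means flow from land toward water; offshore < 0 means onshore
      - alongshore > 0 means flow toward the bearing 90 degrees clockwise of the seaward normal
    """
    # Convert "blowing from" to "blowing toward".
    wind_to_deg = (wind_from_deg + 180.0) % 360.0
    # Angle between the flow direction and the seaward normal.
    theta = math.radians(wind_to_deg - seaward_normal_deg)
    offshore = speed * math.cos(theta)
    alongshore = speed * math.sin(theta)
    return alongshore, offshore

# Example: water lies due north of the beach (seaward normal = 0 degrees).
# A 10 m/s wind from the south blows straight out over the water.
alongshore, offshore = decompose_wind(10.0, 180.0, 0.0)
```

With these conventions, a wind blowing from the water toward the land yields a negative offshore value (an onshore wind), and a wind parallel to the shoreline yields an offshore value near zero.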