Spatial Stream Network Analysis of Skeena Watershed Stream Temperatures 2025
The suggested citation for this analytic appendix is:
Hill, N.E., Thorley, J.L., & Irvine, A. (2025) Spatial Stream Network Analysis of Skeena Watershed Stream Temperatures 2025. A Poisson Consulting Analytic Appendix. URL: https://www.poissonconsulting.ca/f/1130667589.
Background
The primary goal of the current analyses is to answer the following questions:
How can we model stream temperature to include spatial correlation through a stream network, and predict growing season degree days?
Data Preparation
Hourly water temperature data collected in the Skeena watershed in northern British Columbia between 2015 and 2025 were downloaded from water-temp-bc, which stores a combination of historic and recent real-time hydrometric data from Environment and Climate Change Canada.
Hourly air temperature data (at two metres above ground level) for the Skeena watershed for the years 2015-2025 were downloaded from the ERA-5-Land simulation using the Copernicus Climate Change Service (C3S) Climate Data Store as NetCDF files (Muñoz Sabater 2019).
The data were extracted, cleaned (i.e., checked for errors and corrected if possible), and tidied (i.e., manipulated into a consistent format) using using R version 4.5.1 (R Core Team 2022).
Key assumptions of the data preparation included:
- For each site, the time series of stream temperature values were classified as “unreasonable” if they were \(<\) 0˚C or \(>\) 30˚C, changed more than 2˚C from one hour to the next, were adjacent to another unreasonable value, or within a 5-hour gap between two unreasonable value; these were excluded from the analysis.
- The simulated air temperature from the modeled simulation is a reasonable approximation of the truth.
- Discharge data were not included in this analysis because Pacific Climate Impacts Consortium’s Gridded Hydrologic Model Output does not currently have output for the Skeena watershed.
- Seven sites with few observed data points proved insufficient to capture the annual-scale fluctuations in stream temperature; these sites (station numbers 08EC014, 08ED004, 08EE008, 08EE025, 08EF001, 08EG012, and 08EG018) were excluded from the analysis.
Statistical Analysis
Model parameters were estimated using Bayesian methods. Where sufficient information was available, causal parameters were estimated accounting for biases due to correlation by embedding the assumed causal pathways into the model in the form of a causal network (Arif and MacNeil 2023) which is represented visually as a directed acyclic graph (DAG). The estimates were produced using Stan (Carpenter et al. 2017). For additional information on Bayesian estimation the reader is referred to McElreath (2020).
Unless stated otherwise, the Bayesian analyses used weakly informative prior distributions (Gelman et al. 2017). The posterior distributions were estimated from 1500 Markov Chain Monte Carlo (MCMC) samples thinned from the second halves of 3 chains (Kery and Schaub 2011, 38–40). Model convergence was confirmed by ensuring that the potential scale reduction factor \(\hat{R} \leq 1.05\) (Kery and Schaub 2011, 40) and the effective sample size (Brooks et al. 2011) \(\textrm{ESS} \geq 150\) for each of the monitored parameters (Kery and Schaub 2011, 61).
Model adequacy was assessed via posterior predictive checks (Kery and Schaub 2011). More specifically, the proportion of zeros in the data and the first four central moments (mean, variance, skewness, and kurtosis) of the deviance residuals were compared to the expected values by simulating new data based on the posterior distribution and assumed sampling distribution and calculating the deviance residuals.
The sensitivity of the posteriors to the choice of prior distributions was evaluated by separately power-scaling the priors and likelihood, estimating the properties of the perturbed posteriors using Pareto smoothed importance sampling (Vehtari et al. 2017), and evaluating the distance between the base and perturbed posteriors using the cumulative Jensen-Shannon (CJS) distance (Kallioinen et al. 2023). A threshold CJS distance of 0.05 was used to indicate sensitivity; a prior sensitivity above this value suggests a strong prior, and a likelihood sensitivity below this value suggests that the likelihood is non-informative (Kallioinen et al. 2023). For hierarchical effects, only the top-level parameter was power-scaled to avoid perturbing the priors multiple times (Kallioinen et al. 2023).
The parameters are summarized in terms of the point estimate, lower and upper 95% compatibility limits (Rafi and Greenland 2020), and the surprisal s-value (Greenland 2019). Together a pair of lower and upper compatibility limits (CLs) are referred to as a compatibility interval (CI). The estimate is the median (50th percentile) of the MCMC samples while the 95% CLs are the 2.5th and 97.5th percentiles. The s-value indicates how surprising it would be to discover that the true value of the parameter is in the opposite direction to the estimate (Greenland 2019). An s-value of \(>\) 4.32 bits, which is equivalent to a p-value \(<\) 0.05 (Kery and Schaub 2011; Greenland and Poole 2013), indicates that the surprise would be equivalent to throwing at least 4.3 heads in a row on a fair coin.
Variable selection was based on the heuristic of directional certainty (Kery and Schaub 2011; Murtaugh 2014; Castilho and Prado 2021). Fixed effects were included if their s-value was \(>\) 4.32 bits (Kery and Schaub 2011). Based on a similar argument, random effects were included if their standard deviation had a lower 95% CL \(>\) 5% of the median estimate.
The results are displayed graphically by plotting the modeled relationships between individual variables and the response with the remaining variables held constant. In general, continuous and discrete fixed variables are held constant at their arithmetic mean and first level values, respectively, while random effects are held constant at their typical value (Kery and Schaub 2011, 77–82). Unless stated otherwise the typical value is the arithmetic mean. When informative the influence of a particular variable is expressed in terms of the effect size (i.e., relative change in the response variable) with the 95% CI (Bradford et al. 2005).
The analyses were implemented using R version 4.5.1
(R Core Team 2022) and the
embr family of packages.
Model Descriptions
Stream Temperature
The data were analyzed using a Spatial Stream Network model
(Ver Hoef and Peterson 2010; Peterson and Hoef 2010), with code adapted
from the SSNbayes
package (Santos-Fernandez et al. 2022). The necessary stream network
distances and connectivity were calculated using the BC Freshwater Atlas
and the SSNbler
(Peterson et al. 2024) and SSN2
(Dumelle et al. 2024) R packages. Air and stream temperature data were
averaged by site and week; modeling was done on this weekly time scale.
The expected stream temperatures were modeled using the 3-parameter version of the air2stream model (Toffolon and Piccolroaz 2015). The average stream temperature (in ˚C) in the first week, \(W_{s,j=1}\) was estimated by the model and assumed to be the same for all sites.
For all subsequent weeks (i.e., \(j > 1\)), the change in the stream temperature (in ˚C) between week \(j - 1\) and week \(j\) for the \(s^{th}\) site, \(\Delta W_{s,j}\), was modeled as follows:
\[\begin{equation} \Delta W_{s,j} = a1_s + a2_s A_{s,j} - a3_s W_{s,j - 1}, \end{equation}\]
where \(a1_s\), \(a2_s\), and \(a3_s\) are the parameters of the air2stream model for the \(s^{th}\) site, \(A_{s,j}\) is the air temperature (in ˚C) for the \(s^{th}\) site in the \(j^{th}\) week and \(W_{s, j - 1}\) is the expected stream temperature at the \(s^{th}\) site in the previous week.
The expected stream temperature for the \(s^{th}\) site in the \(j^{th}\) week was then calculated:
\[\begin{equation} W_{s,j} = W_{s,j - 1} + \Delta W_{s,j}. \end{equation}\]
Growing Season Degree Days (GSDD) are the accumulated thermal units (in ˚C) during the growing season based on the mean daily water temperature values, which is a useful predictor of age-0 rainbow and westslope cutthroat trout size at the beginning of winter. The start and end of the growing season were based on the definitions of Coleman and Fausch (2007):
- Start: the beginning of the first week that average stream temperatures exceeded and remained above 5˚C for the season.
- End: the last day of the first week that average stream temperature dropped below 4˚C.
GSDD were derived for each site and year by assuming that the daily stream temperatures at each site were the predicted weekly mean stream temperature for every day in the given week.
Key assumptions of the model include:
- The stream network is dendritic, not braided.
- The expected stream temperatures were set to 0˚C if they were estimated to be negative by the model.
- The stream temperature in the first week is the same for all sites.
- The parameters of the air2stream model (\(a1\), \(a2\), and \(a3\)) vary randomly by site.
- The residual variation is multivariate normally distributed.
- The covariance structure of the residual variation combines the
following covariance components:
- Nugget (allows for variation at a single location)
- Exponential tail-down (allows for spatial dependence between flow-connected and flow-unconnected locations)
Preliminary analysis found that:
- Allowing the initial stream temperature to vary randomly by site produced unrealistic stream temperatures for January (> 10˚C).
Model Templates
Stream Temperature
data {
int nsite;
int nweek;
int <lower=0> N_y_obs; // number observed values
int <lower=0> N_y_mis; // number missing values
array[N_y_obs] int<lower=1> i_y_obs;
array[N_y_mis] int<lower=1> i_y_mis;
vector [N_y_obs] y_obs; // matrix[N_y_obs,1] y_obs[T];
array[nsite * nweek] real air_temp;
array[nsite * nweek] int<lower=0> site;
array[nsite * nweek] int<lower=0> week;
matrix [nsite, nsite] D;
matrix [nsite, nsite] I;
matrix [nsite, nsite] H;
matrix [nsite, nsite] flow_con_mat;
parameters {
vector<lower=0, upper=30>[N_y_mis] y_mis; // declaring the missing y
real<lower=0> sigma_nug; // sd of nugget effect
real<lower=0> sigma_td; // sd of tail-down
real<lower=0> alpha_td; // range of the tail-down model
real<lower=0> bInitialTemp;
real<lower=0> s1;
real<lower=0> s2;
real<lower=0> s3;
real<lower=-5, upper=15> m1;
real<lower=-5, upper=1.5> m2;
real<lower=-5, upper=5> m3;
array[nsite] real<lower=-5, upper=15> a1;
array[nsite] real<lower=-5, upper=1.5> a2;
array[nsite] real<lower=-5, upper=5> a3;
transformed parameters {
vector[nsite * nweek] y; // long vector of y
array[nweek] vector[nsite] Y;
matrix[nsite, nsite] C_td; // tail-down cov
real <lower=0> var_nug; // nugget
real <lower=0> var_td; // partial sill tail-down
vector[nsite * nweek] eTempDiff;
vector<lower=0, upper=30>[nsite * nweek] eTemp;
array[nweek] vector[nsite] mu;
y[i_y_obs] = y_obs;
y[i_y_mis] = y_mis;
var_nug = sigma_nug^2; // variance nugget
var_td = sigma_td^2; // variance tail-down
// Place observations into matrices
for (t in 1:nweek){
Y[t] = y[((t - 1) * nsite + 1):(t * nsite)];
}
eTemp[1:nsite] = rep_vector(bInitialTemp, nsite);
for (i in (nsite + 1):(nweek * nsite)) {
eTempDiff[i] = (a1[site[i]] + a2[site[i]] * air_temp[i] - a3[site[i]] * eTemp[i - nsite]);
eTemp[i] = eTemp[i - nsite] + eTempDiff[i];
if (eTemp[i] < 0) {
eTemp[i] = 0.0;
}
}
// Define 1st mu
mu[1] = eTemp[1:nsite];
// Define rest of mu; ----
for (t in 2:nweek){
mu[t] = eTemp[((t - 1) * nsite + 1):(t * nsite)];
}
// Covariance matrices ----
// Tail-down exponential model
for (i in 1:nsite) {
for (j in 1:nsite) {
if (flow_con_mat[i, j] == 1) { // if points are flow connected
C_td[i, j] = var_td * exp(-3 * H[i, j] / alpha_td);
}
else{ // if points are flow unconnected
C_td[i, j] = var_td * exp(-3 * (D[i, j] + D[j, i]) / alpha_td);
}
}
}
model {
sigma_nug ~ exponential(0.05); // sd nugget
sigma_td ~ exponential(2); // sd tail-down
alpha_td ~ normal(20000, 20000); // range tail-down
bInitialTemp ~ normal(1, 1);
s1 ~ exponential(1);
s2 ~ exponential(1);
s3 ~ exponential(1);
m1 ~ normal(0.8, 1);
m2 ~ normal(0.4, 1);
m3 ~ normal(0.4, 1);
a1 ~ normal(m1, s1);
a2 ~ normal(m2, s2);
a3 ~ normal(m3, s3);
for (t in 1:nweek) {
target += multi_normal_cholesky_lpdf(Y[t] | mu[t], cholesky_decompose(C_td + var_nug * I + 1e-6));
}
generated quantities {
array[nweek * nsite] real ePrediction;
array[nweek] real log_lik;
real lprior;
for (t in 1:nweek) {
log_lik[t] = multi_normal_cholesky_lpdf(Y[t] | mu[t], cholesky_decompose(C_td + var_nug * I + 1e-6));
ePrediction[((t - 1) * nsite + 1):(t * nsite)] = to_array_1d(multi_normal_cholesky_rng(mu[t], cholesky_decompose(C_td + var_nug * I + 1e-6)));
}
lprior = exponential_lpdf(sigma_nug | 0.05) +
exponential_lpdf(sigma_td | 2) +
(normal_lpdf(alpha_td | 20000, 20000) - normal_lccdf(0 | 20000, 20000)) + // truncated normal
(normal_lpdf(bInitialTemp | 1, 1) - normal_lccdf(0 | 1, 1)) + // truncated normal
normal_lpdf(m1 | 0.8, 1) +
normal_lpdf(m2 | 0.4, 1) +
normal_lpdf(m3 | 0.4, 1) +
exponential_lpdf(s1 | 1) +
exponential_lpdf(s2 | 1) +
exponential_lpdf(s3 | 1);
Block 1. Model description.
Results
Tables
Locations
Table 1. Stream temperature site locations.
| site | station name | longitude | latitude |
|---|---|---|---|
| 08EB003 | SKEENA RIVER AT GLEN VOWELL | -127.673 | 55.3011 |
| 08EB004 | KISPIOX RIVER NEAR HAZELTON | -127.716 | 55.4338 |
| 08EB005 | SKEENA RIVER ABOVE BABINE RIVER | -127.687 | 55.7171 |
| 08EB007 | KITWANGA RIVER NEAR KITWANGA | -128.054 | 55.1164 |
| 08EC001 | BABINE RIVER AT BABINE | -126.63 | 55.3225 |
| 08EC004 | PINKUT CREEK NEAR TINTAGEL | -125.43 | 54.4054 |
| 08EC013 | BABINE RIVER AT OUTLET OF NILKITKWA LAKE | -126.698 | 55.4265 |
| 08ED001 | NANIKA RIVER AT OUTLET OF KIDPRICE LAKE | -127.452 | 53.9303 |
| 08ED002 | MORICE RIVER NEAR HOUSTON | -127.427 | 54.1168 |
| 08EE003 | BULKLEY RIVER NEAR HOUSTON | -126.719 | 54.3994 |
| 08EE004 | BULKLEY RIVER AT QUICK | -126.9 | 54.6186 |
| 08EE005 | BULKLEY RIVER NEAR SMITHERS | -127.133 | 54.7697 |
| 08EE012 | SIMPSON CREEK AT THE MOUTH | -127.204 | 54.8099 |
| 08EE013 | BUCK CREEK AT THE MOUTH | -126.65 | 54.3961 |
| 08EE020 | TELKWA RIVER BELOW TSAI CREEK | -127.497 | 54.6074 |
| 08EF005 | ZYMOETZ RIVER ABOVE O.K. CREEK | -128.325 | 54.4936 |
| 08EG019 | KITSUMKALUM RIVER BELOW ALICE CREEK | -128.744 | 54.6793 |
Stream Temperature
Table 2. Parameter descriptions.
| Parameter | Description |
|---|---|
C_td |
Covariance matrix of the tail-down exponential model |
D |
Downstream hydrologic distance matrix |
H |
Total hydrologic distance matrix |
I |
The identity matrix |
N_y_mis |
Number of missing water temperature values |
N_y_obs |
Number of observed water temperature values |
Y[t] |
Vector of water temperature values for all sites in the
tth week |
a1[i] |
Intercept-type parameter of the air2stream model for the
ith site |
a2[i] |
Effect of air_temp[i] on eTempDiff[i] for the ith site |
a3[i] |
Effect of the previous week’s expected water temperature
(eTemp[i - nsite]) on eTempDiff[i], for the ith site |
air_temp[i] |
The ith air temperature value (˚C) |
alpha_td |
The variance of spatially independent points |
bInitialTemp |
Expected average water temperature for the week starting 01-01-2015 for all sites |
eTempDiff[i] |
Expected difference in average water temperature from the previous week |
eTemp[i] |
Expected value of water_temp[i] |
flow_con_mat |
Site connectivity matrix |
i_y_mis |
Indexes of missing water temperature values |
i_y_obs |
Indexes of observed water temperature values |
m1 |
Mean of the site-wise random effect for the a1 parameter |
m2 |
Mean of the site-wise random effect for the a2 parameter |
m3 |
Mean of the site-wise random effect for the a3 parameter |
mu[t] |
Vector of eTemp values for all sites in the tth week |
nsite |
Number of sites |
nweek |
Number of weeks |
s1 |
Standard deviation of the site-wise random effect for the a1
parameter |
s2 |
Standard deviation of the site-wise random effect for the a2
parameter |
s3 |
Standard deviation of the site-wise random effect for the a3
parameter |
sigma_nug |
Standard deviation of the nugget effect |
sigma_td |
Standard deviation of the exponential tail-down covariance model |
site[i] |
The ith site |
var_nug |
Variance of the nugget effect |
var_td |
Variance of the exponential tail-down covariance model |
week[i] |
The ith week |
y[i] |
The ith water temperature value (˚C) |
y_mis |
Vector of missing water temperature values |
y_obs |
Vector of observed water temperature values |
Table 3. Model coefficients.
| term | estimate | lower | upper | svalue |
|---|---|---|---|---|
| alpha_td | 1.93e+04 | 1.03e+04 | 4.35e+04 | 10.6 |
| bInitialTemp | 1.68 | 3.33e-01 | 2.78 | 10.6 |
| m1 | 9.40e-01 | 7.51e-01 | 1.13 | 10.6 |
| m2 | 2.60e-01 | 2.01e-01 | 3.24e-01 | 10.6 |
| m3 | 3.19e-01 | 2.65e-01 | 3.77e-01 | 10.6 |
| s1 | 3.56e-01 | 2.57e-01 | 5.23e-01 | 10.6 |
| s2 | 1.29e-01 | 9.34e-02 | 1.91e-01 | 10.6 |
| s3 | 1.10e-01 | 7.93e-02 | 1.63e-01 | 10.6 |
| sigma_nug | 8.06e-01 | 7.87e-01 | 8.24e-01 | 10.6 |
| sigma_td | 8.35e-01 | 7.79e-01 | 8.95e-01 | 10.6 |
Table 4. Model convergence.
| n | K | nchains | niters | nthin | ess | rhat | converged | num_divergent | max_treedepth | ebfmi |
|---|---|---|---|---|---|---|---|---|---|---|
| 9401 | 3802 | 3 | 500 | 2 | 482 | 1.014 | TRUE | 0 | 0 | 1.13 |
Table 5. Model sensitivity.
| information | prior_cjs | lik_cjs | nterms | insensitive |
|---|---|---|---|---|
| strong prior | 0.088 | 0.611 | 2 | FALSE |
| weak prior & informative data | 0.032 | 0.224 | 13250 | TRUE |
Table 6. Model sensitivity by parameter.
| parameter | information | prior_cjs | lik_cjs | nterms | insensitive |
|---|---|---|---|---|---|
| a1 | weak prior & informative data | 0.006 | 0.295 | 17 | TRUE |
| a2 | weak prior & informative data | 0.007 | 0.281 | 17 | TRUE |
| a3 | weak prior & informative data | 0.008 | 0.292 | 17 | TRUE |
| alpha_td | strong prior | 0.078 | 1.016 | 1 | FALSE |
| bInitialTemp | strong prior | 0.088 | 0.611 | 1 | FALSE |
| ePrediction | weak prior & informative data | 0.032 | 0.224 | 9401 | TRUE |
| m1 | weak prior & informative data | 0.006 | 0.34 | 1 | TRUE |
| m2 | weak prior & informative data | 0.006 | 0.431 | 1 | TRUE |
| m3 | weak prior & informative data | 0.006 | 0.598 | 1 | TRUE |
| s1 | weak prior & informative data | 0.017 | 0.665 | 1 | TRUE |
| s2 | weak prior & informative data | 0.007 | 0.822 | 1 | TRUE |
| s3 | weak prior & informative data | 0.013 | 0.558 | 1 | TRUE |
| sigma_nug | weak prior & informative data | 0.012 | 4.76 | 1 | TRUE |
| sigma_td | weak prior & informative data | 0.006 | 0.846 | 1 | TRUE |
| y_mis | weak prior & informative data | 0.029 | 0.239 | 3792 | TRUE |
Table 7. Mean GSDD by year (with 95% prediction intervals).
| year | estimate | lower | upper |
|---|---|---|---|
| 2015 | 2031 | 1944 | 2117 |
| 2016 | 2120 | 2034 | 2211 |
| 2017 | 1906 | 1829 | 1992 |
| 2018 | 2044 | 1958 | 2125 |
| 2019 | 2042 | 1962 | 2137 |
| 2020 | 1815 | 1741 | 1896 |
| 2021 | 1925 | 1847 | 2016 |
| 2022 | 1984 | 1910 | 2060 |
| 2023 | 2233 | 2157 | 2315 |
| 2024 | 2006 | 1922 | 2093 |
Table 8. Mean GSDD by site (with 95% prediction intervals).
| site | estimate | lower | upper |
|---|---|---|---|
| 08EE020 | 1029 | 964 | 1098 |
| 08EE012 | 1365 | 1310 | 1418 |
| 08EB005 | 1653 | 1605 | 1706 |
| 08EF005 | 1743 | 1683 | 1804 |
| 08EB003 | 1869 | 1817 | 1921 |
| 08EB007 | 1910 | 1841 | 1978 |
| 08ED001 | 1937 | 1884 | 1987 |
| 08EE013 | 2030 | 1983 | 2077 |
| 08EG019 | 2056 | 1992 | 2119 |
| 08EB004 | 2062 | 2011 | 2113 |
| 08ED002 | 2139 | 2079 | 2200 |
| 08EE005 | 2309 | 2251 | 2367 |
| 08EE003 | 2312 | 2265 | 2359 |
| 08EC001 | 2330 | 2272 | 2385 |
| 08EE004 | 2364 | 2313 | 2416 |
| 08EC013 | 2449 | 2401 | 2499 |
| 08EC004 | 2639 | 2588 | 2688 |
Figures
Stream Temperature
Acknowledgements
The organizations and individuals whose contributions have made this analytic appendix possible include:
- Hillcrest Geographics
- Simon Norris
- Poisson Consulting
- Priscilla Durojaiye