CIVE 445 - ENGINEERING HYDROLOGY
CHAPTER 7: REGIONAL ANALYSIS

Regional analysis encompasses the study of hydrologic phenomena with the aim of developing mathematical relations to be used in a regional context.
Mathematical relations are developed so that information from long-record catchments can be transferred to neighboring ungaged or short-record catchments of similar hydrologic characteristics.
Other applications include regression techniques.

7.1 JOINT PROBABILITY DISTRIBUTIONS

Probability distributions with two random variables, X and Y, are called bivariate, or joint, distributions.
A joint distribution expresses in mathematical terms the probability of occurrence of an outcome consisting of a pair of values X and Y.
In statistical notation, P(X = x_i, Y = y_j) is the probability that X and Y will take the respective outcomes x_i and y_j simultaneously.
A shorter notation is P(x_i, y_j).
The sum of the probabilities of all possible outcomes is equal to unity.

Σ_{i=1}^{n} Σ_{j=1}^{m} P(x_i, y_j) = 1

A classical example of a joint probability is the cast of two dice, say A and B.
The probability of getting a 1 for A and a 6 for B is: P(A= 1, B= 6) = 1/36.
This distribution is referred to as bivariate uniform because all outcomes have the same probability (1/36).
Joint cumulative probabilities are defined in a similar way:

F(x_k, y_l) = Σ_{i=1}^{k} Σ_{j=1}^{l} P(x_i, y_j)

The number of outcomes with A less than or equal to 5 and B less than or equal to 3 is 5 × 3 = 15; therefore, the joint cumulative probability is F(5, 3) = 15/36.
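
The two-dice example can be checked by enumeration. The following is a minimal sketch in Python; the names joint and joint_cdf are illustrative choices, not part of the course material.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Joint PMF of two fair dice A and B: each of the 36 pairs has probability 1/36.
joint = {(a, b): Fraction(1, 36) for a, b in product(faces, faces)}

def joint_cdf(k, l):
    """Joint cumulative probability F(k, l) = P(A <= k, B <= l)."""
    return sum(p for (a, b), p in joint.items() if a <= k and b <= l)

total = sum(joint.values())   # sum over all possible outcomes
print(total)
print(joint[(1, 6)])          # P(A = 1, B = 6)
print(joint_cdf(5, 3))        # P(A <= 5, B <= 3)
```

Exact fractions are used so that the sums can be compared with the hand calculation without rounding.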

Marginal Probability Distributions

Marginal probability distributions are obtained by summing up the joint probability distribution over all values of one of the variables, say x.
The resulting marginal distribution is the probability of the other variable, in this case y, without regard to x.
The marginal distribution of X is:

P(x_i) = Σ_{j=1}^{m} P(x_i, y_j)

Likewise, the marginal distribution of Y is:

P(y_j) = Σ_{i=1}^{n} P(x_i, y_j)

The probability of A being 1 regardless of the value of B (marginal probability) is 1/6.
Likewise, the probability of B being 4 regardless of the value of A (marginal probability) is 1/6.
Note that the joint probabilities (1/36) of six possible outcomes have been summed to obtain the marginal probability.
Marginal cumulative probability distributions are obtained by combining the concepts of marginal and cumulative distributions.
The marginal cumulative probability distribution of X is:

F(x_k) = Σ_{i=1}^{k} Σ_{j=1}^{m} P(x_i, y_j)

Likewise, the marginal cumulative probability distribution of Y is:

F(y_l) = Σ_{i=1}^{n} Σ_{j=1}^{l} P(x_i, y_j)

The probability of A being less than or equal to 2, regardless of the value of B, is 2/6 = 1/3.
Likewise, the probability of B being less than or equal to 5, regardless of the value of A, is 5/6.
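
The marginal and marginal cumulative probabilities quoted for the dice can be reproduced by summing the joint probabilities over the other variable. This is a sketch only; the function names are hypothetical.

```python
from fractions import Fraction

faces = range(1, 7)

# Joint PMF of two fair dice A and B.
joint = {(a, b): Fraction(1, 36) for a in faces for b in faces}

def marginal_a(a):
    """P(A = a): sum the joint PMF over all values of B."""
    return sum(joint[(a, b)] for b in faces)

def marginal_b(b):
    """P(B = b): sum the joint PMF over all values of A."""
    return sum(joint[(a, b)] for a in faces)

def marginal_cdf_a(k):
    """P(A <= k), regardless of the value of B."""
    return sum(marginal_a(a) for a in faces if a <= k)

def marginal_cdf_b(l):
    """P(B <= l), regardless of the value of A."""
    return sum(marginal_b(b) for b in faces if b <= l)

print(marginal_a(1), marginal_b(4))
print(marginal_cdf_a(2), marginal_cdf_b(5))
```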

Conditional Probability

Conditional probability is useful in regression analysis.
The conditional probability is the ratio of the joint probability to the marginal probability of the conditioning variable.
The conditional probability of x, given y, is:

P(x|y) = P(x, y) / P(y)

The conditional probability of y, given x, is:

P(y|x) = P(x, y) / P(x)

Joint probability is the product of conditional and marginal probabilities.
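
As an illustration, assume two fair dice A and B and a derived variable S = A + B (an assumption made here so that the two variables are dependent; it is not part of the course examples). The sketch below computes P(A = a | S = s) as the ratio of joint to marginal probability and recovers the joint probability as the product of conditional and marginal probabilities.

```python
from collections import defaultdict
from fractions import Fraction

faces = range(1, 7)

# Joint PMF of A (first die) and S = A + B (sum of both dice).
joint = defaultdict(Fraction)
for a in faces:
    for b in faces:
        joint[(a, a + b)] += Fraction(1, 36)

# Marginal PMF of S: sum the joint PMF over all values of A.
p_s = defaultdict(Fraction)
for (a, s), p in list(joint.items()):
    p_s[s] += p

def conditional(a, s):
    """P(A = a | S = s) = P(a, s) / P(s), for a sum s between 2 and 12."""
    return joint.get((a, s), Fraction(0)) / p_s[s]

p_cond = conditional(6, 10)
print(p_cond)                # P(A = 6 | S = 10)
print(p_cond * p_s[10])      # product recovers the joint probability P(6, 10)
```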
Joint probability distributions can be expressed as continuous functions.
In this case, they are called joint density functions.
As with univariate distributions, joint density functions have moments.
The joint moment of order r and s about the origin (indicated with ') is defined as follows:

μ'_r,s = ∫ ∫ x^r y^s f(x,y) dy dx

With r= 1 and s= 0, it reduces to the mean of x:

μ'_1,0 = ∫ x [ ∫ f(x,y) dy ] dx

The expression within brackets is the marginal PDF of x, or f(x):

μ'_1,0 = μ_x = ∫ x f(x) dx

With r= 0 and s= 1, it reduces to the mean of y:

μ'_0,1 = ∫ y [ ∫ f(x,y) dx ] dy

The expression within brackets is the marginal PDF of y, or f(y):

μ'_0,1 = μ_y = ∫ y f(y) dy

The second moments are usually written about the mean:

μ_r,s = ∫ ∫ (x - μ_x)^r (y - μ_y)^s f(x,y) dy dx

For r= 2 and s= 0, the second moment reduces to the variance of x.

σ²_x = ∫ (x - μ_x)² f(x) dx

Likewise, for r= 0 and s= 2, the second moment reduces to the variance of y.

σ²_y = ∫ (y - μ_y)² f(y) dy

For r= 1 and s= 1, the second moment reduces to the covariance:

σ_x,y = ∫ ∫ (x - μ_x)(y - μ_y) f(x,y) dy dx

The correlation coefficient relates the covariance σ_x,y and standard deviations σ_x and σ_y. The population correlation coefficient is:

ρ_x,y = σ_x,y / (σ_xσ_y)

The sample correlation coefficient is:

r_x,y = s_x,y / (s_xs_y)

The correlation coefficient is a measure of the linear dependence between x and y.
It varies in the range -1 to +1.
A value of ρ close to or equal to +1 indicates a strong direct linear dependence: large values of x are associated with large values of y.
A value of ρ close to or equal to -1 indicates a strong inverse linear dependence: large values of x are associated with small values of y, and vice versa.
A value of ρ = 0, i.e., a zero covariance, indicates a lack of a linear dependence between x and y.
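
The sample correlation coefficient can be sketched as follows. The data sets are hypothetical, chosen to show the three limiting cases; the function name is an illustrative choice.

```python
import math

def sample_correlation(x, y):
    """Sample correlation r = s_xy / (s_x s_y), using n - 1 in all three terms."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    return s_xy / (s_x * s_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical values
r_direct = sample_correlation(x, [2.0, 4.0, 6.0, 8.0, 10.0])   # y increases linearly with x
r_inverse = sample_correlation(x, [10.0, 8.0, 6.0, 4.0, 2.0])  # y decreases linearly with x
r_none = sample_correlation(x, [1.0, 0.0, 2.0, 0.0, 1.0])      # zero covariance
print(r_direct, r_inverse, r_none)
```

The third data set has zero covariance even though y is not constant, which is why r measures linear dependence only.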
Example 7-1.
Example 7-1 Solution.
Bivariate Normal Distribution.

7.2 REGRESSION ANALYSIS

A fundamental tool of regional analysis is the equation relating two or more hydrologic variables.
The variable for which values are given is called the predictor variable.
The variable for which values must be estimated is called the criterion variable.
Correlation provides a measure of the goodness of fit of the regression.
Regression provides the parameters; correlation describes its quality.
The principle of least squares is used to obtain the best estimates of the parameters of the prediction equation.
It is based on the minimization of the sum of the squares of the differences between observed and predicted values.

One-predictor-variable regression

Assume a predictor variable x, a criterion variable y, and a set of paired observations of x and y.
The line to be fitted has the form:

y' = α + β x

in which y' is an estimate of y, and α and β are parameters to be fitted by the regression.
Values of α and β are sought such that y' is the best estimate of y.
For this purpose, the sum of the squares of the differences between y and y' is minimized.

Σ (y - y')² = Σ [y - (α + β x)]²

The partial derivative with respect to α is set to zero:

∂{Σ [y - (α + β x)]²}/∂α = 0

Likewise, the partial derivative with respect to β is set to zero:

∂{Σ [y - (α + β x)]²}/∂β = 0

This leads to the normal equations:

Σy - nα - β Σx = 0

Σxy - α Σx - β Σ x² = 0

Solving these two equations simultaneously leads to:

β = [Σ(xy) - (ΣxΣy/n)] / [Σx² - (Σx)²/n]

α = (Σy - βΣx)/n

The slope of the regression line is: β = ρ (σ_y/σ_x) (see Bivariate Normal Distribution).
Therefore, the correlation coefficient is: ρ = β (σ_x/σ_y)
The estimate of the correlation coefficient from sample data is: r = β (s_x/s_y)
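
The closed-form expressions for α and β, and the estimate r = β (s_x/s_y), can be sketched in a few lines. The paired observations below are hypothetical, and fit_line and sample_std are illustrative names.

```python
import math

def fit_line(x, y):
    """Least-squares estimates of alpha and beta in y' = alpha + beta x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    beta = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    alpha = (sum_y - beta * sum_x) / n
    return alpha, beta

def sample_std(v):
    """Sample standard deviation, with n - 1 in the denominator."""
    mean = sum(v) / len(v)
    return math.sqrt(sum((vi - mean) ** 2 for vi in v) / (len(v) - 1))

# Hypothetical paired observations (not taken from the course examples).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha, beta = fit_line(x, y)
r = beta * sample_std(x) / sample_std(y)   # r = beta (s_x / s_y)
print(alpha, beta, r)
```

Substituting the fitted α and β back into the two normal equations should give residuals of zero, up to rounding.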
The standard error of estimate of the correlation s_e is the square root of the variance of the conditional distribution, estimated as follows:

s_e = {[1/(n - 2)] Σ (y - y')²}^(1/2)

The standard error of estimate can also be computed from the sample standard deviation of y and the correlation coefficient, as follows (Eq. 7-23):

s_e = s_y {[(n - 1)/(n - 2)] (1 - r²)}^(1/2)
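
For a least-squares fit, the two expressions for s_e give the same value. A sketch with hypothetical data follows; all variable names are illustrative.

```python
import math

# Hypothetical paired observations (not taken from the course examples).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

beta = ss_xy / ss_x                  # equivalent to the closed-form slope
alpha = mean_y - beta * mean_x
r = ss_xy / math.sqrt(ss_x * ss_y)   # sample correlation coefficient
s_y = math.sqrt(ss_y / (n - 1))      # sample standard deviation of y

# Form 1: from the residuals y - y'.
residual_ss = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
se_residuals = math.sqrt(residual_ss / (n - 2))

# Form 2: from s_y and r.
se_from_r = s_y * math.sqrt((n - 1) / (n - 2) * (1 - r ** 2))

print(se_residuals, se_from_r)
```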

The regression equations can be used to fit power functions of the type: y = ax^b.
This equation is linearized to: log y = log a + b log x.
With u = log x and v = log y, the equation is: v = log a + bu.
Variables u and v are used in lieu of x and y, respectively.
Then α = log a, and β = b.
With base-10 logarithms, the regression equation is: y = 10^α x^β
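
The power-function fit can be sketched by applying the one-predictor regression to the logarithms. The data below are synthetic, generated from y = 2 x^1.5 so that the recovered parameters can be checked; fit_power is an illustrative name, and base-10 logarithms are assumed.

```python
import math

def fit_power(x, y):
    """Fit y = a x^b by linear regression of log10(y) on log10(x)."""
    u = [math.log10(xi) for xi in x]
    v = [math.log10(yi) for yi in y]
    n = len(u)
    sum_u, sum_v = sum(u), sum(v)
    sum_uv = sum(ui * vi for ui, vi in zip(u, v))
    sum_u2 = sum(ui * ui for ui in u)
    beta = (sum_uv - sum_u * sum_v / n) / (sum_u2 - sum_u ** 2 / n)
    alpha = (sum_v - beta * sum_u) / n
    return 10 ** alpha, beta        # a = 10^alpha, b = beta

# Synthetic data lying exactly on y = 2 x^1.5 (hypothetical).
x = [1.0, 4.0, 9.0, 16.0]
y = [2.0 * xi ** 1.5 for xi in x]

a, b = fit_power(x, y)
print(a, b)
```

Note that the least-squares criterion is applied to log y, not to y, so with scattered data the fit minimizes relative rather than absolute errors.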
Example 7-2.
Example 7-2 Solution.
Example 7-2 Solution b.
Multiple Regression.
Multiple Regression b.

7.3 REGIONAL ANALYSIS OF FLOOD AND RAINFALL CHARACTERISTICS

A fundamental approach to regionalization of hydrologic properties was to assume that peak flow is related to catchment area:

Q_p = c A^m

Because of runoff diffusion, the exponent m is always less than 1, usually in the range 0.4-0.9.
Other formulas are the following:

Q_p = c A^{n A^{-m}}

Q_p = c A^{(a - b log A)}

Q_p = [c A /(a + bA)^m] + dA

The Creager curves are an example of the formula Q_p = c A^{n A^{-m}}: see Creager curves.
These equations do not explicitly account for flood frequency.
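
A short sketch shows the effect of an exponent m less than 1: the peak flow per unit area decreases as catchment area increases. The coefficients are hypothetical and carry no particular units; the function names are illustrative.

```python
def peak_flow_power(area, c, m):
    """Q_p = c A^m."""
    return c * area ** m

def peak_flow_variable_exponent(area, c, n, m):
    """Q_p = c A^(n A^-m): the exponent itself decreases with area."""
    return c * area ** (n * area ** (-m))

# Hypothetical coefficients, for illustration only.
c, m = 10.0, 0.7
areas = [1.0, 10.0, 100.0, 1000.0]

unit_peaks = [peak_flow_power(a, c, m) / a for a in areas]
for a, q in zip(areas, unit_peaks):
    print(a, q)   # peak flow per unit area decreases with increasing area

envelope = [peak_flow_variable_exponent(a, 46.0, 0.9, 0.05) for a in areas]
print(envelope)
```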

Rainfall Intensity-Duration-Frequency

IDF curves are required for peak flow computations in small catchments.
The procedure to develop an IDF curve is illustrated by the following example: Example 7-4.

