CIVE 445 - ENGINEERING HYDROLOGY

CHAPTER 7: REGIONAL ANALYSIS

  • Regional analysis encompasses the study of hydrologic phenomena with the aim of developing mathematical relations to be used in a regional context.

  • Mathematical relations are developed so that information from long-record catchments can be transferred to neighboring ungaged or short-record catchments of similar hydrologic characteristics.

  • Other applications include regression techniques.

7.1  JOINT PROBABILITY DISTRIBUTIONS

  • Probability distributions with two random variables, X and Y, are called bivariate, or joint distributions.

  • A joint distribution expresses in mathematical terms the probability of occurrence of an outcome consisting of a pair of values X and Y.

  • In statistical notation, $P(X = x_i, Y = y_j)$ is the probability that X and Y will take the respective outcomes $x_i$ and $y_j$ simultaneously.

  • A shorter notation is $P(x_i, y_j)$.

  • The sum of the probabilities of all possible outcomes is equal to unity.

    $$\sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) = 1$$

  • A classical example of a joint probability is the cast of two dice, say A and B.

  • The probability of getting a 1 for A and a 6 for B is:  P(A= 1, B= 6) = 1/36.

  • This distribution is referred to as bivariate uniform because all outcomes have the same probability (1/36).

  • Joint cumulative probabilities are defined in a similar way:

    $$F(x_k, y_l) = \sum_{i=1}^{k} \sum_{j=1}^{l} P(x_i, y_j)$$

  • For example, there are 5 × 3 = 15 combinations in which A is less than or equal to 5 and B is less than or equal to 3; therefore, the joint cumulative probability is 15/36.
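The two-dice example can be checked numerically. The following is a minimal sketch in Python; the names `joint` and `joint_cdf` are illustrative choices, not part of the course material.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two fair dice A and B: each (a, b) pair has probability 1/36.
faces = range(1, 7)
joint = {(a, b): Fraction(1, 36) for a, b in product(faces, faces)}

def joint_cdf(pmf, k, l):
    """F(k, l): sum of P(a, b) over all outcomes with a <= k and b <= l."""
    return sum(p for (a, b), p in pmf.items() if a <= k and b <= l)

total = sum(joint.values())       # sum over all possible outcomes
p_1_6 = joint[(1, 6)]             # P(A = 1, B = 6)
f_5_3 = joint_cdf(joint, 5, 3)    # P(A <= 5, B <= 3)

# Fraction reduces 15/36 to its lowest terms, 5/12.
print(total, p_1_6, f_5_3)
```

Exact fractions are used instead of floating-point numbers so that the results can be compared directly with the hand calculation.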
 

Marginal Probability Distributions

  • Marginal probability distributions are obtained by summing up the joint probability distribution over all values of one of the variables, say x.

  • The resulting marginal distribution is the probability of the other variable, in this case y, without regard to x.

  • The marginal distribution of X is:

    $$P(x_i) = \sum_{j=1}^{m} P(x_i, y_j)$$

  • Likewise, the marginal distribution of Y is:

    $$P(y_j) = \sum_{i=1}^{n} P(x_i, y_j)$$

  • The probability of A being 1 regardless of the value of B (marginal probability) is 1/6.

  • Likewise, the probability of B being 4 regardless of the value of A (marginal probability) is 1/6.

  • Note that the joint probabilities (1/36) of six possible outcomes have been summed to obtain the marginal probability.

  • Marginal cumulative probability distributions are obtained by combining the concepts of marginal and cumulative distributions.

  • The marginal cumulative probability distribution of X is:

    $$F(x_k) = \sum_{i=1}^{k} \sum_{j=1}^{m} P(x_i, y_j)$$

  • Likewise, the marginal cumulative probability distribution of Y is:

    $$F(y_l) = \sum_{i=1}^{n} \sum_{j=1}^{l} P(x_i, y_j)$$

  • The probability of A being less than or equal to 2, regardless of the value of B, is 2/6 = 1/3.

  • Likewise, the probability of B being less than or equal to 5, regardless of the value of A, is 5/6.
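The marginal and marginal cumulative probabilities quoted for the dice can be reproduced by summing the joint distribution over the variable that is disregarded. This is a sketch only; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two fair dice A and B (uniform, 1/36 per outcome).
faces = range(1, 7)
joint = {(a, b): Fraction(1, 36) for a, b in product(faces, faces)}

def marginal_a(pmf, a):
    """P(A = a): sum the joint PMF over every value of B."""
    return sum(p for (ai, _), p in pmf.items() if ai == a)

def marginal_b(pmf, b):
    """P(B = b): sum the joint PMF over every value of A."""
    return sum(p for (_, bj), p in pmf.items() if bj == b)

def marginal_cdf_a(pmf, k):
    """F(A <= k), regardless of the value of B."""
    return sum(p for (ai, _), p in pmf.items() if ai <= k)

def marginal_cdf_b(pmf, l):
    """F(B <= l), regardless of the value of A."""
    return sum(p for (_, bj), p in pmf.items() if bj <= l)

p_a1 = marginal_a(joint, 1)        # six joint outcomes of 1/36 each
p_b4 = marginal_b(joint, 4)
f_a2 = marginal_cdf_a(joint, 2)
f_b5 = marginal_cdf_b(joint, 5)
print(p_a1, p_b4, f_a2, f_b5)
```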
 

Conditional Probability

  • Conditional probability is useful in regression analysis.

  • The conditional probability is the ratio of the joint distribution to the marginal distribution of the conditioning variable.

  • The conditional probability of x, given y, is:

    P(x|y) = P(x, y) / P(y)

  • The conditional probability of y, given x, is:

    P(y|x) = P(x, y) / P(x)

  • Joint probability is the product of conditional and marginal probabilities.

  • Joint probability distributions can be expressed as continuous functions.

  • In this case, they are called joint density functions.

  • As with univariate distributions, joint density functions have moments.

  • The joint moment of order r and s about the origin (indicated with ') is defined as follows:

    $$\mu'_{r,s} = \int \int x^r y^s f(x,y) \, dy \, dx$$

  • With r= 1 and s= 0, it reduces to the mean of x:

    $$\mu'_{1,0} = \int x \left[ \int f(x,y) \, dy \right] dx$$

  • The expression within brackets is the marginal PDF of x, or f(x):

    $$\mu'_{1,0} = \mu_x = \int x f(x) \, dx$$

  • With r= 0 and s= 1, it reduces to the mean of y:

    $$\mu'_{0,1} = \int y \left[ \int f(x,y) \, dx \right] dy$$

  • The expression within brackets is the marginal PDF of y, or f(y):

    $$\mu'_{0,1} = \mu_y = \int y f(y) \, dy$$

  • The second moments are usually written about the mean:

    $$\mu_{r,s} = \int \int (x - \mu_x)^r (y - \mu_y)^s f(x,y) \, dy \, dx$$

  • For r= 2 and s= 0, the second moment reduces to the variance of x.

    $$\sigma_x^2 = \int (x - \mu_x)^2 f(x) \, dx$$

  • Likewise, for r= 0 and s= 2, the second moment reduces to the variance of y.

    $$\sigma_y^2 = \int (y - \mu_y)^2 f(y) \, dy$$

  • For r= 1 and s= 1, the second moment reduces to the covariance:

    $$\sigma_{x,y} = \int \int (x - \mu_x)(y - \mu_y) f(x,y) \, dy \, dx$$

  • The correlation coefficient relates the covariance $\sigma_{x,y}$ and the standard deviations $\sigma_x$ and $\sigma_y$. The population correlation coefficient is:

    $$\rho_{x,y} = \sigma_{x,y} / (\sigma_x \sigma_y)$$

  • The sample correlation coefficient is:

    $$r_{x,y} = s_{x,y} / (s_x s_y)$$

  • The correlation coefficient is a measure of the linear dependence between x and y.

  • It varies between -1 and +1.

  • A value of ρ close to +1 indicates a strong positive linear dependence: large values of x are associated with large values of y.

  • A value of ρ close to or equal to -1 indicates a strong negative linear dependence: large values of x are associated with small values of y, and vice versa.

  • A value of ρ = 0, i.e., a zero covariance, indicates a lack of a linear dependence between x and y.

  • Example 7-1.

  • Example 7-1 Solution.

  • Bivariate Normal Distribution.
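As an illustration of conditional probability and joint moments, the two-dice distribution can be treated as a discrete analog of the integrals above, with sums replacing integrals. This is a sketch under that assumption; the function name `central_moment` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Joint PMF of two fair dice, used here as a discrete stand-in for f(x, y).
faces = range(1, 7)
joint = {(a, b): Fraction(1, 36) for a, b in product(faces, faces)}

# Conditional probability: P(A = 1 | B = 6) = P(A = 1, B = 6) / P(B = 6).
p_b6 = sum(p for (_, b), p in joint.items() if b == 6)
p_a1_given_b6 = joint[(1, 6)] / p_b6

# First moments about the origin: the means of x and y.
mu_x = sum(x * p for (x, _), p in joint.items())
mu_y = sum(y * p for (_, y), p in joint.items())

def central_moment(pmf, r, s):
    """Joint moment of order (r, s) about the mean, as a discrete sum."""
    return sum((x - mu_x) ** r * (y - mu_y) ** s * p
               for (x, y), p in pmf.items())

var_x = central_moment(joint, 2, 0)    # variance of x
var_y = central_moment(joint, 0, 2)    # variance of y
cov_xy = central_moment(joint, 1, 1)   # covariance

# Correlation coefficient: covariance over the product of standard deviations.
rho = float(cov_xy) / sqrt(float(var_x) * float(var_y))
print(p_a1_given_b6, mu_x, var_x, cov_xy, rho)
```

Because the two dice are independent, the covariance and the correlation coefficient are both zero, and the conditional probability equals the marginal probability of A.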

7.2  REGRESSION ANALYSIS

  • A fundamental tool of regional analysis is the equation relating two or more hydrologic variables.

  • The variable for which values are given is called the predictor variable.

  • The variable for which values must be estimated is called the criterion variable.

  • Correlation provides a measure of the goodness of fit of the regression.

  • Regression provides the parameters of the prediction equation; correlation describes the quality of the fit.

  • The principle of least squares is used to obtain the best estimates of the parameters of the prediction equation.

  • It is based on the minimization of the sum of the squares of the differences between observed and predicted values.

     

One-predictor-variable regression
  • Assume a predictor variable x, a criterion variable y, and a set of paired observations of x and y.

  • The line to be fitted has the form:

    y' = α + β x

  • in which y' is an estimate of y, and α and β are parameters to be fitted by the regression.

  • Values of α and β are sought such that y' is the best estimate of y.

  • For this purpose, the sum of the squares of the differences between y and y' is minimized.

    $$\sum (y - y')^2 = \sum [y - (\alpha + \beta x)]^2$$

  • The partial derivative with respect to α is set to zero:

    $$\frac{\partial}{\partial \alpha} \sum [y - (\alpha + \beta x)]^2 = 0$$

  • Likewise, the partial derivative with respect to β is set to zero:

    $$\frac{\partial}{\partial \beta} \sum [y - (\alpha + \beta x)]^2 = 0$$

  • This leads to the normal equations:

    $$\sum y - n \alpha - \beta \sum x = 0$$

    $$\sum xy - \alpha \sum x - \beta \sum x^2 = 0$$

  • Solving these two equations simultaneously leads to:

    $$\beta = \frac{\sum xy - (\sum x \sum y)/n}{\sum x^2 - (\sum x)^2/n}$$

    $$\alpha = \frac{\sum y - \beta \sum x}{n}$$

  • The slope of the regression line is: $\beta = \rho \, (\sigma_y / \sigma_x)$ (see Bivariate Normal Distribution).

  • Therefore, the correlation coefficient is: $\rho = \beta \, (\sigma_x / \sigma_y)$

  • The estimate of the correlation coefficient from sample data is: $r = \beta \, (s_x / s_y)$

  • The standard error of estimate $s_e$ is the square root of the variance of the conditional distribution, estimated as follows:

    $$s_e = \left\{ \frac{1}{n - 2} \sum (y - y')^2 \right\}^{1/2}$$

  • The standard error of estimate can also be estimated from the variance of the conditional distribution as follows (Eq. 7-23):

    $$s_e = s_y \left\{ \frac{n - 1}{n - 2} (1 - r^2) \right\}^{1/2}$$

  • The regression equations can be used to fit power functions of the type: $y = a x^b$.

  • This equation is linearized to: log y = log a + b log x.

  • With u = log x; and v = log y; the equation is: v = log a + bu.

  • Variables u and v are used in lieu of x and y, respectively.

  • Then α = log a; and β = b.

  • The regression equation is: $y = 10^{\alpha} x^{\beta}$

  • Example 7-2.

  • Example 7-2 Solution.

  • Example 7-2 Solution b.

  • Multiple Regression.

  • Multiple Regression b.
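The one-predictor-variable formulas of this section (slope, intercept, sample correlation coefficient, standard error of estimate, and the log-transformed power function) can be sketched in Python as follows. The function names and all data values are hypothetical and are given only for illustration; they are not taken from the course examples.

```python
from math import log10, sqrt

def linear_fit(x, y):
    """Least-squares fit of y' = alpha + beta * x; returns alpha, beta, r, se."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)

    # Solution of the two normal equations.
    beta = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    alpha = (sum_y - beta * sum_x) / n

    # Sample standard deviations (divisor n - 1), then r = beta * (sx / sy).
    s_x = sqrt((sum_xx - sum_x ** 2 / n) / (n - 1))
    s_y = sqrt((sum_yy - sum_y ** 2 / n) / (n - 1))
    r = beta * s_x / s_y

    # Standard error of estimate from the residuals (divisor n - 2).
    sse = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    se = sqrt(sse / (n - 2))
    return alpha, beta, r, se

def power_fit(x, y):
    """Fit y = a * x**b by regressing v = log10(y) on u = log10(x)."""
    u = [log10(xi) for xi in x]
    v = [log10(yi) for yi in y]
    alpha, beta, r, _ = linear_fit(u, v)
    return 10 ** alpha, beta, r   # a = 10**alpha, b = beta

# Hypothetical paired observations for the linear case.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha, beta, r, se = linear_fit(x, y)
print(alpha, beta, r, se)

# Synthetic data generated from y = 2 * x**1.5 to check the power fit.
xp = [1.0, 2.0, 4.0, 8.0]
yp = [2.0 * xi ** 1.5 for xi in xp]
a, b, r_log = power_fit(xp, yp)
print(a, b, r_log)
```

Both expressions for the standard error of estimate given above yield the same value for a least-squares fit, which provides a convenient check on the computation. Note that the power-function fit minimizes squared errors of log y, not of y.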

7.3  REGIONAL ANALYSIS OF FLOOD AND RAINFALL CHARACTERISTICS

  • A fundamental approach to regionalization of hydrologic properties was to assume that peak flow is related to catchment area:

    $$Q_p = c A^m$$

  • Because of runoff diffusion, the exponent m is always less than 1, usually in the range 0.4-0.9.

  • Other formulas are the following:

    $$Q_p = c A^{n A^{-m}}$$

    $$Q_p = c A^{(a - b \log A)}$$

    $$Q_p = \frac{c A}{(a + bA)^m} + dA$$

  • The Creager curves are an example of the second formula: see Creager curves.

  • These equations do not explicitly account for flood frequency.
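The effect of an exponent m less than 1 in the power relation between peak flow and catchment area can be seen in a short numerical sketch. The values of c and m below are hypothetical, chosen only for illustration; units are left unspecified.

```python
def peak_flow(area, c, m):
    """Peak flow from catchment area using the power relation Qp = c * A**m."""
    return c * area ** m

# Hypothetical coefficient and exponent (not taken from the source).
c, m = 10.0, 0.7

areas = [10.0, 100.0, 1000.0]
flows = [peak_flow(a, c, m) for a in areas]

# Peak flow per unit area: decreases with area whenever m < 1.
unit_flows = [q / a for q, a in zip(flows, areas)]

for a, q, qu in zip(areas, flows, unit_flows):
    print(f"A = {a:8.1f}   Qp = {q:10.2f}   Qp/A = {qu:8.4f}")
```

With m less than 1, peak flow increases with area, but peak flow per unit area decreases, which is consistent with the attenuating effect of runoff diffusion in larger catchments.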
 

  Rainfall Intensity-Duration-Frequency

  • IDF curves are required for peak flow computations in small catchments.

  • The procedure to develop an IDF curve is illustrated by the following example:  Example 7-4.

 


 
060407