Inverse problem
A starting point in any inverse problem is the selection of parameters to be recovered from the available data. We note that if phase change effects are negligibly small, then the heat capacity $C$ and thermal conductivity $\lambda$ cannot be resolved separately by analyzing the observed temperature time series; only the thermal diffusivity $\lambda/C$ can be found (Hinkel, 1997). Therefore, in order to remove the ambiguity in determining the soil properties, we set the heat capacity $C_i$ equal to a value that is typical for the soil type in the $i$-th layer. Consequently, we define the control vector $\mathbf{p}$ as a set consisting of the thermal conductivity $\lambda_i$ and the parameters $a_i$ and $b_i$ describing the unfrozen water content for each soil horizon, or
\[
\mathbf{p} = \{\lambda_i,\, a_i,\, b_i\}, \qquad i = 1, \dots, N,
\]
where $N$ is the number of soil horizons.
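To make the layout of the control vector concrete, the short sketch below packs the per-horizon thermal conductivities and unfrozen water content parameters into one flat vector and unpacks them again. The function names, and the placeholder parameters a and b standing in for whichever unfrozen water content parameterization is used, are illustrative assumptions rather than part of the original formulation.

```python
import numpy as np

def pack_control_vector(lam, a, b):
    """Stack per-horizon properties [lam_1..lam_N, a_1..a_N, b_1..b_N] into one flat vector."""
    return np.concatenate([np.asarray(lam), np.asarray(a), np.asarray(b)])

def unpack_control_vector(p, n_horizons):
    """Split the flat control vector back into per-horizon property arrays."""
    lam = p[0:n_horizons]
    a = p[n_horizons:2 * n_horizons]
    b = p[2 * n_horizons:3 * n_horizons]
    return lam, a, b

# Example: three soil horizons with a priori thermal conductivities and
# unfrozen-water parameters (values are purely illustrative).
p0 = pack_control_vector(lam=[0.9, 1.2, 1.5], a=[0.05, 0.08, 0.10], b=[0.5, 0.6, 0.7])
```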
For each physically realistic control vector $\mathbf{p}$, it is possible to compute the temperature dynamics and compare it to the measured data. The available measured data are organized as two vectors: $\mathbf{d}_{al}$ and $\mathbf{d}_{bh}$, associated with the high-resolution-in-time temperature records in the active layer and the once-a-year measurements in the borehole, respectively. In this article, we mark quantities related to the active layer and borehole measurements by the subscripts $al$ and $bh$, respectively. The data are related to the control vector $\mathbf{p}$ by
\[
\mathbf{d}_{al} = \mathbf{d}_{al}^{m}(\mathbf{p}) + \mathbf{r}_{al},
\qquad
\mathbf{d}_{bh} = \mathbf{d}_{bh}^{m}(\mathbf{p}) + \mathbf{r}_{bh},
\]
where $\mathbf{d}_{al}^{m}(\mathbf{p})$ and $\mathbf{d}_{bh}^{m}(\mathbf{p})$ are the modeled counterparts of the data, and $\mathbf{r}_{al}$ and $\mathbf{r}_{bh}$ are the misfit vectors. In theory, if there are no sampling errors or model inadequacies, the misfit vectors can be reduced to zero. If there are errors in the data or in the model, the aim is to minimize the misfits by varying the control vector $\mathbf{p}$.
Assuming that the errors have a Gaussian probability distribution, we define the cost function $J(\mathbf{p})$ according to (Tikhonov and Leonov, 1996; Beck and Arnold, 1977) as
\[
J(\mathbf{p}) = J_{al}(\mathbf{p}) + J_{bh}(\mathbf{p}) + J_{0}(\mathbf{p}),
\]
where $J_{al}$ and $J_{bh}$ are related to the model/data misfit, and $J_{0}$ is the regularization term:
\[
J_{al} = \tfrac{1}{2}\,\mathbf{r}_{al}^{\top}\mathbf{W}_{al}^{-1}\mathbf{r}_{al},
\qquad
J_{bh} = \tfrac{1}{2}\,\mathbf{r}_{bh}^{\top}\mathbf{W}_{bh}^{-1}\mathbf{r}_{bh},
\qquad
J_{0} = \tfrac{1}{2}\,(\mathbf{p}-\mathbf{p}_{0})^{\top}\mathbf{W}_{p}^{-1}(\mathbf{p}-\mathbf{p}_{0}).
\]
Here, the superscript $\top$ stands for the transposition operator; $\mathbf{W}_{al}$, $\mathbf{W}_{bh}$ and $\mathbf{W}_{p}$ are the covariance matrices for the data and the control vector, respectively. Finally, $\mathbf{p}_{0}$ is an initial approximation vector containing a priori estimates of the soil properties based on soil texture, mineral composition and porosity (Nicolsky et al., 2007).
Physically, the terms $J_{al}$ and $J_{bh}$ describe a mean square deviation of the computed soil temperatures from the measured ones, while the term $J_{0}$ incorporates information about the soil properties that can be derived from field/laboratory experiments, or from existing soil property databases and published data (Kersten, 1949). Typically, $\mathbf{W}_{p}$ is a diagonal matrix with the expected variances $\sigma_{p,j}^{2}$ of each control vector variable down the main diagonal. For example, if the a priori estimates in $\mathbf{p}_{0}$ are known with large uncertainties, then the expected variances $\sigma_{p,j}^{2}$ are large, and hence the contribution of the term $J_{0}$ to the cost function $J$ is small. The opposite is also true: if the initial approximation $\mathbf{p}_{0}$ is known with a good degree of accuracy, then the contribution of $J_{0}$ is large and the minimum of $J$ is close to the a priori estimates in $\mathbf{p}_{0}$.
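For illustration, the following sketch evaluates the three terms of the cost function for a given control vector. Here forward_model is a hypothetical stand-in for the heat conduction model returning the modeled counterparts of the active layer and borehole data, and the inverse covariance matrices are supplied explicitly; none of these names come from the original text.

```python
import numpy as np

def cost_function(p, d_al, d_bh, W_al_inv, W_bh_inv, W_p_inv, p0, forward_model):
    """J(p) = J_al + J_bh + J_0 for a flat control vector p (illustrative sketch)."""
    # Modeled counterparts of the observations for the current soil properties.
    d_al_mod, d_bh_mod = forward_model(p)

    # Misfit vectors between measured and modeled temperatures.
    r_al = d_al - d_al_mod
    r_bh = d_bh - d_bh_mod

    # Data misfit terms weighted by the inverse data covariance matrices.
    J_al = 0.5 * r_al @ W_al_inv @ r_al
    J_bh = 0.5 * r_bh @ W_bh_inv @ r_bh

    # Regularization term pulling p toward the a priori estimate p0.
    dp = p - p0
    J_0 = 0.5 * dp @ W_p_inv @ dp

    return J_al + J_bh + J_0
```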
A statistical interpretation of the least squares method considers the cost function as the argument of a Gaussian probability distribution (Thacker, 1989; Wunsch, 1996). Under this interpretation, the optimal control vector $\hat{\mathbf{p}}$, defined by
\[
\hat{\mathbf{p}} = \operatorname*{arg\,min}_{\mathbf{p}}\, J(\mathbf{p}),
\]
is the most probable combination of the soil properties for the given temperature realization and prior error statistics. To compute the optimal control vector, we calculate the gradient $\nabla_{\mathbf{p}}J$ of the cost function and minimize the cost function by a quasi-Newton minimization algorithm (Fletcher, 2000). To calculate the gradient $\nabla_{\mathbf{p}}J$, we construct and solve the so-called adjoint model (Jarny et al., 1991; Marchuk, 1995; Wunsch, 1996). The adjoint code is built analytically by transposition of the operator of the tangent linear model, which is obtained by direct differentiation of the forward model code, as briefly described in Appendix 5.
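In practice, any quasi-Newton routine can perform the minimization; the sketch below uses the limited-memory BFGS implementation available in SciPy as one possible choice. The routine grad_cost stands in for the adjoint-based gradient, and all names are assumptions made for the sake of the example.

```python
from scipy.optimize import minimize

def estimate_soil_properties(p0, cost, grad_cost):
    """Minimize the cost function starting from the initial approximation p0.

    cost(p) returns J(p); grad_cost(p) returns the gradient of J,
    e.g. computed with the adjoint model.
    """
    result = minimize(cost, x0=p0, jac=grad_cost, method="L-BFGS-B")
    return result.x, result.fun  # optimal control vector and the attained cost
```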
In order to estimate the thermal properties uniquely, the boundary conditions must satisfy certain requirements, i.e., be monotonic functions of time (Muzylev, 1985). In nature, rapidly changing weather conditions produce a surface temperature that does not fulfill this necessary requirement. Consequently, the cost function can have several minima due to non-linearities in the heat equation (1). Therefore, the iterative minimization algorithm can converge to an improper minimum if the iterations are started from an arbitrarily selected initial guess. Nevertheless, if the initial guess lies within the basin of attraction of the global minimum, the iterative optimization method converges to the proper minimum even if the model is nonlinear (Thacker, 1989). Usually, the initial guess is selected to be in the neighborhood of the initial approximation $\mathbf{p}_{0}$. Thus, the right choice of the initial approximation $\mathbf{p}_{0}$ is important for efficient convergence. In (Nicolsky et al., 2007), we provide a detailed step-by-step algorithm to compute the initial approximation $\mathbf{p}_{0}$.
To obtain physically meaningful results, the covariance matrix $\mathbf{W}_{al}$ has to be defined appropriately, as follows. By definition, if all elements in the temperature data vector $\mathbf{d}_{al}$ are statistically independent, then $\mathbf{W}_{al}$ is a diagonal matrix with the expected variances $\sigma_{al,i}^{2}$ of $\mathbf{d}_{al}$ down the main diagonal. Since temperature disturbances at the ground surface decay exponentially with depth, the soil temperature does not change significantly over hourly time intervals at depths greater than 0.10 meters (Kudryavtsev, 1978; Carslaw and Jaeger, 1959; Yershov, 1998). Thus, hourly collected temperature observations are not statistically independent. We note that a typical time scale over which the air temperature changes randomly is approximately ten days (Blackmon et al., 1984). For the Alaska North Slope region, these air temperature fluctuations can penetrate to a depth of 0.3-0.5 meters and significantly (and randomly) change the soil temperature in the upper meter on a monthly time scale. Hence, the covariance matrix $\mathbf{W}_{al}$ is approximately a sparse matrix with block-diagonal structures, the sizes of which are equal to the number of temperature measurements collected in one month. For the sake of simplicity, we approximate $\mathbf{W}_{al}$ by a diagonal matrix with $\sigma_{al,i}^{2}$ down the main diagonal, scaled by weights. Each weight is chosen such that the sum of the weights corresponding to measurements within each month is equal to one. In our case, we assume that the $\sigma_{al,i}^{2}$ are equal and represent the uncertainty in the temperature measurements due to the limited precision of the sensors.
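One way to realize this monthly weighting, under the assumption that the weights enter the inverse covariance so that each month contributes with a total weight of one to $J_{al}$, is sketched below; the monthly sampling labels and the common sensor uncertainty are hypothetical inputs.

```python
import numpy as np

def monthly_weights(month_index):
    """Weight each measurement so that the weights within one month sum to one."""
    month_index = np.asarray(month_index)
    counts = {m: np.sum(month_index == m) for m in np.unique(month_index)}
    return np.array([1.0 / counts[m] for m in month_index])

def weighted_inverse_covariance_diag(month_index, sigma):
    """Diagonal of the approximate inverse covariance for the active-layer data.

    sigma is the common standard deviation of the temperature sensors.
    """
    w = monthly_weights(month_index)
    return w / sigma**2  # each month contributes with total weight one to J_al
```

Using this diagonal in place of $\mathbf{W}_{al}^{-1}$ in the misfit term reproduces the monthly weighting described above.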
Considering the variables in the control vector $\mathbf{p}$ as stochastic and uncorrelated, it is possible to estimate the errors of the optimally determined parameters $\hat{\mathbf{p}}$. According to Thacker (1989), the error covariances of the corresponding variables of the control vector can be calculated as the diagonal elements of the inverse Hessian matrix $\mathcal{H}^{-1}$, where $\mathcal{H} = \{h_{ij}\}_{i,j=1}^{n}$, $h_{ij} = \partial^{2}J/\partial p_{i}\,\partial p_{j}$ evaluated at $\hat{\mathbf{p}}$, and $n$ is the number of variables in the control vector.
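Given a Hessian matrix of the cost function at the optimum (for example, the finite-difference approximation described below), the one-sigma uncertainties of the estimated parameters follow from the diagonal of its inverse; a minimal sketch:

```python
import numpy as np

def parameter_uncertainties(H):
    """One-sigma errors of the optimal parameters from the inverse Hessian."""
    H_inv = np.linalg.inv(H)          # error covariance of the control vector
    return np.sqrt(np.diag(H_inv))    # standard deviations of the individual parameters
```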
The simplest way to compute $\mathcal{H}$ is to approximate it via finite differences (Schröter, 1989). Following this approach, we individually perturb the components of the control vector in the vicinity of the optimal values $\hat{\mathbf{p}}$. The finite-difference approximation of the Hessian is then defined as
\[
h_{ij} \approx \frac{\bigl[\nabla_{\mathbf{p}}J(\hat{\mathbf{p}} + \epsilon\,\mathbf{e}_{i})\bigr]_{j} - \bigl[\nabla_{\mathbf{p}}J(\hat{\mathbf{p}})\bigr]_{j}}{\epsilon},
\]
where $\epsilon$ is a small positive real-valued number, $\mathbf{e}_{i}$ is an $n$-dimensional vector whose $i$-th element is one and all other elements are zero, and $[\nabla_{\mathbf{p}}J]_{j}$ represents the $j$-th component of the cost function's gradient $\nabla_{\mathbf{p}}J$. The inverse of the Hessian $\mathcal{H}^{-1}$ can be easily computed, since the number $n$ of variables in the control vector $\mathbf{p}$ is typically less than a hundred.
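The finite-difference construction of the Hessian can be sketched as follows; here grad_cost denotes the gradient routine (e.g., the adjoint code), p_opt the optimal control vector, and eps the perturbation size, all of which are assumed names.

```python
import numpy as np

def finite_difference_hessian(grad_cost, p_opt, eps=1e-4):
    """Approximate h_ij = d^2 J / dp_i dp_j by perturbing one component at a time."""
    n = len(p_opt)
    g0 = np.asarray(grad_cost(p_opt))           # gradient at the optimum
    H = np.empty((n, n))
    for i in range(n):
        p_pert = np.array(p_opt, dtype=float)
        p_pert[i] += eps                         # perturb the i-th component only
        H[i, :] = (np.asarray(grad_cost(p_pert)) - g0) / eps
    # Symmetrize to remove small asymmetries introduced by the finite differences.
    return 0.5 * (H + H.T)
```

Since the number of control variables is typically below one hundred, inverting the resulting matrix, e.g. with numpy.linalg.inv, is inexpensive.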