Fig. 5. Spatial distribution of vertical permeability values in the MOSYS V1 modelling system mesh in the clustering experiment territory in Latvia. Screenshot fragment of MOSYS visualization platform HiFiGeo, log scale, EPSG:25884 projection.
[Figure 5 map. X axis: easting, 300000 to 750000; Y axis: northing, 6180000 to 6440000. Colour scale: PermeabilityZ [m/day], log scale from 5E-008 to 10.]
Introduction
The cover of Quaternary sediments, especially in formerly glaciated territories, is usually the most complex part of the sedimentary sequence. In regional hydrogeological models it is often represented as a single layer with uniform or calibrated properties (Valner, 2003). However, the properties and structure of Quaternary sediments control groundwater recharge: they can direct groundwater flow either horizontally, towards discharge in topographic lows, or vertically, recharging groundwater in the bedrock. Building a representative structure of Quaternary strata in a regional hydrogeological model is important because it affects all the underlying strata.
Scope and objective
This work presents calibration results and details our experience of integrating a scalable generalization of hydraulic conductivity for Quaternary strata into MOSYS V1, the regional groundwater modelling system for the Baltic artesian basin.
Methods
In this study the main unit of generalization is the spatial cluster, and the main criterion for finding adequately representative borehole columns is the Normalized Compression Distance (NCD) metric, a nonparametric data mining technique calculated with the CompLearn ncd utility (Cilibrasi, 2007).
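The NCD between two byte strings x and y is commonly written as (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The following is a minimal sketch of that formula, assuming zlib as the compressor and hypothetical serialized borehole logs; it is not the CompLearn implementation, and the serialization format used in the study is not shown here.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance, approximated with zlib (an assumption)."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical serialized borehole logs: one lithology name per interval.
log_a = b"sand clay sand loam sand clay sand loam " * 10
log_b = b"sand clay sand loam sand clay sand loam " * 10
log_c = b"dolomite gypsum peat gravel silt limestone " * 10

# Identical logs should score much lower than structurally different ones.
print(ncd(log_a, log_b), ncd(log_a, log_c))
```

Values close to 0 indicate similar logs and values close to 1 indicate dissimilar ones; with real compressors the result can slightly exceed 1.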
Spatial clusters were obtained from flattened complete-link hierarchical clustering dendrograms, which supply a cluster membership identifier for each borehole. Applying Voronoi tessellation and GIS dissolve functionality to these data produced the spatial clusters in GIS format. The procedure is described in detail in “Lithological Uncertainty Expressed by Normalized Compression Distance” by Jatnieks et al. 2012. A full overview of the experiment workflow, with software components and key stages, is shown in Figure 1.
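As an illustration of flattening a complete-link clustering into a chosen number of logical clusters, here is a naive pure-Python sketch. The distance matrix is hypothetical, and the function name and the merge-until-k strategy are assumptions made for illustration; the software actually used for clustering in the study is not specified here.

```python
def complete_link_clusters(dist, k):
    """Merge items until k clusters remain, using complete linkage.

    dist is a symmetric distance matrix (list of lists). Returns one
    cluster identifier (0..k-1) per item.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance between two clusters is the
                # largest pairwise distance between their members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(dist)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


# Hypothetical distance matrix for five boreholes.
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.15, 0.85, 0.9],
    [0.2, 0.15, 0.0, 0.95, 0.9],
    [0.9, 0.85, 0.95, 0.0, 0.1],
    [0.8, 0.9, 0.9, 0.1, 0.0],
]
for k in (2, 3):
    print(k, complete_link_clusters(dist, k))
```

Requesting a different k from the same distance matrix gives a coarser or finer grouping, which is the sense in which the generalization is scalable. For realistic borehole counts a library routine would be preferable to this naive loop.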
We used two experiment matrices for hierarchical clustering. In the first, spatial clusters are obtained from the Normalized Compression Distance matrix alone (further identified as the NCD matrix). In the second, the NCD matrix is combined with a range-normalized Euclidean distance matrix of borehole locations in the BalticTM93 projection (further identified as the NCD+E matrix).
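A sketch of how such a combined matrix could be built is given below. The exact combination rule is not stated above, so the element-wise sum used here is an assumption, as are the function names, coordinates and NCD values.

```python
import math


def euclidean_matrix(points):
    """Pairwise Euclidean distances between (x, y) locations."""
    return [[math.dist(p, q) for q in points] for p in points]


def range_normalize(matrix):
    """Rescale all entries linearly so the smallest is 0 and the largest is 1."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


def combine(ncd_matrix, points):
    """Element-wise sum of NCD and normalized distance (assumed rule)."""
    e = range_normalize(euclidean_matrix(points))
    return [[n + d for n, d in zip(n_row, e_row)]
            for n_row, e_row in zip(ncd_matrix, e)]


# Hypothetical borehole coordinates in meters and a hypothetical NCD matrix.
points = [(0.0, 0.0), (3000.0, 4000.0), (6000.0, 8000.0)]
ncd_matrix = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.8],
    [0.9, 0.8, 0.0],
]
combined = combine(ncd_matrix, points)
print(combined)
```

Range normalization puts the spatial distances on the same 0 to 1 scale as NCD, so neither term dominates purely because of its units. The sketch assumes at least two distinct locations; otherwise the range is zero.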
INVESTING IN YOUR FUTURE
This work was supported by the European Social Fund project "Establishment of interdisciplinary scientist group and
modelling system for ground-water research", 2009/0212/1DP/1.1.1.2.0/09/APIA/VIAA/060
Fig. 6. Middle fragment of vertical cross-section AB (overview in Figure 5) across the Vidzeme region, showing horizontal permeability values (scale on the right Y axis) in the Quaternary, represented by 8 modelling system layers (only Q layers shown, for simplicity). Left Y axis: Z values in meters above sea level. X axis: distance from cross-section point A.
Fig. 9. Squared error distribution by model calibration layers for each of the geometries derived from different logical clustering solutions using NCD matrix.
[Figure 9 plot: "Squared error sum distribution by clustering solution and associated experimental geometry structure - NCD matrix". X axis: number of logical clusters after dendrogram flattening. Y axis: squared error sum in meters by model calibration layers (smaller is better), 0 to 700. Legend (calibration layers): D2nr, D2ar, D2br, D3gj, D3am, D3pl, D3slp, D3dg, D3ogkt, D3stel, D3jnsk, C1, P, Q.]
Fig. 11. Squared error distribution by model calibration layers for each of the geometries derived from different logical clustering solutions using NCD+E matrix.
[Figure 11 plot: "Squared error sum distribution by clustering solution and associated experimental geometry structure - NCD+E matrix". X axis: number of logical clusters after dendrogram flattening. Y axis: squared error sum in meters by model calibration layers (smaller is better), 0 to 1400. Legend (calibration layers): D2nr, D2ar, D2br, D3gj, D3am, D3pl, D3slp, D3dg, D3ogkt, D3stel, D3jnsk, C1, P, Q.]
References
1. Bennett, C. H., Gacs, P., Li, M., Vitanyi, P., Zurek, W. 1998. Information Distance. IEEE Transactions on Information Theory, 44(4), 1407-1423. IEEE.
2. Cilibrasi, R. 2007. Statistical Inference Through Data Compression. ILLC Dissertation Series DS-2007-01, Institute for Logic, Language and Computation, Universiteit van Amsterdam.
3. Klints, I., Virbulis, J., Timuhins, A., Sennikovs, J., Bethers, U. Calibration of the hydrogeological model of the Baltic Artesian Basin. EGU Geophysical Research Abstracts, EGU2012-10003 [this section poster board A255].
4. Takčidi, E. 1999. Datu bāzes "Urbumi" dokumentācija [Documentation of the database "Boreholes"]. Valsts ģeoloģijas dienests, Rīga. [In Latvian].
5. Valner, L. 2003. Hydrogeological model of Estonia and its applications. Proc. Estonian Acad. Sci. Geol., 52, 3, 179-192.
6. Virbulis, J., Bethers, U., Saks, T., Sennikovs, J., Timuhins, A. Hydrogeological model of the Baltic Artesian Basin. Hydrogeology Journal. [Under revision].
7. Zhu, C., Byrd, R. H., Nocedal, J. 1997. Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software, 23, 4, 550-560.
Results
The experiment geometries based on the NCD matrix perform better overall.
The various Quaternary representations tested have a measurable effect on model results: the worst geometries, such as NCD+E 7, 3 and others with a low logical cluster count, perform more than twice as badly as the best NCD clustering based geometries.
NCD clustering based geometries with the better initial calculation results (Figures 8 and 10), such as the 12 and 13 cluster variants, also tend to respond better to optimization with the MOSYS auto-calibration routine (Figure 12).
Some of the geometries with poorer results from the initial model run show only minimal improvement with calibration (Figure 12).
For our experiment territory, consisting of the formerly glaciated territory of Latvia, the best performing generalizations have 12 and 13 clusters.
This could have been deduced from the clustering dendrogram in Figure 2.
The best performing model geometry before applying the NCD metric based clustering solution had a squared error sum of 814 meters. The best NCD clustering geometry provides a 9.7% improvement, to 735. This may seem modest in nominal terms, but it is not, because the experimental Quaternary strata were inserted into a previously calibrated model geometry with initial conductivity values as estimates. Model autocalibration works by multiplying layer material properties by coefficients determined by the SciPy L-BFGS-B routine (Zhu et al. 1997). Since this works on a layer basis, the proportional differences in conductivity values between elements in each layer remain constant. Ideally, the model geometry area built using clustering and the rest of the territory would be calibrated separately, which could yield further improvements.
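The arithmetic behind two statements above can be checked with a short sketch: the relative improvement from 814 to 735, and the fact that a single per-layer multiplier leaves the ratios between element conductivities unchanged. The element values and the coefficient are hypothetical; in the actual routine the coefficient would come from the optimizer, which is not reproduced here.

```python
def relative_improvement(before, after):
    """Fractional reduction of the squared error sum."""
    return (before - after) / before


def scale_layer(conductivities, coefficient):
    """Multiply every element conductivity in one layer by the same coefficient."""
    return [k * coefficient for k in conductivities]


# Squared error sums quoted in the text.
improvement_percent = round(100 * relative_improvement(814, 735), 1)
print(improvement_percent)

# Hypothetical element conductivities in one layer, m/day, and a
# hypothetical calibration coefficient.
layer = [5.0, 0.001, 20.0]
scaled = scale_layer(layer, 0.5)

# The ratio between two elements is the same before and after scaling.
ratio_before = layer[0] / layer[2]
ratio_after = scaled[0] / scaled[2]
print(ratio_before, ratio_after)
```

Because only one coefficient per layer is optimized, calibration cannot correct a poor spatial pattern inside a layer, which is why the choice of clustering geometry matters before calibration starts.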
Fig. 3. Algorithm for calculating Z values for including spatial clustering results in the MOSYS finite-element mesh geometry and for resolving structure conflicts that arise from the different lithological structures of the geological columns selected from the most typical borehole log in each cluster.
[Map. X axis: 380000 to 440000; Y axis: 6355000 to 6405000. Colour scale: PermeabilityXY [m/day], log scale from 5E-008 to 100.]
Fig. 4. After the algorithm described in Fig. 3 creates the layer structure in the model geometry, the respective conductivity values are assigned to the mesh elements created by the new placeholder layers. The cluster membership of each mesh element is determined by intersecting the element centroids, shown here in white, with the spatial clusters in a GIS Shapefile layer. Log scale for conductivity values on Y axis.
Fig. 2. Complete-link agglomerative hierarchical clustering dendrogram for Normalized Compression Distance matrix of serialized borehole log lithological structures.
SCALABLE GENERALIZATION OF HYDRAULIC CONDUCTIVITY IN QUATERNARY STRATA FOR USE IN A REGIONAL GROUNDWATER MODEL
Jānis JĀTNIEKS (1), Konrāds POPOVS (1), Ilze KLINTS (2), Andrejs TIMUHINS (3), Andis KALVĀNS (1), Aija DĒLIŅA (1), Tomas SAKS (1)
(1) Faculty of Geography and Earth Sciences, (2) Faculty of Physics and Mathematics, (3) Laboratory for Mathematical Modelling of Environmental and Technological Processes.
University of Latvia, Riga, Latvia. Contact: [email protected], www.puma.lu.lv
Class | Lithology | Horizontal conductivity, m/day | Vertical conductivity, m/day
1 | sandstone | 5 | 1
2 | sandstone - low conductivity | 1 | 0.5
3 | sandstone - high conductivity | 50 | 20
4 | limestone | 30 | 10
5 | limestone - low conductivity | 1 | 0.1
6 | dolomite - high conductivity | 100 | 30
7 | dolomite | 30 | 10
8 | domerite (clayey dolomite) | 0.1 | 0.01
9 | clay | 0.00001 | 0.00001
10 | loam | 0.001 | 0.001
11 | sandy loam | 0.001 | 0.001
12 | silt | 0.01 | 0.01
13 | sand | 5 | 5
14 | gravel | 20 | 20
15 | gypsum | 10 | 10
16 | peat | 10 | 10
17 | gyttja and other biogenic sediments | 0.0003 | 0.0003
18 | soil | 1 | 1
19 | other | 1 | 1
20 | concrete | 0 | 0
Table 1. Initial conductivity values from lithology classifier, used for generalization of borehole log structures for this study.
The degree of similarity and the spatial heterogeneity of the cluster polygons can be varied by flattening the hierarchical cluster model into different numbers of clusters. This provides a scalable generalization: with the help of the clustering dendrogram (Figure 2), it can be scaled from broad to more detailed, depending on the model and the structural characteristics of the modelling territory.
Using the NCD dissimilarity matrix, the borehole most similar to all the others in terms of lithological structure can be identified within the subset of boreholes belonging to a cluster. The log structure of this borehole is then applied throughout the territory of the corresponding spatial cluster. Spatial clusters belonging to the same logical cluster can be disjoint, lying in different parts of the modelling territory; they all share the same borehole log column, which represents the lithological structure throughout every spatial cluster with that logical cluster identifier.
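One way to pick such a representative borehole is to take the cluster member with the smallest summed distance to the other members (the medoid). The selection rule, function name and distance values below are assumptions for illustration; the text above only states that the most similar borehole is chosen from the NCD dissimilarity matrix.

```python
def most_typical(dist, members):
    """Index of the member with the smallest summed distance to all members."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


# Hypothetical NCD matrix for five boreholes.
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.15, 0.85, 0.9],
    [0.2, 0.15, 0.0, 0.95, 0.9],
    [0.9, 0.85, 0.95, 0.0, 0.1],
    [0.8, 0.9, 0.9, 0.1, 0.0],
]

# Hypothetical logical clusters: cluster identifier -> member borehole indices.
clusters = {0: [0, 1, 2], 1: [3, 4]}
typical = {cid: most_typical(dist, members) for cid, members in clusters.items()}
print(typical)
```

When two members tie, as in the two-member cluster here, `min` returns the first one listed, so a deterministic member order is needed for reproducible results.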
162 model geometries were prepared, incorporating different candidate representations of Quaternary strata built from spatial clusters, each with its most typical borehole column selected using the NCD metric. The lithological structure of each borehole log column is composed of the aggregate lithology types shown in Table 1, with their conductivity values.
To integrate these results into the geometry of the regional groundwater model MOSYS V1, the algorithm shown in Figure 3 calculates the height in meters above sea level of every mesh point in every newly created Quaternary layer.
After the layer surface calculations, the cluster membership of each mesh element is determined by intersecting the element centroids (white dots in Figure 4) with the spatial clusters.
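The centroid intersection step can be sketched with a plain ray-casting point-in-polygon test. This is a simplified illustration: the triangular element, the rectangular cluster polygons and all names are hypothetical, and real spatial clusters from a GIS layer may be multipart or contain holes, which this sketch does not handle.

```python
def centroid(vertices):
    """Mean of the vertex coordinates (the exact centroid for a triangle)."""
    n = len(vertices)
    return (sum(x for x, _ in vertices) / n, sum(y for _, y in vertices) / n)


def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices, not closed."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        # Count edges that a horizontal ray from the point to +x crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def cluster_of(element, cluster_polygons):
    """Identifier of the first cluster polygon containing the element centroid."""
    c = centroid(element)
    for cluster_id, polygon in cluster_polygons.items():
        if point_in_polygon(c, polygon):
            return cluster_id
    return None


# Hypothetical spatial clusters and one hypothetical triangular mesh element.
cluster_polygons = {
    12: [(0, 0), (10, 0), (10, 10), (0, 10)],
    13: [(10, 0), (20, 0), (20, 10), (10, 10)],
}
element = [(11, 1), (14, 1), (11, 4)]
print(cluster_of(element, cluster_polygons))
```

Using the centroid assigns each element to exactly one cluster even when the element straddles a cluster boundary.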
The performance of the resulting geometries is detailed in Figures 8-11. For both matrix variants (NCD and NCD+E), 82 model geometries based on different flattenings of the cluster dendrogram (Figure 2) were prepared and run in the MOSYS V1 modelling environment. From the initial model run results in Figures 8 and 10, three of the better performing geometries, one median geometry and the worst performing geometry were run through the MOSYS autocalibration routine (Klints et al. 2012).
This provided insight into their sensitivity to calibration through further optimization of hydraulic conductivity values in the Quaternary strata (Figure 12). The model geometry into which the Quaternary clustering experiment geometries were integrated had already been calibrated to convergence before these experiments.
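The selection of geometries for calibration can be sketched as ranking by squared error sum and taking the three lowest, the median and the highest. The error values, head values and function names below are hypothetical and only illustrate the ranking logic, not the actual model results.

```python
def squared_error_sum(modelled, observed):
    """Sum of squared differences between modelled and observed heads."""
    return sum((m - o) ** 2 for m, o in zip(modelled, observed))


def pick_for_calibration(errors, n_best=3):
    """Return the n_best lowest-error geometries, the median one and the worst one."""
    ranked = sorted(errors, key=errors.get)
    return ranked[:n_best], ranked[len(ranked) // 2], ranked[-1]


# Hypothetical modelled and observed heads in meters for one geometry.
print(squared_error_sum([1.0, 2.0], [0.0, 4.0]))

# Hypothetical squared error sums per geometry (number of clusters as key).
errors = {
    "12c": 1052.0, "13c": 1053.5, "14c": 1055.0, "40c": 1090.0,
    "74c": 1101.0, "79c": 1135.0, "80c": 1120.0,
}
best, median, worst = pick_for_calibration(errors)
print(best, median, worst)
```

Calibrating a spread of geometries rather than only the best ones shows whether a poor starting geometry can be rescued by calibration alone.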
Fig. 7. The previous implementation of Quaternary strata in the MOSYS V1 modelling system consisted of 4 layers: two sets of aquifer-aquitard transitions. The arrow indicates the border between the previous Quaternary implementation (left) and the new Quaternary implementation, using hierarchical clustering of Normalized Compression Distance (right).
Fig. 12. Selected experimental model geometries and their response to calibration of model Quaternary strata.
The response of the selected new model geometries to calibration indicates their influence on the underlying model strata, not just improved performance in the part of the model representing the Quaternary deposits.
Geometries NCD 12c and NCD 13c were calibrated further, including the remaining bedrock layers as optimization parameters. This yielded a further decrease in the squared error sum, which is used as the overall fitness function for model performance against observed pressure heads.
Examples of cross-sections from the NCD 12c structure are shown in Figures 6 and 7.
Fig. 1. Overview of the full experiment workflow for using Normalized Compression Distance and hierarchical clustering as an approach for scalable generalization of Quaternary sediment lithological structure in the MOSYS regional groundwater modelling system for the Baltic artesian basin territory.
[Figure 8 plot: "Number of logical clusters against squared sum of head errors - NCD matrix". X axis: number of logical clusters after dendrogram flattening. Y axis: squared error sum in meters (smaller is better), 1050 to 1140.]
Fig. 8. Squared sum of errors against geometries prepared from different clustering solutions using NCD matrix.
[Figure 10 plot: "Number of logical clusters against squared sum of head errors - NCD+E matrix". X axis: number of logical clusters after dendrogram flattening. Y axis: squared error sum in meters (smaller is better), 1050 to 1750.]
Fig. 10. Squared sum of errors against geometries prepared from different clustering solutions using NCD+E matrix.
[Plot: "Target function values against optimization iteration count. Optimization runs for Quaternary strata in 5 model geometries for each of NCD and NCD + E matrices". X axis: number of iterations for MOSYS auto-calibration routine, 1 to 391. Y axis: squared error sum in meters (smaller is better), 850 to 1150. Series: NCD 12c, NCD 13c, NCD 14c, NCD 74c, NCD 79c, NCD+E 7c, NCD+E 67c, NCD+E 69c, NCD+E 70c, NCD+E 73c.]