Proceedings of the
8th International Symposium on Geotechnical Safety and Risk (ISGSR)
14 – 16 December 2022, Newcastle, Australia
Editors: Jinsong Huang, D.V. Griffiths, Shui-Hua Jiang, Anna Giacomini, Richard Kelly
doi:10.3850/978-981-18-5182-7_02-006-cd
Gaussian Process Regression and Kernel Selection for Missing Geotechnical Data Prediction
Discipline of Civil, Surveying and Environmental Engineering, The University of Newcastle, Callaghan, NSW 2308, Australia.
ABSTRACT
Geotechnical site investigation data (e.g., cone penetration test (CPT) data) are sometimes missing because of, for example, sensor failure or storage problems. Conventionally, missing data are filled in by mean value imputation or linear interpolation, both of which ignore the spatial correlation within the data. Geostatistical interpolation methods such as kriging account for spatial correlation explicitly. However, kriging requires estimating model parameters such as the scale of fluctuation in the covariance model, which is challenging. The Gaussian Process Regression (GPR) method instead infers the model parameters by maximizing the marginal likelihood. The performance of GPR, however, depends strongly on kernel selection. This paper compares the performance of nine widely used base kernels. Ninety new kernels are generated by combining the base kernels to enhance the adaptability of the GPR model. Each kernel is tested on four types of stratum with increasingly complex profiles, based on multiple CPTs. The most suitable kernels for each type of stratum are suggested based on cross-validation with more than one thousand models. The proposed method is applied to a real-world CPT dataset to demonstrate its applicability and robustness.
Keywords: Missing data, Gaussian process regression, spatial correlation, kernel selection
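The workflow summarized in the abstract (build candidate kernels from base kernels, score them by cross-validation, then predict the missing value with the best one) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two base kernels (`rbf`, `linear`), the fixed length scale, the made-up depth and cone resistance values, and the helper names `k_sum`, `k_prod`, `solve_spd`, `gp_mean` and `loo_rmse`. For brevity the hyperparameters are held fixed rather than inferred by maximizing the marginal likelihood, so this is a simplified stand-in for the GPR procedure, not a reproduction of it.

```python
import math

def rbf(length):
    """Squared-exponential base kernel with a fixed (assumed) length scale."""
    return lambda a, b: math.exp(-0.5 * ((a - b) / length) ** 2)

def linear():
    """Linear base kernel."""
    return lambda a, b: a * b

def k_sum(k1, k2):
    """Combine two kernels by addition (the sum of valid kernels is valid)."""
    return lambda a, b: k1(a, b) + k2(a, b)

def k_prod(k1, k2):
    """Combine two kernels by multiplication (the product is also valid)."""
    return lambda a, b: k1(a, b) * k2(a, b)

def solve_spd(A, b):
    """Solve A x = b for a symmetric positive-definite A via Cholesky."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    y = []  # forward substitution: L y = b
    for i in range(n):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    x = [0.0] * n  # back substitution: L^T x = y
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_mean(kernel, depths, values, query, jitter=1e-8):
    """GP posterior mean at `query`, using a constant (sample-mean) prior mean."""
    mean = sum(values) / len(values)
    K = [[kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(depths)] for i, a in enumerate(depths)]
    alpha = solve_spd(K, [v - mean for v in values])
    return mean + sum(kernel(query, d) * w for d, w in zip(depths, alpha))

def loo_rmse(kernel, depths, values):
    """Leave-one-out cross-validation error for one candidate kernel."""
    sq_errs = []
    for i in range(len(depths)):
        d = depths[:i] + depths[i + 1:]
        v = values[:i] + values[i + 1:]
        sq_errs.append((gp_mean(kernel, d, v, depths[i]) - values[i]) ** 2)
    return math.sqrt(sum(sq_errs) / len(sq_errs))

# Made-up sounding: depth (m) and cone resistance (MPa); 1.5 m is "missing".
depths = [0.5, 1.0, 2.0, 2.5, 3.0]
qc = [2.0, 3.0, 5.0, 6.0, 7.0]

candidates = {
    "rbf": rbf(1.0),
    "rbf+linear": k_sum(rbf(1.0), linear()),
    "rbf*linear": k_prod(rbf(1.0), linear()),
}
scores = {name: loo_rmse(k, depths, qc) for name, k in candidates.items()}
best = min(scores, key=scores.get)
filled = gp_mean(candidates[best], depths, qc, 1.5)
print(best, round(scores[best], 4), round(filled, 3))
```

In practice a library implementation that optimizes the kernel hyperparameters against the marginal likelihood would replace the fixed length scale used above, and the candidate set would be far larger than these three kernels.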