Impact Factor (2025): 6.9
DOI Prefix: 10.47001/IRJIET
Advances in data storage capacity and computational capability have made it
possible to record time series at increasingly narrow intervals, known as
high-frequency time series data. Sensor data, a prominent example of
high-frequency time series generated by the Internet of Things (IoT), is
prone to missing values because devices can fail. Furthermore,
both the quantity and quality of data significantly impact the performance of
forecasting models. This study examines the effects of imputing missing data
within a forecasting workflow for sensor data that records water levels at four
observation sites. The analysis evaluates the forecasting outcomes of the
IMV-LSTM (Interpretable Multi-Variable Long Short-Term Memory) model trained
on data reconstructed through imputation methods. The results indicate that
data imputed with the Kalman-Structural method enhance forecast accuracy,
evidenced by a 32% reduction in RMSE compared to the benchmark, a model
trained on data without imputation treatment. Additionally, data imputed with
Kalman-ARIMA improve the performance of the IMV-LSTM model, yielding a 29%
lower RMSE than the benchmark. The forecasts of the best-performing model
deviate from the actual water levels by only about 0.1%.
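
The imputation and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the simplest structural model (a local level, i.e. a random walk observed with noise), fixed and purely illustrative variances `q` and `r`, and hypothetical helper names (`kalman_impute`, `rmse`, `pct_reduction`). Missing readings are marked as `None` and filled with the Kalman-smoothed state, and the relative RMSE reduction is computed the way a "32% lower RMSE than the benchmark" statement is normally read.

```python
import math

def kalman_impute(series, q=1.0, r=0.1):
    """Fill None entries using a local-level Kalman filter and smoother.

    q is the state (level) variance and r the observation variance;
    both are assumed values, not estimates from the paper.
    """
    n = len(series)
    first = next(v for v in series if v is not None)
    a, p = first, 1e6  # vague (near-diffuse) initial state
    a_pred, p_pred, a_filt, p_filt = [], [], [], []
    for y in series:
        ap, pp = a, p + q  # prediction step
        if y is None:
            a, p = ap, pp  # no observation: carry the prediction forward
        else:
            k = pp / (pp + r)  # Kalman gain
            a, p = ap + k * (y - ap), (1 - k) * pp
        a_pred.append(ap)
        p_pred.append(pp)
        a_filt.append(a)
        p_filt.append(p)
    # Backward (Rauch-Tung-Striebel) pass so gaps also use later readings.
    smooth = a_filt[:]
    for t in range(n - 2, -1, -1):
        c = p_filt[t] / p_pred[t + 1]
        smooth[t] = a_filt[t] + c * (smooth[t + 1] - a_pred[t + 1])
    # Observed values are kept; only the gaps are replaced.
    return [smooth[t] if y is None else y for t, y in enumerate(series)]

def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )

def pct_reduction(benchmark_rmse, model_rmse):
    """Relative RMSE reduction versus the benchmark, in percent."""
    return 100.0 * (benchmark_rmse - model_rmse) / benchmark_rmse

# Hypothetical water-level readings with one missing value.
levels = [1.0, 2.0, None, 4.0, 5.0]
filled = kalman_impute(levels)
```

In practice the variances would be estimated from the data rather than fixed, and a richer structural or ARIMA state-space form would replace the local level; the filled series would then be passed to the forecasting model for training.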
Country: Indonesia
IRJIET, Volume 9, Issue 6, June 2025 pp. 142-148