Package water_security

Water Security - Iceland

The current climate change scenario predicts that, by 2050, almost half of the world’s population will live in areas of high water stress, with limited access to fresh, clean water. Governments, national and international institutions, and water management companies are looking for solutions that can address this growing global water demand. Cities are encouraged to take action on water security, build resilience to water scarcity, and manage this finite resource for the future.

Based on financial, educational, environmental, and demographic data, this project aims to display and predict water security risks around the world. To do so, a regression machine learning pipeline is deployed per risk category (e.g. risk of higher water prices or risk of declining water quality), and it forecasts the severity of that risk. The regression model is built on features engineered and selected from the data mentioned above.

The final dataset has already been created. If needed, it can be regenerated by running the following notebooks in the given order:

  1. prep_hdro_v2
  2. combine_unlabeled
  3. Dataset Normalization and Imputation
  4. Cities Test Set Processing
  5. Merge Unlabeled to Labeled
  6. CitiesPopulationDensity
  7. Cities Elevation
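The ordered list above can also be executed from a script. The following is a rough sketch, assuming that each notebook is stored as `<name>.ipynb` in a `notebooks` directory and that the `jupyter nbconvert` command-line tool is installed; both the directory layout and the helper names `build_commands` and `run_all` are assumptions, not part of the package.

```python
import subprocess
from pathlib import Path

# Notebook names in the order listed above. The ".ipynb" suffix and the
# "notebooks" directory are assumptions about the repository layout.
NOTEBOOKS = [
    "prep_hdro_v2",
    "combine_unlabeled",
    "Dataset Normalization and Imputation",
    "Cities Test Set Processing",
    "Merge Unlabeled to Labeled",
    "CitiesPopulationDensity",
    "Cities Elevation",
]


def build_commands(notebook_dir="notebooks"):
    """Return one nbconvert command per notebook, preserving the order."""
    return [
        [
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute",
            "--inplace",
            str(Path(notebook_dir) / f"{name}.ipynb"),
        ]
        for name in NOTEBOOKS
    ]


def run_all(notebook_dir="notebooks"):
    """Execute the notebooks one after another, stopping at the first failure."""
    for command in build_commands(notebook_dir):
        subprocess.run(command, check=True)
```

Calling `run_all()` from the repository root would then regenerate the dataset step by step. Because `check=True` raises on a non-zero exit code, a failing notebook stops the run before later steps consume incomplete output.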

To generate a classification report, run the Classification Outcome Visualization.ipynb notebook.

To access the classification-specific data and the notebook that was used to select the XGBoost parameters, see the classification page.

Sub-modules

water_security.classification

Classification: The classification module contains classes for risk prediction …

water_security.labeled_preprocessing

Labeled Preprocessing: Here the initial data from cdp.net regarding water security is preprocessed. The merging of the …

water_security.unlabeled_preprocessing

Unlabeled Preprocessing: Here, data from various sources are preprocessed to have as much useful data to feed into the model as possible. The …

water_security.utils