PROYECTO FINAL¶
CURSO INTRODUCCIÓN A PYTHON PARA CIENCIA DE DATOS
Participante: Mónica Retamosa Izaguirre
Introducción¶
El pájaro campana (Procnias tricarunculatus) es una especie de ave paseriforme de la familia Cotingidae. Es nativo de América Central. Se distribuye por las tierras altas al este de Honduras, noroeste de Nicaragua, Costa Rica y Panamá. En los inviernos habita en las tierras bajas adyacentes. Es un migrante altitudinal que realiza migraciones complejas (Powell & Bjork, 2004). En la época reproductiva vive en los niveles alto y medio del bosque húmedo de montaña, entre los 1200 y 2100 m de altitud. Fuera de la época reproductiva puede ascender hasta los 3000 msnm y descender a las tierras bajas a 600 m de altitud (Brant et al. 2020).
Esta especie está catalogada por la lista roja de la UICN como vulnerable, con su población total estimada entre 3600 y 14 000 individuos maduros, considerada en rápida decadencia debido a la pérdida de hábitat y su degradación (BirdLife International 2021). Por lo tanto, es importante conocer su distribuición y abundancia, para tomar decisiones de conservación bien informadas.
Objetivo del proyecto¶
Analizar la distribución de la especie pájaro campana (Procnias tricarunculatus) en Costa Rica.
Métodología¶
1- Descarga de los datos del pájaro campana de GBIF
2- Depuración datos:
- Eliminación columnas que no aportan información
- Identificación de datos faltantes
- Identificación de registros duplicados
3- Análisis exploratorio de datos:
- Ocurrencia del pájaro campana por provincia de Costa Rica
- Ocurrencia del pájaro campana en Costa Rica por año
4.- Mapeo de distribución de la especie
- Descarga de datos de GADM
- Clasificación de Costa Rica en provincias
- Conversión de observaciones a puntos
- Mapeo de observaciones sobre el mapa de provincias de Costa Rica
Aplicación de metodología y resultados¶
1.- Descarga de datos GBIF
Procedencia de los datos: Los datos de presencia del pájaro campana proceden de los registros almacenados en la plataforma del Fondo Mundial de Información sobre Biodiversidad (GBIF.org, 2023)
2.- Depuración de datos
Importación de librerías y carga de datos
# Importar librerías
!pip install pandas
!pip install ydata-profiling
Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (2.0.3) Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas) (2.8.2) Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2023.4) Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2024.1) Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.10/dist-packages (from pandas) (1.25.2) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas) (1.16.0) Collecting ydata-profiling Downloading ydata_profiling-4.8.3-py2.py3-none-any.whl (359 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 359.5/359.5 kB 8.3 MB/s eta 0:00:00 Requirement already satisfied: scipy<1.14,>=1.4.1 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (1.11.4) Requirement already satisfied: pandas!=1.4.0,<3,>1.1 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (2.0.3) Requirement already satisfied: matplotlib<3.9,>=3.2 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (3.7.1) Requirement already satisfied: pydantic>=2 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (2.7.2) Requirement already satisfied: PyYAML<6.1,>=5.0.0 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (6.0.1) Requirement already satisfied: jinja2<3.2,>=2.11.1 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (3.1.4) Collecting visions[type_image_path]<0.7.7,>=0.7.5 (from ydata-profiling) Downloading visions-0.7.6-py3-none-any.whl (104 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 104.8/104.8 kB 13.0 MB/s eta 0:00:00 Requirement already satisfied: numpy<2,>=1.16.0 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (1.25.2) Collecting htmlmin==0.1.12 (from ydata-profiling) Downloading htmlmin-0.1.12.tar.gz (19 kB) Preparing metadata (setup.py) ... done Collecting phik<0.13,>=0.11.1 (from ydata-profiling) Downloading phik-0.12.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (686 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 686.1/686.1 kB 26.2 MB/s eta 0:00:00 Requirement already satisfied: requests<3,>=2.24.0 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (2.31.0) Requirement already satisfied: tqdm<5,>=4.48.2 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (4.66.4) Requirement already satisfied: seaborn<0.14,>=0.10.1 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (0.13.1) Collecting multimethod<2,>=1.4 (from ydata-profiling) Downloading multimethod-1.11.2-py3-none-any.whl (10 kB) Requirement already satisfied: statsmodels<1,>=0.13.2 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (0.14.2) Collecting typeguard<5,>=3 (from ydata-profiling) Downloading typeguard-4.3.0-py3-none-any.whl (35 kB) Collecting imagehash==4.3.1 (from ydata-profiling) Downloading ImageHash-4.3.1-py2.py3-none-any.whl (296 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 296.5/296.5 kB 25.4 MB/s eta 0:00:00 Requirement already satisfied: wordcloud>=1.9.1 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (1.9.3) Collecting dacite>=1.8 (from ydata-profiling) Downloading dacite-1.8.1-py3-none-any.whl (14 kB) Requirement already satisfied: numba<1,>=0.56.0 in /usr/local/lib/python3.10/dist-packages (from ydata-profiling) (0.58.1) Requirement already satisfied: PyWavelets in /usr/local/lib/python3.10/dist-packages (from imagehash==4.3.1->ydata-profiling) (1.6.0) Requirement already satisfied: pillow in /usr/local/lib/python3.10/dist-packages (from imagehash==4.3.1->ydata-profiling) (9.4.0) Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2<3.2,>=2.11.1->ydata-profiling) (2.1.5) Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.9,>=3.2->ydata-profiling) (1.2.1) Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.9,>=3.2->ydata-profiling) (0.12.1) Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.9,>=3.2->ydata-profiling) (4.52.4) Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.9,>=3.2->ydata-profiling) (1.4.5) Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.9,>=3.2->ydata-profiling) (24.0) Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.9,>=3.2->ydata-profiling) (3.1.2) Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.9,>=3.2->ydata-profiling) (2.8.2) Requirement already satisfied: llvmlite<0.42,>=0.41.0dev0 in /usr/local/lib/python3.10/dist-packages (from numba<1,>=0.56.0->ydata-profiling) (0.41.1) Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas!=1.4.0,<3,>1.1->ydata-profiling) (2023.4) Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas!=1.4.0,<3,>1.1->ydata-profiling) (2024.1) Requirement already satisfied: joblib>=0.14.1 in /usr/local/lib/python3.10/dist-packages (from phik<0.13,>=0.11.1->ydata-profiling) (1.4.2) Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic>=2->ydata-profiling) (0.7.0) Requirement already satisfied: pydantic-core==2.18.3 in /usr/local/lib/python3.10/dist-packages (from pydantic>=2->ydata-profiling) (2.18.3) Requirement already satisfied: typing-extensions>=4.6.1 in /usr/local/lib/python3.10/dist-packages (from pydantic>=2->ydata-profiling) (4.12.0) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.24.0->ydata-profiling) (3.3.2) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.24.0->ydata-profiling) (3.7) Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.24.0->ydata-profiling) (2.0.7) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.24.0->ydata-profiling) (2024.2.2) Requirement already satisfied: patsy>=0.5.6 in /usr/local/lib/python3.10/dist-packages (from statsmodels<1,>=0.13.2->ydata-profiling) (0.5.6) Requirement already satisfied: attrs>=19.3.0 in /usr/local/lib/python3.10/dist-packages (from visions[type_image_path]<0.7.7,>=0.7.5->ydata-profiling) (23.2.0) Requirement already satisfied: networkx>=2.4 in /usr/local/lib/python3.10/dist-packages (from visions[type_image_path]<0.7.7,>=0.7.5->ydata-profiling) (3.3) Requirement already satisfied: six in /usr/local/lib/python3.10/dist-packages (from patsy>=0.5.6->statsmodels<1,>=0.13.2->ydata-profiling) (1.16.0) Building wheels for collected packages: htmlmin Building wheel for htmlmin (setup.py) ... done Created wheel for htmlmin: filename=htmlmin-0.1.12-py3-none-any.whl size=27080 sha256=dec717f5ad23ee5b98f48735903afc56b79667365f782f26f2bbc583ed32ef93 Stored in directory: /root/.cache/pip/wheels/dd/91/29/a79cecb328d01739e64017b6fb9a1ab9d8cb1853098ec5966d Successfully built htmlmin Installing collected packages: htmlmin, typeguard, multimethod, dacite, imagehash, visions, phik, ydata-profiling Successfully installed dacite-1.8.1 htmlmin-0.1.12 imagehash-4.3.1 multimethod-1.11.2 phik-0.12.4 typeguard-4.3.0 visions-0.7.6 ydata-profiling-4.8.3
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from ydata_profiling import ProfileReport
# Importar datos como dataframe pandas
df = pd.read_csv('Ocurrencias_pajaro_campana_cr.csv', encoding='latin-1')
<ipython-input-14-05ad2ed9349a>:2: DtypeWarning: Columns (64,65,99,100,101,102,103,104,106,107,108,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,128,129,130,131,132,133,134,135,136,137,138,144) have mixed types. Specify dtype option on import or set low_memory=False. df = pd.read_csv('Ocurrencias_pajaro_campana_cr.csv', encoding='latin-1')
Limpieza y Depuración de datos¶
df.columns
Index(['wkt_geom', 'wkt_geom.1', 'fid', 'key', 'datasetKey', 'publishing', 'installati', 'publishi_1', 'protocol', 'lastCrawle', ... 'higherGe_1', 'locationID', 'associated', 'elevation', 'elevationA', 'depth', 'depthAccur', 'organismQu', 'organism_1', 'Provincia'], dtype='object', length=146)
nombre = "Observaciones pájaro campana en Costa Rica"
profile = ProfileReport(df, title=nombre, explorative=True)
# Mostrar el informe en un notebook (si estás usando Jupyter o similares)
profile.to_notebook_iframe()
Summarize dataset: 0%| | 0/5 [00:00<?, ?it/s]
Generate report structure: 0%| | 0/1 [00:00<?, ?it/s]
Render HTML: 0%| | 0/1 [00:00<?, ?it/s]