Getting Shorebird Data in Australia Using Global Biodiversity Data Portals via spocc R Package
Author details: Abhimanyu Raj Singh
Editor details: Xiang Zhao
Contact details: support@ecocommons.org.au
Copyright statement: This script is the product of the EcoCommons platform. Please refer to the EcoCommons website for more details: https://www.ecocommons.org.au/
Date: June 2025
🚀 Before You Start
Ensure you have a stable internet connection, as species data is fetched live from biodiversity data portals using the spocc R package.
This notebook may take a few minutes to run when downloading large datasets or installing missing packages.
For the best experience, run it in RStudio, Posit Workbench, or VS Code with Quarto installed.
Introduction
This notebook was developed by the EcoCommons team to support researchers and conservationists in gathering shorebird distribution data from open data sources. It demonstrates how to query, clean, visualise, and save species occurrence data using the spocc R package (Chamberlain et al., 2024).
spocc – Species Occurrence Data Aggregator
The spocc package provides a unified interface to search and retrieve species occurrence data from multiple biodiversity data sources into a single workflow. It streamlines data acquisition for ecological modelling and biodiversity analysis by handling different APIs in one place.
We will introduce three of the largest and most widely used biodiversity data portals, each of which lets users download data through its own platform or through R packages.
GBIF – Global Biodiversity Information Facility: An international network providing open access to data on all types of life on Earth, supporting global biodiversity research and conservation.
iNaturalist – Citizen Science Biodiversity Network: A community-driven platform where people record and share species observations, contributing to real-time biodiversity mapping across the globe.
OBIS – Ocean Biodiversity Information System: The world’s largest marine biodiversity database, offering access to global data on the distribution of marine life for science, policy, and conservation.
We focus on three key migratory shorebird species, each represented by a distinct colour for clarity in visualisations:
Bar-tailed Godwit (Limosa lapponica) — Green #509E2F
Red-necked Stint (Calidris ruficollis) — Blue #2F6C99
Each year, Australia welcomes millions of migratory species—from delicate shorebirds to majestic seabirds—journeying thousands of kilometres along flyways like the East Asian–Australasian Flyway.
Species such as the Bar-tailed Godwit, Red-necked Stint, and Curlew Sandpiper arrive from as far as Siberia and Alaska, using Australia’s rich wetlands as critical stopovers and overwintering sites. These incredible travellers embody the resilience of nature and highlight the importance of conserving international migratory routes and coastal habitats.
Objectives:
Step | Description
1. Set the Working Directory | Prepare the environment and load necessary R packages.
2. Get Data | Retrieve occurrence data for target shorebird species from GBIF, iNaturalist, and OBIS. Merge all sources together.
3. Data Cleaning and Filtering | Clean the data, remove invalid points, and restrict to Australia’s boundary.
4. Data Visualisation | Visualise cleaned occurrences interactively using Leaflet maps with species-specific colours assigned explicitly to each species.
5. Save Data | Save the final datasets as CSV and Shapefile inside the data/ folder.
In the near future, this material may form part of the comprehensive support materials available to EcoCommons users. If you have any corrections or suggestions to improve this notebook, please contact the EcoCommons team.
Step 1: Set the working directory
Set the working directory and prepare the environment. Install necessary packages if missing and load them.
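The setup step might look something like the sketch below. The package list is inferred from the references and the start-up messages in this notebook, and the CRAN mirror URL is an assumption; adjust both to match your own environment.

```r
# Packages this sketch assumes the notebook relies on (inferred, not confirmed)
required_packages <- c("spocc", "tidyverse", "sf", "leaflet",
                       "rnaturalearth", "rnaturalearthdata")

# Install only the packages that are not already available
missing_packages <- setdiff(required_packages, rownames(installed.packages()))
if (length(missing_packages) > 0) {
  # The mirror URL is an assumption; any CRAN mirror will do
  install.packages(missing_packages, repos = "https://cloud.r-project.org")
}

# Attach every package; invisible() hides the list returned by lapply()
invisible(lapply(required_packages, library, character.only = TRUE))

# Report the current working directory and create the output folder if needed
message("Working directory: ", getwd())
dir.create("data", showWarnings = FALSE)
```

Loading the packages this way prints start-up messages similar to the ones shown next.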
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Linking to GEOS 3.13.0, GDAL 3.8.5, PROJ 9.5.1; sf_use_s2() is TRUE
Attaching package: 'rnaturalearthdata'
The following object is masked from 'package:rnaturalearth':
countries110
Step 2: Get Data
We define the target species and retrieve occurrence data from GBIF, iNaturalist, and OBIS. We then combine all sources into a single tidy data frame for further processing.
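A minimal sketch of this retrieval step is shown below, using `occ()` and `occ2df()` from spocc. It covers only the two species whose colours are listed in the introduction. The helper `fetch_species()`, the record limit, and the rough bounding box around Australia are illustrative assumptions rather than the notebook's actual settings.

```r
library(spocc)
library(dplyr)

# Target species (scientific names) and the three data sources
species <- c("Limosa lapponica", "Calidris ruficollis")
sources <- c("gbif", "inat", "obis")

# Assumed values: a rough bounding box (min lon, min lat, max lon, max lat)
# and a cap on records per species per source
aus_bbox <- c(112, -44, 154, -10)
record_limit <- 200

# Hypothetical helper: query all sources for one species and return a
# small, consistently typed table
fetch_species <- function(sp) {
  res <- occ(query = sp, from = sources, limit = record_limit,
             has_coords = TRUE, geometry = aus_bbox)
  df <- occ2df(res)
  if (nrow(df) == 0) return(NULL)
  tibble(
    species = sp,                        # the name we searched for
    provider_name = df$name,             # the name as reported by the source
    longitude = as.numeric(df$longitude),
    latitude = as.numeric(df$latitude),
    prov = df$prov                       # source code: gbif, inat, or obis
  )
}

# Merge the per-species results into one data frame
occ_df <- bind_rows(lapply(species, fetch_species))

# Quick check of how many records came from each source
count(occ_df, species, prov)
```

Keeping the queried name in its own `species` column is a design choice: providers often return names with authorship attached, which makes them awkward to use as map legend keys later.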
Step 3: Data Cleaning and Filtering
We clean the data by removing missing coordinates and converting points into spatial format. We then filter records to only those falling within Australia’s political boundaries.
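The cleaning logic can be sketched as follows. To keep the example self-contained, it uses a small made-up table in place of the merged download; the coordinates are arbitrary test values, not real observations. The boundary comes from `rnaturalearth::ne_countries()`, and the choice of the medium-scale polygon is an assumption.

```r
library(dplyr)
library(sf)
library(rnaturalearth)

# Toy stand-in for the merged occurrence table (arbitrary test coordinates)
occ_df <- tibble(
  species = c("Limosa lapponica", "Calidris ruficollis",
              "Limosa lapponica", "Calidris ruficollis"),
  longitude = c("149.1", "133.9", NA, "174.8"),
  latitude = c("-35.3", "-23.7", "-33.9", "-36.9"),
  prov = c("gbif", "inat", "obis", "gbif")
)

# 1. Coerce coordinates to numeric and drop missing or impossible values
occ_clean <- occ_df |>
  mutate(longitude = as.numeric(longitude),
         latitude = as.numeric(latitude)) |>
  filter(!is.na(longitude), !is.na(latitude)) |>
  filter(between(longitude, -180, 180), between(latitude, -90, 90))

# 2. Convert to an sf point layer in WGS 84, keeping the coordinate columns
occ_sf <- st_as_sf(occ_clean, coords = c("longitude", "latitude"),
                   crs = 4326, remove = FALSE)

# 3. Load Australia's boundary and align it with the points' CRS
australia <- ne_countries(country = "Australia", scale = "medium",
                          returnclass = "sf") |>
  st_make_valid() |>
  st_transform(st_crs(occ_sf))

# 4. Keep only the points that intersect the boundary polygon
occ_aus <- st_filter(occ_sf, australia)
nrow(occ_aus)
```

In this toy table the row with a missing longitude and the point near New Zealand are dropped. Note that a generalised country polygon may exclude genuine records on tidal flats or just offshore, so a small buffer around the boundary may be worth considering for shorebirds.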
Step 4: Data Visualisation
We use Leaflet to plot the cleaned occurrences on an interactive map. Each species is colour-coded by directly mapping species names to specific colours.
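One way to do the direct name-to-colour mapping is with a named character vector, as in the sketch below. The two hex codes are the ones listed in the introduction. The sample points are made-up placeholders, and the marker styling is an assumption.

```r
library(leaflet)

# Species-to-colour lookup, using the colours given in the introduction
species_colours <- c(
  "Limosa lapponica" = "#509E2F",    # Bar-tailed Godwit, green
  "Calidris ruficollis" = "#2F6C99"  # Red-necked Stint, blue
)

# Toy points standing in for the cleaned occurrences (arbitrary coordinates)
occ_points <- data.frame(
  species = c("Limosa lapponica", "Calidris ruficollis", "Limosa lapponica"),
  longitude = c(149.1, 133.9, 146.8),
  latitude = c(-35.3, -23.7, -19.3)
)

# Look up each record's colour by its species name
occ_points$colour <- unname(species_colours[occ_points$species])

# Build the interactive map with one circle marker per record
shorebird_map <- leaflet(occ_points) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~longitude, lat = ~latitude,
                   color = ~colour, radius = 5,
                   stroke = FALSE, fillOpacity = 0.8,
                   popup = ~species) %>%
  addLegend(position = "bottomright",
            colors = unname(species_colours),
            labels = names(species_colours),
            title = "Species")

shorebird_map
```

A plain lookup vector keeps the colours fixed regardless of which species happen to be present in the data, which a palette generated from the data would not guarantee.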
Step 5: Save Data
We save the final occurrence dataset as both a CSV file and a Shapefile. The files are stored in a “data” folder to support further analysis or GIS integration.
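The export might be written roughly as below. The Shapefile path matches the layer name in the log output of this notebook. The CSV file name and the toy `occ_aus` object are assumptions made so the sketch runs on its own; replace the toy object with your cleaned sf layer.

```r
library(sf)

# Toy stand-in for the cleaned, Australia-only sf layer (arbitrary coordinates)
occ_aus <- st_as_sf(
  data.frame(
    species = c("Limosa lapponica", "Calidris ruficollis"),
    longitude = c(149.1, 133.9),
    latitude = c(-35.3, -23.7),
    prov = c("gbif", "inat")
  ),
  coords = c("longitude", "latitude"), crs = 4326, remove = FALSE
)

# Make sure the output folder exists
dir.create("data", showWarnings = FALSE)

# CSV: drop the geometry column, since longitude and latitude are kept as fields
# (the CSV file name is an assumption)
write.csv(st_drop_geometry(occ_aus),
          "data/shorebird_occurrences_AUS.csv", row.names = FALSE)

# Shapefile: delete_layer = TRUE overwrites any existing layer of the same name
st_write(occ_aus, "data/shorebird_occurrences_AUS.shp", delete_layer = TRUE)
```

Shapefile field names are limited to 10 characters, so short column names such as `species` and `prov` survive the export unchanged.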
Deleting layer `shorebird_occurrences_AUS' using driver `ESRI Shapefile'
Writing layer `shorebird_occurrences_AUS' to data source
`data/shorebird_occurrences_AUS.shp' using driver `ESRI Shapefile'
Writing 1772 features with 4 fields and geometry type Point.
References
Chamberlain, S., Barve, V., Mcglinn, D., Oldoni, D., Desmet, P., Geffert, L., & Ram, K. (2024). spocc: Interface to Species Occurrence Data Sources. R package version 1.2.4. Available at: https://CRAN.R-project.org/package=spocc
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L.D., François, R., et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Cheng, J., Karambelkar, B., & Xie, Y. (2023). leaflet: Create Interactive Web Maps with the JavaScript ‘Leaflet’ Library. R package version 2.1.2.
Pebesma, E. (2018). Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal, 10(1), 439–446. https://doi.org/10.32614/RJ-2018-009
South, A. (2017). rnaturalearth: World Map Data from Natural Earth. R package version 0.1.0.
South, A. (2017). rnaturalearthdata: World Vector Map Data from Natural Earth Used in ‘rnaturalearth’. R package version 0.1.0.
EcoCommons received investment (https://doi.org/10.3565/chbq-mr75) from the Australian Research Data Commons (ARDC). The ARDC is enabled by the National Collaborative Research Infrastructure Strategy (NCRIS).
How to Cite EcoCommons
If you use EcoCommons in your research, please cite the platform as follows: