#The reticulate package provides a comprehensive set of tools for interoperability between Python and R.
library(reticulate)
Access via API and Python
{afscgap} Library Installation
author: Sam Pottinger (sam.pottinger@berkeley.edu; GitHub::sampottinger) date: May 13, 2023
The third-party afscgap Python package interfaces with FOSS to access AFSC GAP data. It can be installed via pip:
pip install afscgap
pip install git+https://github.com/SchmidtDSE/afscgap.git@main
For more information on installation and deployment, see the library documentation.
Basic query
This first example queries for Pacific glass shrimp (Pasiphaea pacifica) in the Gulf of Alaska in 2021. The library will automatically generate HTTP queries, converting from Python types to ORDS query syntax.
import afscgap
query = afscgap.Query()
query.filter_year(eq=2021)
query.filter_srvy(eq='GOA')
query.filter_scientific_name(eq='Pasiphaea pacifica')
results = query.execute()
The results variable in this example is an iterator that will automatically perform pagination behind the scenes.
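The pagination behavior described above can be pictured as a generator that requests the next page only when the current one is exhausted. The following is a minimal sketch under stated assumptions: fetch_page, iterate_pages, and FAKE_RECORDS are hypothetical stand-ins invented for illustration, and this is not how the afscgap package itself is implemented.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical stand-in for a paged API: 5 records in total.
FAKE_RECORDS = [{'haul': i} for i in range(5)]


def fetch_page(offset: int, limit: int) -> List[Dict]:
    """Return one page of records (stand-in for an HTTP request)."""
    return FAKE_RECORDS[offset:offset + limit]


def iterate_pages(fetch: Callable[[int, int], List[Dict]],
                  limit: int = 2) -> Iterator[Dict]:
    """Yield records one by one, requesting the next page only when needed."""
    offset = 0
    while True:
        page = fetch(offset, limit)
        if not page:
            return  # An empty page signals the end of the results
        yield from page
        offset += len(page)


hauls = [record['haul'] for record in iterate_pages(fetch_page)]
print(hauls)
```

The caller only sees a flat stream of records, which is why a plain for loop or list comprehension is enough to consume a paginated result.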
Iterating with a for loop
The easiest way to interact with results is a simple for loop. This next example determines the frequency of different catch per unit effort where Pacific glass shrimp were reported:
import afscgap
# Mapping from CPUE to count
count_by_cpue = {}
# Build query
query = afscgap.Query()
query.filter_year(eq=2021)
query.filter_srvy(eq='GOA')
query.filter_scientific_name(eq='Pasiphaea pacifica')
results = query.execute()
# Iterate through results and count
for record in results:
    cpue = record.get_cpue_weight(units='kg/ha')
    cpue_rounded = round(cpue)
    count = count_by_cpue.get(cpue_rounded, 0) + 1
    count_by_cpue[cpue_rounded] = count
# Print the result
print(count_by_cpue)
Note that this example includes only records in which Pacific glass shrimp were recorded (“presence-only” data). In other words, it reports CPUE only for hauls with Pacific glass shrimp and excludes hauls in which the species was not found at all. See zero-catch inference below.
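The counting step in the loop above can also be expressed with collections.Counter from the standard library. This is a minimal sketch that runs without network access: the cpue_values list is made-up data standing in for the values returned by get_cpue_weight.

```python
from collections import Counter

# Hypothetical CPUE values in kg/ha standing in for get_cpue_weight results
cpue_values = [0.2, 0.4, 1.3, 0.1, 2.2, 0.7]

# Round each value and tally how often each rounded CPUE occurs
count_by_cpue = Counter(round(value) for value in cpue_values)
print(dict(count_by_cpue))
```

Keep in mind that Python's built-in round uses round-half-to-even, so round(0.5) is 0 while round(1.5) is 2. This affects which bucket a value exactly halfway between two integers falls into.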
Iterating with functional programming
A for loop is not the only option for iterating through results. List comprehensions and other functional programming methods can be used as well.
import statistics
import afscgap
# Build query
query = afscgap.Query()
query.filter_year(eq=2021)
query.filter_srvy(eq='GOA')
query.filter_scientific_name(eq='Pasiphaea pacifica')
results = query.execute()
# Get temperatures in Celsius
temperatures = [record.get_bottom_temperature(units='c') for record in results]
# Take the median
print(statistics.median(temperatures))
This example reports the median bottom temperature in Celsius for hauls in which Pacific glass shrimp were reported.
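Survey data often has gaps, so it may be useful to drop missing readings before computing a statistic. The sketch below assumes, purely for illustration, that a missing temperature is represented as None. That representation is an assumption and not documented afscgap behavior, and the temperature values are made up.

```python
import statistics

# Hypothetical bottom temperatures in Celsius; None marks a missing reading
# (an assumption for illustration, not documented library behavior)
temperatures = [4.1, None, 5.3, 4.8, None, 6.0]

# Keep only the readings that are present, then take the median
valid = list(filter(lambda value: value is not None, temperatures))
median_temperature = statistics.median(valid)
print(median_temperature)
```

Filtering first matters because statistics.median cannot compare None with numbers and would raise a TypeError on the unfiltered list.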
Load into Pandas
The results from the afscgap package are serializable and can be loaded into other tools like Pandas. This example loads Pacific glass shrimp records from the 2021 Gulf of Alaska survey into a data frame.
import pandas
import afscgap
query = afscgap.Query()
query.filter_year(eq=2021)
query.filter_srvy(eq='GOA')
query.filter_scientific_name(eq='Pasiphaea pacifica')
results = query.execute()
pandas.DataFrame(results.to_dicts())
Specifically, to_dicts provides an iterator over a dictionary form of the data that can be read into tools like Pandas.
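Because the dictionary form is plain Python data, it can feed tools other than Pandas as well. As a sketch, the following writes an iterator of dictionaries to CSV using only the standard library. The to_dicts_stub function and its field names (year, srvy, cpue_kg_ha) are hypothetical stand-ins, not the actual keys produced by to_dicts.

```python
import csv
import io


def to_dicts_stub():
    """Yield hypothetical record dictionaries (stand-in for to_dicts)."""
    yield {'year': 2021, 'srvy': 'GOA', 'cpue_kg_ha': 0.4}
    yield {'year': 2021, 'srvy': 'GOA', 'cpue_kg_ha': 1.3}


# Materialize the iterator so the field names can be read from the first row
rows = list(to_dicts_stub())

# Write a header followed by one CSV line per record
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

An in-memory buffer is used here to keep the sketch free of side effects; passing an open file object to csv.DictWriter works the same way.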
Advanced filtering
Queries so far have focused on filters requiring equality, but range queries can be built as well.
import afscgap
# Build query
query = afscgap.Query()
query.filter_year(min_val=2015, max_val=2019) # Note min/max_val
query.filter_srvy(eq='GOA')
query.filter_scientific_name(eq='Pasiphaea pacifica')
results = query.execute()
# Sum weight
weights = map(lambda x: x.get_weight(units='kg'), results)
total_weight = sum(weights)
print(total_weight)
This example queries for Pacific glass shrimp data between 2015 and 2019, summing the total weight caught. Note that most users will likely take advantage of the built-in Python-to-ORDS query generation, which dictates how the library communicates with the API service. However, users can also provide raw ORDS queries using manual filtering.
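To give a rough idea of what translating Python arguments into ORDS query syntax can look like, the sketch below maps a min/max pair onto the ORDS filter operators $between, $gte, and $lte. The build_range_filter helper is hypothetical and is not the library's actual implementation, and the field names year and srvy are assumed here for illustration.

```python
import json
from typing import Optional


def build_range_filter(min_val: Optional[int] = None,
                       max_val: Optional[int] = None) -> Optional[dict]:
    """Translate optional bounds into an ORDS-style filter (hypothetical helper)."""
    if min_val is not None and max_val is not None:
        return {'$between': [min_val, max_val]}
    if min_val is not None:
        return {'$gte': min_val}
    if max_val is not None:
        return {'$lte': max_val}
    return None  # No bounds given, so no filter applies


# Assumed field names; the resulting JSON would be passed as a query parameter
ords_query = {
    'year': build_range_filter(min_val=2015, max_val=2019),
    'srvy': {'$eq': 'GOA'},
}
query_string = json.dumps(ords_query)
print(query_string)
```

Returning None when neither bound is given lets a caller skip the field entirely rather than send an empty filter.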
Zero-catch inference
Until this point, these examples use presence-only data. However, the afscgap package can infer negative or “zero catch” records as well.
import afscgap
# Mapping from CPUE to count
count_by_cpue = {}
# Build query
query = afscgap.Query()
query.filter_year(eq=2021)
query.filter_srvy(eq='GOA')
query.filter_scientific_name(eq='Pasiphaea pacifica')
query.set_presence_only(False) # Added to earlier example
results = query.execute()
# Iterate through results and count
for record in results:
    cpue = record.get_cpue_weight(units='kg/ha')
    cpue_rounded = round(cpue)
    count = count_by_cpue.get(cpue_rounded, 0) + 1
    count_by_cpue[cpue_rounded] = count
# Print the result
print(count_by_cpue)
This example revisits the earlier snippet for CPUE counts, but set_presence_only(False) directs the library to look at additional data on hauls, determining which hauls did not have Pacific glass shrimp. This lets the library return records for hauls in which Pacific glass shrimp were not found. The effect can be seen in the differences between the reported counts:
| Rounded CPUE | Count with set_presence_only(True) | Count with set_presence_only(False) |
|---|---|---|
| 0 kg/ha | 44 | 521 |
| 1 kg/ha | 7 | 7 |
| 2 kg/ha | 1 | 1 |
Put simply, while the earlier example showed CPUE counts for hauls in which Pacific glass shrimp were seen, this revised example reports for all hauls in the Gulf of Alaska in 2021.
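The idea behind zero-catch inference can be sketched as a comparison between the full list of hauls and the hauls that have a catch record for the species. The haul identifiers and CPUE values below are made up, and this simplified sketch is not the library's actual algorithm.

```python
# Hypothetical haul identifiers for one survey and year
all_hauls = ['h1', 'h2', 'h3', 'h4', 'h5']

# Hypothetical presence-only catch records: haul -> CPUE in kg/ha
presence_cpue = {'h2': 0.4, 'h5': 1.3}

# Every haul without a catch record is inferred to have a CPUE of zero
inferred_cpue = {haul: presence_cpue.get(haul, 0.0) for haul in all_hauls}

# Hauls that only appear once zero-catch records are inferred
zero_catch_hauls = [haul for haul, cpue in inferred_cpue.items() if cpue == 0.0]
print(zero_catch_hauls)
```

In this toy data, three of the five hauls only appear once zero-catch records are inferred, mirroring how the 0 kg/ha count in the table grows when presence-only mode is turned off.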
More information
Please see the API documentation for the Python library for additional details.