Access API data using R

AKFIN has developed web services (apis) to distribute GAP data. Like the GAP_PRODUCTS schema, these are under active development. These do not require VPN or an oracle connection but they are protected by Oracle authentication, please contact matt.callahan@noaa.gov for information on how to get an api token to use this option.

The url structure is “https://apex.psmfc.org/akfin/data_marts/gap_products/gap_[base table name]” . For example “https://apex.psmfc.org/akfin/data_marts/gap_products/gap_biomass” is the base url to get data from the akfin_biomass table. Web services linked to large tables have mandatory parameters to reduce data download size. For example to get agecomp data for Bering Sea pollock in area_id 10 in 2022 you would use “https://apex.psmfc.org/akfin/data_marts/gap_products/gap_biomass?survey_definition_id=98&area_id=10&species_code=21740&start_year=2022&end_year=2022”.

If you’re using R to pull data through web services you might find the akfingapdata (pronounced akfin-gap-data not ak-eff-ing-app-data) R package helpful.

# load libraries
library(dplyr)
library(magrittr)
library(httr)
library(flextable)

# tell R to not use scientific notation
options(scipen=999)

# function for pulling data from the api using the httr package
get_gap_biomass<-function(area_id, species_code) {
  # paste(... collapse=",") puts commas between vector elements
  area_id <- paste(area_id, collapse = ",")
  species_code <- paste(species_code, collapse = ",")
  # httr code, parameters are after the '?'
  httr::content(
    httr::GET(paste0("https://apex.psmfc.org/akfin/data_marts/akmp/gap_biomass?area_id=",
                     area_id,
                     "&species_code=",
                     species_code)),
    type = "application/json") %>%
    # convert to data frame
    bind_rows()
}

8.1 Ex. Direct database query in R using the (akfingapdata readme)[https://github.com/MattCallahan-NOAA/akfingapdata/blob/main/README.Rmd] R package:

# load packages
library(odbc)
library(getPass)
library(tidyverse)

# connect to AKFIN Oracle database
con <- dbConnect(odbc::odbc(), "akfin", UID=getPass(msg="USER NAME"), PWD=getPass())
# define species code for pollock
my_species <- 21740

#query database
data<- dbFetch(dbSendQuery(con,
                           paste0("select * from gap_products.akfin_biomass 
where species_name = ", my_species, 
" and survey_definition_id = 98, 
and area_id = 10"))) %>%
rename_with(tolower) # everyone likes lower case letters better

head(data)

8.2 Ex. Direct database query in R using the (akfingapdata readme)[https://github.com/MattCallahan-NOAA/akfingapdata/blob/main/README.Rmd] R package:

library(akfingapdata)

# Sign into akfin with token (need to request token from AKFIN)
token <- akfingapdata::create_token(file = paste0(dirname(here::here()), "/akfin_token.txt"))

akfingapdata::get_gap_catch()[,1:6] %>% 
  head() %>% 
  flextable::flextable() %>%
  flextable::theme_zebra()

Ex. 2: Load catch data with {akfingapdata}.

cruisejoin

hauljoin

catchjoin

species_code

weight_kg

count

-611

-13,626

-374,397

21,740

189.590

450

-611

-13,632

-374,643

21,740

134.830

151

-611

-13,501

-370,283

21,740

32.060

44

-611

-13,545

-371,701

21,740

1,154.024

1,345

-611

-13,577

-372,795

21,740

963.070

1,273

-611

-13,677

-376,146

21,740

1,618.589

2,814