GCP Instructions

Authors
Affiliation

Sophia Wassermann

NOAA Fisheries Alaska Fisheries Science Center

Emily Markowitz

NOAA Fisheries Alaska Fisheries Science Center

Introduction

Before beginning this process, please check with Sophia Wassermann, Emily Markowitz, or OFIS to confirm that a computing instance and an RStudio Server have been set up for your username.

Set-up and Access

Setting up Google Cloud Platform (GCP)

GCP is accessed through the command line on your local machine. First, you will need to install the Google Cloud CLI (gcloud), a set of command-line tools for creating and managing Google Cloud resources. Full instructions are here. The installation instructions are specific to your operating system (OS), with options for each OS under ‘Installing the latest gcloud CLI version’.

Proceed with the default settings, except for the following:

In Windows, on the final step of running the GoogleCloudInstaller, you can un-check the boxes for creating a Start Menu & Desktop shortcut.

When prompted, specify ggn-nmfs-afscdsm-dev-1 as the default project to use.

There is no need to configure compute region and zone.
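If you want to confirm the installation from the command line, or reset the default project later, the following is a minimal sketch. It assumes a bash-style shell (Mac/Linux terminal or Git Bash); in the Windows command prompt, the three gcloud commands can be typed individually without the surrounding check. The project name is the one given above.

```shell
#!/usr/bin/env bash
# Default project named in the setup instructions above.
PROJECT="ggn-nmfs-afscdsm-dev-1"

if command -v gcloud >/dev/null 2>&1; then
  gcloud --version                      # confirm the CLI is installed
  gcloud config set project "$PROJECT"  # set (or reset) the default project
  gcloud config list                    # show the active account and project
else
  echo "gcloud not found on PATH; install the Google Cloud CLI first."
fi
```

The output of gcloud config list should show your NOAA account and the project above.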

Connecting to your node

Because usage of GCP is metered, you will need to start and, importantly, stop your node every time you want to use it.

  1. Navigate to the AFSC server on GCP.

  2. The first time you navigate to Google Cloud and connect to your node, there will be many authorization windows. Please accept them. Make sure you are connecting via your NOAA account.

  3. You should be in the ‘Instances’ pane, with a list of nodes associated with people’s names. Click on the check box in the line associated with your username.

  4. Press START / RESUME in the blue menu bar above the list of nodes. After a moment, the status icon for the node will change from a gray circle with a square to a green circle with a checkmark.

  5. In a text editor, copy in the following connection code, replacing [SERVER NAME] with the name of the server that has been configured for you (it should be the same as the name of your node in GCP). Do not include the brackets.

    gcloud compute ssh --ssh-flag="-4 -L 8787:localhost:8787" [SERVER NAME] --project=ggn-nmfs-afscuservm-infra-1 --zone=us-east4-c --tunnel-through-iap

    e.g., if your node is named [YOUR-NAME]-sdm-node: gcloud compute ssh --ssh-flag="-4 -L 8787:localhost:8787" [YOUR-NAME]-sdm-node --project=ggn-nmfs-afscuservm-infra-1 --zone=us-east4-c --tunnel-through-iap

    We recommend saving this connection string to a text file on your local computer. You will need to use it every time you connect to your instance.

  6. On your local machine, open command prompt (Windows) or terminal (Mac/Linux).

  7. Paste the connection string into the command prompt and press Enter.

  8. On Windows, a ‘PuTTY’ window will open. When you connect for the first time, a ‘PuTTY Security Alert’ window will follow because you need to set up an ssh key pair. Click ‘Accept’ in this window. Keep the PuTTY window open to maintain your connection. On Mac/Linux, the connection opens in the same terminal window; keep that window open as well.

  9. When you are done with your session, make sure to turn off the instance by pressing ‘STOP’ in the blue menu bar. The connection to the command line and RStudio will be terminated and the status icon will return to the gray square. This is very important for keeping operating costs reasonable.
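As an alternative to keeping the connection string in a text file, the steps above can be sketched as a small bash helper. This is an illustration under assumptions, not part of the official setup: the node name my-name-sdm-node is hypothetical, the function names are made up for this example, and the start, stop, and status functions assume your account is permitted to manage the instance from the CLI (the instructions above only describe doing so in the web console).

```shell
#!/usr/bin/env bash
# Project and zone taken from the connection string above.
PROJECT="ggn-nmfs-afscuservm-infra-1"
ZONE="us-east4-c"

# Print the connection string for a given node name (hypothetical helper).
connect_cmd() {
  local node="$1"
  printf 'gcloud compute ssh --ssh-flag="-4 -L 8787:localhost:8787" %s --project=%s --zone=%s --tunnel-through-iap\n' \
    "$node" "$PROJECT" "$ZONE"
}

# Start, stop, or check a node from the CLI.
# Assumption: your account has permission to do this outside the console.
start_node()  { gcloud compute instances start "$1" --project="$PROJECT" --zone="$ZONE"; }
stop_node()   { gcloud compute instances stop "$1" --project="$PROJECT" --zone="$ZONE"; }
node_status() { gcloud compute instances describe "$1" --project="$PROJECT" --zone="$ZONE" --format='value(status)'; }

# Example: print the string for a hypothetical node name.
connect_cmd "my-name-sdm-node"
```

Whichever route you use, check that the node shows as stopped at the end of your session, since a running node continues to accrue costs.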

Connecting to RStudio Server

These instances have been built with a container image on top of rocker:rstudio that comes preconfigured with packages to run tinyVAST and sdmTMB workloads and to manage data ingress and egress through Oracle and Google Drive. This means that all further setup and operations are conducted from inside RStudio Server. Connecting is straightforward, as the basic requirements and connections have already been set up by OFIS.

To connect:

  1. Once you have connected to your instance through step 8 above, open a new tab in your browser and navigate to localhost:8787.

  2. Log in with the username ‘rstudio’ and the password ‘rstudio’.

You should now be able to use RStudio as you would on your local machine.
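The browser reaches RStudio Server through the -L 8787:localhost:8787 part of the connection string, which forwards port 8787 on your local machine to port 8787 on the node. If localhost:8787 does not load, a quick check from a second local terminal can show whether anything is answering on that port. The sketch below assumes bash and curl are available locally; check_rstudio is a made-up name for this example.

```shell
#!/usr/bin/env bash
# Report whether anything answers on the forwarded local port (default 8787).
check_rstudio() {
  local port="${1:-8787}"
  if curl --silent --output /dev/null --max-time 5 "http://localhost:${port}"; then
    echo "A server is answering at http://localhost:${port}"
  else
    echo "Nothing is answering on localhost:${port}; check that the node is started and the connection window is still open."
    return 1
  fi
}

check_rstudio 8787 || true
```

If nothing is answering, the usual causes are a stopped node or a closed connection window.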

Configuring SSH for GitHub

Configuring an SSH key for your instance is required to push to GitHub. It is possible to clone repositories using https, but you will not be able to push any changes. You will need to do the following in the Terminal within RStudio Server.

  1. Switch to the Terminal tab (instead of Console) of the RStudio Server window. Generate a new SSH key and add it to the ssh-agent, following the instructions for Linux on GitHub. Paste or type in the following text, replacing the email address with your GitHub email address. When you are prompted to “Enter a file in which to save the key”, press Enter to accept the default file location. You also do not need a passphrase: press Enter to leave it empty, and then press Enter again to confirm.

ssh-keygen -t ed25519 -C "your_email@example.com"

[Enter]

[Enter]

[Enter]

eval "$(ssh-agent -s)"

ssh-add ~/.ssh/id_ed25519

  2. Once you have created the key and added it to the ssh-agent, follow the GitHub instructions for adding a new SSH key to your GitHub account, again following the Linux instructions. These are summarized in steps 3 to 6.

  3. In the RStudio Terminal, print the SSH public key with the following command. Select and copy what is printed. You may need to select and then right-click the text to copy, if ctrl + C does not work.

cat ~/.ssh/id_ed25519.pub

  4. On GitHub, click on your profile photo at the upper-right corner of the page and click ‘Settings’ in the menu.

  5. In the ‘Access’ section of the sidebar, click on ‘SSH and GPG keys’.

  6. Click ‘New SSH key’ or ‘Add SSH key’. Make sure to use an informative title, such as gcp-rstudio-20250507. In the ‘Key’ field, paste your public key. Click ‘Add SSH Key’.

  7. Back in the RStudio Server terminal, clone the GitHub repository over SSH with the command git clone. Make sure you are in the directory where you want the repo to be cloned; the terminal starts in your ‘home’ directory, which will be fine for most circumstances. If you have created a folder within which you would like the repo to live, you can navigate inside of it with the command cd, followed by the directory name. The address for a repository can be copied from its GitHub page if you click the green <> Code button and select SSH.

e.g., git clone git@github.com:afsc-gap-products/model-based-indices.git

  8. The first time you clone a repository from GitHub, you will see the warning that “The authenticity of host ‘github.com’ can’t be established”, followed by a prompt asking: “Are you sure you want to continue connecting (yes/no/[fingerprint])?” Type yes and press Enter.

  9. To connect the git integration in RStudio with your cloned repository, you need to create an RStudio Project associated with it. Click on the File menu in RStudio Server -> ‘New Project’ -> ‘Existing Directory’ -> browse to the cloned repo.
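If you have already cloned a repository over https, you do not need to clone it again; the remote address can be switched to SSH instead. The sketch below, for the RStudio Server terminal, shows the address conversion. https_to_ssh is a made-up helper name, and the git and ssh commands in the comments are left for you to run inside your own clone.

```shell
#!/usr/bin/env bash
# Convert a GitHub https address to its SSH form (hypothetical helper).
https_to_ssh() {
  printf '%s\n' "$1" | sed -E 's#^https://github\.com/#git@github.com:#'
}

# Example with the repository used above:
https_to_ssh "https://github.com/afsc-gap-products/model-based-indices.git"

# To apply this inside an existing clone (not run here):
#   git remote set-url origin "$(https_to_ssh "$(git remote get-url origin)")"
#   git remote -v    # confirm the address now starts with git@github.com:
#
# To confirm that GitHub accepts your key (not run here):
#   ssh -T git@github.com
```

If the key was added correctly, ssh -T git@github.com should reply with a greeting that includes your GitHub username.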

Configuring Google Drive

Each instance is set up with a connection to a Google Drive associated with a unique email account. To connect the instance to your own Google account, run the following code to authenticate your credentials. The code will prompt you to provide your email address for Google Drive (which is likely your NOAA email).

library(gargle)
library(googledrive)

# googledrive::drive_auth(path="/etc/sa_key.json")  # connect to default account

# Connect to google drive using your (probably NOAA) email
gdrive_email <- rstudioapi::showPrompt(title = "Email",
                                       message = "Email for Google Drive",
                                       default = "")

drive_auth(token = credentials_user_oauth2(
  scopes = "https://www.googleapis.com/auth/drive", 
  email = gdrive_email))

drive_user()  # check user account

Oracle connection

You will need an Oracle account that has access to the AFSC schemas to use the following code as written. To streamline the process, you can save a file to your instance that contains your username and password. I created an R script in my home directory on the instance (named oracle_credentials.R, the file that the connection code looks for) with the content:

oracle_user <- "USERNAME"
oracle_pw <- "password"

Use the following code to establish the connection. It loads the credentials file if it exists; if you prefer to type your username and password in when accessing Oracle, skip creating the file and the code will prompt you for them instead.

# Load saved credentials if the file exists; otherwise prompt for them
if (file.exists("~/oracle_credentials.R")) {
  source("~/oracle_credentials.R")
} else {
  oracle_user <- rstudioapi::showPrompt(title = "Username",
                                        message = "Oracle Username",
                                        default = "")
  # askForPassword() masks the password as it is typed
  oracle_pw <- rstudioapi::askForPassword(prompt = "Oracle Password")
}

# Open the connection using the "AFSC" data source name (DSN)
channel <- RODBC::odbcConnect(dsn = "AFSC",
                              uid = oracle_user,
                              pwd = oracle_pw,
                              believeNRows = FALSE)
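Because the credentials file stores your password in plain text, you may want to restrict it so that only your user can read it. The sketch below, for the RStudio Server terminal, is a suggestion rather than part of the original setup; secure_file is a made-up helper name, and the path matches the one the R code above looks for.

```shell
#!/usr/bin/env bash
# Restrict a file to owner read/write only (hypothetical helper).
secure_file() {
  local path="$1"
  if [ -f "$path" ]; then
    chmod 600 "$path"
    ls -l "$path"   # permissions should now read -rw-------
  else
    echo "No file at $path; nothing to do."
  fi
}

secure_file "$HOME/oracle_credentials.R"
```

It is also worth keeping this file out of any cloned repository so that it is never committed.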