Computing/Getting ERDDAP Running

From Cricalix.Net

ERDDAP is software for serving up scientific data. To run and configure it, the Docker approach works fairly well (even running on a Synology NAS).

Installation

  1. docker pull axiom/docker-erddap:latest-jdk17-openjdk

Configuration

  1. Read https://coastwatch.pfeg.noaa.gov/erddap/download/setupDatasetsXml.html very carefully.
    1. It's long. It's thorough.
  2. Read through https://github.com/unidata/tomcat-docker to see if any Tomcat bits apply (unlikely here, since Caddy2 will front this Tomcat instance)
  3. Read through https://registry.hub.docker.com/r/axiom/docker-erddap/ and (source) https://github.com/axiom-data-science/docker-erddap
  4. Run the container at least once and extract /usr/local/tomcat/content/erddap/setup.xml and datasets.xml
  5. Tune the XML files
    1. Even though Axiom's Docker image can configure everything via environment variables, setup.xml must still exist
  6. Re-run the docker instance with file pass-throughs
    1. Don't forget to pass through the bigParentDirectory, because the logs are there
    2. In fact, there's a ton of stuff there that is needed for persistent operation
  7. A sample JSON data file is needed in the location where the dataset definition says the files will exist
    1. The record should be deletable once the system is running and has real data flowing
  8. Use the DasDds.sh script in the WEB-INF directory to validate the datasets.xml file
    1. Since this is Docker, either docker exec -it <name> /bin/bash into the container, or invoke the script with sufficient pass-throughs to run it directly from the host
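The pass-through steps above might be captured in a compose file along these lines. This is a sketch only: the host paths, the port mapping, and the /erddapData mount point are assumptions; check the bigParentDirectory value in your own setup.xml before copying it.

```yaml
# Sketch - verify bigParentDirectory against your setup.xml before use.
services:
  erddap:
    image: axiom/docker-erddap:latest-jdk17-openjdk
    ports:
      - "8080:8080"
    volumes:
      # Tuned XML files from step 4/5
      - ./setup.xml:/usr/local/tomcat/content/erddap/setup.xml
      - ./datasets.xml:/usr/local/tomcat/content/erddap/datasets.xml
      # bigParentDirectory - logs, flags, cached data; needed for
      # persistent operation (assumed to be /erddapData here)
      - ./erddapData:/erddapData
```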

On restart, the service briefly shows the default landing page, then replaces it with the modified content. This suggests that the passed-through XML files don't take effect until ERDDAP re-reads them after Tomcat has started.

Logs

  1. Catalina logs are uninteresting
  2. Application logs are not exported to Docker's logging system
  3. Application logs are in the bigParentDirectory passed-through volume

HTTPGet errors

  1. sourceName is not case-translated to destinationName; the two spellings must be identical.
  2. Configuration of the dataset attributes uses (effectively) the destinationName spelling
  3. time, longitude, latitude must be spelt that way
  4. author must be the last field on the request, and its value is one of the configured keys (username_password)
  5. command must exist in the dataset definition, but must never be set in the request
  6. Yes, an HTTP GET triggers a data write. It's terrible, and goes against the HTTP RFCs' safe-method semantics for GET.
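The rules above boil down to constructing a query string with exact destinationName spellings and author as the final parameter. A minimal sketch (the dataset fields, host name, and key are hypothetical; Python dicts preserve insertion order, so building the dict in request order is enough):

```python
from urllib.parse import urlencode

# Hypothetical field values; spellings must match the dataset's
# destinationNames exactly (time/longitude/latitude are lowercase).
fields = {
    "time": "2023-01-01T00:00:00Z",
    "latitude": 53.3,
    "longitude": -6.1,
    "Wind_Speed": 12.4,
    "author": "testuser_somepassword",  # must be the last parameter
}
query = urlencode(fields)
url = f"https://erddap.example.com/erddap/tabledap/datasetName.insert?{query}"
print(query.split("&")[-1])  # author=testuser_somepassword
```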

AIS-catcher integration

The whole goal of this setup is to grab AIS data from the meteorological sensors in Dublin Bay. A simple proof-of-concept parser follows.

#!/usr/bin/env python3
"""Proof of concept: replay captured AIS-catcher JSON lines into
ERDDAP's tabledap .insert endpoint."""

import json
import urllib.parse
import urllib.request
from datetime import datetime

URL = "https://erddap.home.arpa/erddap/tabledap/datasetName.insert"

# Map AIS-catcher field names to the dataset's destinationName spellings.
MAPPER = {
    "signalpower": "Signal_Power",
    "wspeed": "Wind_Speed",
    "wgust": "Wind_Gust_Speed",
    "wdir": "Wind_Direction",
    "wgustdir": "Wind_Gust_Direction",
    "lon": "longitude",
    "lat": "latitude",
    "waveheight": "Wave_Height",
    "waveperiod": "Wave_Period",
    "mmsi": "MMSI",
}

with open("ais.capture", "r") as capture:
    lines = capture.readlines()

# Keep only records carrying wind data from the station of interest.
records = [json.loads(x) for x in lines if "wspeed" in x and "01301" in x]

for record in records:
    kv = {}
    # ERDDAP wants ISO 8601 timestamps; rxtime arrives as YYYYMMDDHHMMSS.
    dt_rxtime = datetime.strptime(record["rxtime"], "%Y%m%d%H%M%S")
    kv["time"] = dt_rxtime.strftime("%Y-%m-%dT%H:%M:%SZ")
    kv["Station_ID"] = "Dublin_Bay_Buoy"
    for field, value in MAPPER.items():
        kv[value] = record[field]
    # author must be the last parameter on the request.
    kv["author"] = "testuser_somepassword"
    params = urllib.parse.urlencode(kv)
    with urllib.request.urlopen(f"{URL}?{params}") as response:
        print(response.read().decode("utf-8"))

A more thorough implementation is available at https://github.com/cricalix/erddap-feeder - a small Rust program that listens for HTTP POST requests from AIS-catcher and transforms them into HTTP GET requests against the ERDDAP instance.