Getting atmospheric model data, remotely

As atmospheric data scientists, we often want to work with the latest outputs of the atmospheric models (GFS, NAM, HRRR, GEFS). However, we are not keen on downloading all of that data to our local machines. As a solution, this post shows how to work with atmospheric model outputs remotely.

# Importing required libraries
import xarray as xr # dealing with atmospheric data
import numpy as np # dealing with arrays
import requests # making requests to webpages
from bs4 import BeautifulSoup # scraping webpage content

# Ignore warnings
import warnings
warnings.filterwarnings("ignore")
# The link of the server (nomads) we are going to utilize while obtaining our data
nomads_link = r'https://nomads.ncep.noaa.gov/dods/'

Possible atmospheric model datasets

Let’s see the list of atmospheric models on the nomads server.

# Make a request to the nomads server link
page = requests.get(nomads_link)

# Parse the page content
soup = BeautifulSoup(page.content, 'html.parser')
# Page content under <b> tag
soup.find_all('b')
[<b>1: akrtma/:</b>,
 <b>2: aqm/:</b>,
 <b>3: blend/:</b>,
 <b>4: cmcens/:</b>,
 <b>5: estofs/:</b>,
 <b>6: fens/:</b>,
 <b>7: fnl/:</b>,
 <b>8: gdas_0p25/:</b>,
 <b>9: gefs/:</b>,
 <b>10: gens/:</b>,
 <b>11: gens_bc/:</b>,
 <b>12: gens_ndgd/:</b>,
 <b>13: gfs_0p25/:</b>,
 <b>14: gfs_0p25_1hr/:</b>,
 <b>15: gfs_0p25_1hr_parafv3/:</b>,
 <b>16: gfs_0p25_parafv3/:</b>,
 <b>17: gfs_0p50/:</b>,
 <b>18: gfs_1p00/:</b>,
 <b>19: gurtma/:</b>,
 <b>20: hiresw/:</b>,
 <b>21: hirtma/:</b>,
 <b>22: hrrr/:</b>,
 <b>23: ice/:</b>,
 <b>24: naefs_bc/:</b>,
 <b>25: naefs_ndgd/:</b>,
 <b>26: nam/:</b>,
 <b>27: narre/:</b>,
 <b>28: ncom/:</b>,
 <b>29: prrtma/:</b>,
 <b>30: rap/:</b>,
 <b>31: rtma2p5/:</b>,
 <b>32: rtofs/:</b>,
 <b>33: sref/:</b>,
 <b>34: sref_bc/:</b>,
 <b>35: wave/:</b>]

As you see, all 35 available atmospheric model dataset abbreviations on the nomads server are listed. We are going to turn them into a Python list.

# A function for making the list of model abbreviations
def get_data_abbreviations(soup):
    """
    Returns the model abbreviation list by searching 
    the page content under <b> tag
    
    Arguments:
        soup {BS object} -- BeautifulSoup object
        
    Returns:
        {list} -- model abbreviation list
    """
    
    # Search for <b> tag within the content and keep them
    b_tag = soup.find_all('b')
    
    # Get text of each content element
    texts = [b.get_text() for b in b_tag]
    
    # Texts are not in the format that we expect. Just do a little bit of trick
    abbrs = [text.split(' ')[1].split('/')[0] for text in texts]
    
    return abbrs
# Let's check some of the elements of the abbreviation list
abbr_list = get_data_abbreviations(soup)
print('Length of model data: ', len(abbr_list))
print('some of the elements: ', abbr_list[:5])
Length of model data:  35
some of the elements:  ['akrtma', 'aqm', 'blend', 'cmcens', 'estofs']
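With the abbreviations in a plain list, narrowing it down to one model family is a one-liner. A minimal sketch, using a hard-coded sample of the list above so it runs offline (the filter on the `gfs` prefix is our own illustration, not part of the scraping code):

```python
# Sample of the scraped abbreviation list (hard-coded so this runs offline)
abbr_list = ['akrtma', 'aqm', 'gfs_0p25', 'gfs_0p25_1hr', 'gfs_0p50', 'gfs_1p00', 'hrrr']

# Keep only the GFS-family datasets
gfs_models = [abbr for abbr in abbr_list if abbr.startswith('gfs')]
print(gfs_models)  # → ['gfs_0p25', 'gfs_0p25_1hr', 'gfs_0p50', 'gfs_1p00']
```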

Available data dates

On the nomads server, atmospheric model data generally extends back about one week. Let’s reveal the exact date range of the data we can study.

def get_possible_data_range(link, model_name):
    """
    Returns the list of the available data dates by searching 
    the page content under <b> tag
    
    Arguments:
        link {str} -- general link to nomads dod
        model_name {str} -- expected model data abbreviation
        
    Returns:
        {list} -- model available date list 'YYYYMMDD'
    """
    
    # Make a request to the model page
    page = requests.get(f'{link}{model_name}')
    
    # Parse the page content
    soup = BeautifulSoup(page.content, 'html.parser')
    
    try:
        # Search for the <b> tag within the content and keep them
        b_tag = soup.find_all('b')

        # Get the text of each content element
        texts = [b.get_text() for b in b_tag]

        # Texts are not in the format we expect. Just do a little bit of trick
        dates_with_abbr = [text.split(' ')[1].split('/')[0] for text in texts]
        
        return dates_with_abbr
    
    # Exception: the page did not have the expected layout
    except Exception as error:
        print(f'Unexpected page layout on the nomads server for model {model_name}: ' + repr(error))
# Let's see the date range of one of the model data
possible_dates = get_possible_data_range(nomads_link, abbr_list[0])
print('Length of date range: ', len(possible_dates))
print('some of the elements: ', possible_dates[:3])
Length of date range:  10
some of the elements:  ['akrtma20211002', 'akrtma20211003', 'akrtma20211004']
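Each entry combines the model abbreviation with a `YYYYMMDD` date, so picking the most recent run just means parsing the last eight characters. A small sketch, again with the entries above hard-coded so it runs offline:

```python
from datetime import datetime

# Entries look like '<abbr>YYYYMMDD'; the last 8 characters hold the date
possible_dates = ['akrtma20211002', 'akrtma20211003', 'akrtma20211004']

# Parse the dates and pick the entry with the latest one
parsed = [datetime.strptime(entry[-8:], '%Y%m%d') for entry in possible_dates]
latest = possible_dates[parsed.index(max(parsed))]
print(latest)  # → akrtma20211004
```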

Collect the model data at the available date range

Now, by using the available model data abbreviations, their available date ranges, and their respective data links on the nomads server, we will look for the available data that we can fetch and study.

# Let's get the model data
abbr_no = 0 # index of one of the 35 available model names
date_no = 0 # index of one of the dates available within the date range
link_no = 0 # index of one of the links available within the data links

# The per-date page lists the specific datasets; scrape it the same way,
# assuming it uses the same '<b>n: name:</b>' listing format as the index pages
date_page = requests.get(f'{nomads_link}{abbr_list[abbr_no]}/{possible_dates[date_no]}')
date_soup = BeautifulSoup(date_page.content, 'html.parser')
specific_data_links = [b.get_text().split(' ')[1].split(':')[0].split('/')[0]
                       for b in date_soup.find_all('b')]

model_data_fetch_link = (f'{nomads_link}{abbr_list[abbr_no]}/'
                         f'{possible_dates[date_no]}/{specific_data_links[link_no]}')
data = xr.open_dataset(model_data_fetch_link)
# Let's see what's in the data
data
<xarray.Dataset>
Dimensions:   (time: 1, lat: 1302, lon: 4336)
Coordinates:
  * time      (time) datetime64[ns] 2021-10-02
  * lat       (lat) float64 40.54 40.57 40.59 40.62 ... 75.31 75.34 75.36 75.39
  * lon       (lon) float64 150.2 150.2 150.2 150.3 ... 266.2 266.3 266.3 266.3
Data variables: (12/13)
    ceilceil  (time, lat, lon) float32 ...
    dpt2m     (time, lat, lon) float32 ...
    gust10m   (time, lat, lon) float32 ...
    hgtsfc    (time, lat, lon) float32 ...
    pressfc   (time, lat, lon) float32 ...
    spfh2m    (time, lat, lon) float32 ...
    ...        ...
    tmp2m     (time, lat, lon) float32 ...
    ugrd10m   (time, lat, lon) float32 ...
    vgrd10m   (time, lat, lon) float32 ...
    vissfc    (time, lat, lon) float32 ...
    wdir10m   (time, lat, lon) float32 ...
    wind10m   (time, lat, lon) float32 ...
Attributes:
    title:        akrtma.t00z.2dvaranl_ndfd_3p0.grb2 beginning 00Z02oct2021, ...
    Conventions:  COARDS\nGrADS
    dataType:     Grid
    history:      Sat Oct 09 12:37:39 GMT 2021 : imported by GrADS Data Serve...

That is quite a lot of code, right?

For this reason, we created Visjobs, which, once installed, can bring you the latest atmospheric model data, including GFS, GEFS, HRRR, NBM, and NAM. You can also get GHCN (Global Historical Climatology Network) observation data for each station. Check it out!

Github Project: https://github.com/donmezkutay/visjobs
Documentation: https://donmezkutay.github.io/visjobs/


Blog by:
Kutay DÖNMEZ : LinkedIn | Github | Twitter | Instagram
Berkay DÖNMEZ : LinkedIn | Github | Twitter | Instagram