Me, blogging!

thoughts, science, code, jobs, and thoughts

Building and Running CESM

Recently I applied for a job where one requirement is to have experience with the Community Earth System Model (CESM). Therefore, this post is about how to build and run the publicly available CESM under Mac OS X. It is fairly straightforward but you need to pay some attention because Earth System Models are naturally complex and complicated software frameworks. Here are the steps to run CESM on your Mac.

  1. Register at the CESM website to obtain credentials for the source code download.
  2. Checkout the latest release (or whatever version you prefer) from the CESM SVN server
  3. Prepare and build a custom setup following the instructions by Walter Hannah
  4. Run the CESM model

Checkout CESM source code from SVN repository

svn co --username guestuser https://svn-ccsm-models.cgd.ucar.edu/cesm1/release_tags/cesm1_2_2 cesm1_2_2

Configure your machine, compiler, and library settings

You have to tell the CESM setup framework that you would like to use the GNU compiler, the NetCDF and the Parallel-NetCDF libraries for your "undefined" Mac OS X machine. Therefore, edit ccsm_utils/Machines/config_compilers.xml:

<compiler MACH="userdefined">
  <ADD_FFLAGS> -fno-range-check -fcray-pointer -arch x86_64 </ADD_FFLAGS>
  <NETCDF_PATH>/usr/local/Cellar/netcdf/4.3.3.1_4</NETCDF_PATH>
  <PNETCDF_PATH>/usr/local/Cellar/parallel-netcdf/1.7.0</PNETCDF_PATH>
  <ADD_SLIBS>$(shell $(NETCDF_PATH)/bin/nc-config --flibs)</ADD_SLIBS>  
  <ADD_CPPDEFS></ADD_CPPDEFS>
  <CONFIG_ARGS></CONFIG_ARGS>
  <ESMF_LIBDIR></ESMF_LIBDIR>
  <MPI_LIB_NAME></MPI_LIB_NAME>
  <MPI_PATH></MPI_PATH>
</compiler>

Create a CESM setup

Now you are ready to create a new CESM setup I choose to run an Aquaplanet simulation in CAM5 hoping that this will not take too long to complete. In cesm1_2_2/scripts type

./create_newcase -case <your case> -res T31_g37 -compset 2000_CAM5_SLND_SICE_AQUAP_SROF_SGLC_SWAV -mach userdefined 

Other configurations, i.e., the compset, can be found on the CESM website. Next, adjust the XML configuration files in cesm1_2_2/scripts/<your_case> where <your_case> is the name of the CESM setup you created before

./xmlchange -file env_build.xml -id GMAKE_J -val 8
./xmlchange -file env_build.xml -id GMAKE -val make
./xmlchange -file env_build.xml -id OS -val darwin 
./xmlchange -file env_build.xml -id MPILIB -val mpich 
./xmlchange -file env_build.xml -id COMPILER -val gnu
./xmlchange -file env_build.xml -id CESMSCRATCHROOT -val ~/Projects/cesm1_2_2
./xmlchange -file env_build.xml -id EXEROOT -val ~/Projects/cesm/my_model/bld

and for the build environment

mkdir -p ~/Projects/cesm/my_model
mkdir -p ~/Projects/cesm/input
./xmlchange -file env_run.xml -id RUNDIR -val ~/Projects/cesm/my_model/run
./xmlchange -file env_run.xml -id DIN_LOC_ROOT -val ~/Projects/cesm/input

and to change the default number (64) of used CPUs to 2:

./xmlchange -file env_mach_pes.xml -id MAX_TASKS_PER_NODE -val 1
./xmlchange -file env_mach_pes.xml -id NTASKS_ATM -val 2    
./xmlchange -file env_mach_pes.xml -id NTASKS_LND -val 2    
./xmlchange -file env_mach_pes.xml -id NTASKS_ICE -val 2    
./xmlchange -file env_mach_pes.xml -id NTASKS_OCN -val 2    
./xmlchange -file env_mach_pes.xml -id NTASKS_CPL -val 2    
./xmlchange -file env_mach_pes.xml -id NTASKS_GLC -val 2    
./xmlchange -file env_mach_pes.xml -id NTASKS_ROF -val 2    
./xmlchange -file env_mach_pes.xml -id NTASKS_WAV -val 2    
./xmlchange -file env_mach_pes.xml -id TOTALPES -val 2

These changes are processed via ./cesm_setup and now you can start the build process in <your case>/

./<your case>.build

Run your CESM model

Finally, you need to uncomment one of those two lines in <your case>.run

#mpiexec -n 2 $EXEROOT/cesm.exe >&! cesm.log.$LID
#mpirun -np 2 $EXEROOT/cesm.exe >&! cesm.log.$LID

Now, you can run your CESM setup on your Mac OS X

./<your case>.run

and, hopefully, after a while your console prints out something like this:

-------------------------------------------------------------------------
 CESM BUILDNML SCRIPT STARTING
 - To prestage restarts, untar a restart.tar file into ~/Projects/cesm/my_model/run
 infile is ~/Projects/cesm1_2_2/scripts/<your case>/Buildconf/cplconf/cesm_namelist 
CAM writing dry deposition namelist to drv_flds_in 
CAM writing namelist to atm_in 
 CESM BUILDNML SCRIPT HAS FINISHED SUCCESSFULLY
-------------------------------------------------------------------------
-------------------------------------------------------------------------
 CESM PRESTAGE SCRIPT STARTING
 - Case input data directory, DIN_LOC_ROOT, is ~/Projects/cesm/input
 - Checking the existence of input datasets in DIN_LOC_ROOT
 CESM PRESTAGE SCRIPT HAS FINISHED SUCCESSFULLY
-------------------------------------------------------------------------
Thu Mar 24 12:02:31 CET 2016 -- CSM EXECUTION BEGINS HERE
Thu Mar 24 12:24:58 CET 2016 -- CSM EXECUTION HAS FINISHED
(seq_mct_drv): ===============       SUCCESSFUL TERMINATION OF CPL7-CCSM ===============

Here's the zonal mean of the air temperature. I didn't pay too much attention to the axis descriptions but I hope you acknowledge this as a result ;)

Zonal mean of air temperature

Tips and Tricks and Errors

  • To make the parallel-netcdf library available to everybody using homebrew for Mac OS X I already created a PR on Github.
  • During the SVN checkout the MCT directory in models/utils was not updated. You can do so manually by entering git clone https://github.com/MCSclimate/MCT.git mct in models/utils.
  • Replace isnanf with isnan in shr_isnan.c as described here; other potential errors during that process are listed, explained and resolved here.
Tags: models, code, CESM 2016/03/24
Exceedance probability curves

While working on a project proposal to estimate drought risks of some countries in the coming years, I learned something about risk analysis: It's rather simple! But, you need to be cautious if you want to use this information for guessing what might happen in the future. In the context of anthropogenic climate change that usually means that you can't rely on exceedance probabilities from observed (and hence past) data. Because the underlying climate is changing. And even if that's not the case, your record may be over- or underestimating the occurrences of specific events, just by pure chance.

However, let me show you how to quickly derive exceedance probabilities from some river discharge data.

From the monthly discharge data, we estimate the cumulative distribution function (CDF) by assigning each sorted monthluy value a number starting from 0 for the lowest up to 1 for the highest monthly discharge value. The exceedance probability is simply 1 - CDF. Plotting the exceedance probability against the sorted monthly data, et voilĂ , your exceedance probability curve.

Exceedance Probability Curve

Here's the link to the CSV file elbe.csv and the Python code to process the data.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from scipy import stats
import sys
import matplotlib
from matplotlib.ticker import ScalarFormatter, FormatStrFormatter, FixedLocator

matplotlib.style.use('bmh')

fnm  = 'elbe.csv'
df = pd.read_csv(fnm,index_col=0,infer_datetime_format=True,parse_dates=True)
raw = df['Elbe']
ser = raw.sort_values()

X = np.linspace(0.,max(ser.values)*1.1,100.)
s, loc, scale = stats.lognorm.fit(ser.values)

cum_dist = np.linspace(0.,1.,len(ser))
ser_cdf = pd.Series(cum_dist, index=ser)
ep = 1. - ser_cdf

fig = plt.figure(figsize=(11,8))

ax1 = plt.subplot(221)
ser_cdf.plot(ax=ax1,drawstyle='steps',label='data')
ax1.plot(X,stats.lognorm.cdf(X, s, loc, scale),label='lognormal')
ax1.set_xlabel('Discharge')
ax1.set_ylabel('CDF')
ax1.legend(loc=0,framealpha=0.5,fontsize=12)

ax2 = plt.subplot(223)
ax2.hist(ser.values, bins=21, normed=True, label='data')
ax2.plot(X,stats.lognorm.pdf(X, s, loc, scale),label='lognormal')
ax2.set_ylabel('Probability Density')
ax2.set_xlabel('Discharge')
ax2.legend(loc=0,framealpha=0.5,fontsize=12)

A = np.vstack([ep.values, np.ones(len(ep.values))]).T
[m, c], resid = np.linalg.lstsq(A, ser.values)[:2]
print m, c
r2 = 1 - resid / (len(ser.values) * np.var(ser.values))
print r2

ax3 = plt.subplot(222)
ax3.semilogx(100.*ep,ser.values,ls='',marker='o',label='data')
ax3.plot(100.*(1.-stats.lognorm.cdf(X, s, loc, scale)),X,label='lognormal')
minorLocator = FixedLocator([1,2,5,10,20,50,100])
ax3.xaxis.set_major_locator(minorLocator)
ax3.xaxis.set_major_formatter(ScalarFormatter())
ax3.xaxis.set_major_formatter(FormatStrFormatter("%d"))
ax3.set_xlim(1,100)
ax3.set_xlabel('Exceedance Probability (%)')
ax3.set_ylabel('Discharge')
ax3.invert_xaxis()
ax3.legend(loc=0,framealpha=0.5,fontsize=12)

ax4 = plt.subplot(224)
raw.plot(ax=ax4)
ax4.set_xlabel('time')
ax4.set_ylabel('Discharge')

plt.suptitle('Elbe discharge',y=1.01,fontsize=14)

plt.tight_layout()
plt.show() 

fig.savefig('exceedance_curve_elbe.png',transparent=True,dpi=150,bbox_inches='tight')
Tags: statistics, python, code, models 2016/02/20
Keeping (time) track of different projects

I work for different projects. And it is not easy to keep track of the time spent on these different projects. There are, of course, tools which handle time tracking for you. But they are either for freelancers, too complicated, or expensive. Here's an alternative: go-watch.

I've tried go because the compiled executables can be deployed everywhere. Try that, Python! In principle you would just need a compiled version for your operating system but here's the clue: Try to install it on your own:

All you need is a go installation and a clone of the go-watch project:

git clone git://github.com/mkrapp/go-watch.git

Screenshot of go-watch

Tags: productivity, go, code 2015/10/06
New old project

spiral wave A few days ago, I uploaded a very old (almost ancient) code project to Github. It is a relic of my studies at the university (2007!) and I have used this code for my diploma thesis. After tweaking the code quite a bit, I managed to get it running and spitting out this nice spiral wave (which makes this blog look a lot more colorful than originally I intended).

From the entire Java files I produced back than I included only the one that have a good purpose (like the one above). I also had to add a build file for ant because originally, this project was developed in Eclipse. I am quite happy to see the old stuff still working.

Please, go and try yourself:

git clone https://github.com/mkrapp/rde-solver

and then

ant [clean,compile,jar,run]

To produce the actual spiral wave you have to process the created data files. That part is not included. If you like to see how this works, drop me an email.

Tags: github, code, java, diploma thesis, models 2015/01/08
Building a website generator

Hello everyone. This is my first posting on my blog where I want to provide some useful information about my job search. I started of as a physicist (TU-Berlin) about six years ago. Afterwards I went to Hamburg to do my graduate studies at the Hamburg University and the Max Planck Institute for Meteorology. After finishing my Ph.D. I went back to Berlin to work at the Potsdam Institute for Climate Impact Research as a postdoctoral researcher. My project about the melting of the Greenland Icesheet on different time scales ended two weeks ago. Now I'm taking my time to recollect what I've done so far and what I want to do next. This blog should help me and others to keep track about the expected and unexpected things to come, difficulties and the joy and pain of finding a new job outside of academia.

As of this week, I started writing a static blog engine in Python (pystable), which can be found on Github and cloned via

git clone https://github.com/mkrapp/pystable.git

Actually, this very web page is based on pystable.

Tags: pystable, code, python, introduction 2014/11/28
About Mario Krapp
I'm a physicist by training and graduated in Earth System Science.

And I like coding. I've been working with complex computer models ever since my undergrad and I enjoy data exploration and data analysis to gain insights into the underlying principles.

Feel free to contact me.

Info
Archive
Tags