Me, blogging!

thoughts, science, code, jobs, and thoughts

Cookiecutter for a more transparent and reproducible science

Last week I watched an inspiring tutorial ("Data Science is Software") from SciPy 2016. It reminded me of the debate about how reproducible science really is. It seems that the majority of scientists agree that there is a reproducibility crisis.

That is indeed frustrating, not least because research itself should be transparent. I also understand why that is the case from my own experience: Imagine you did some fine research and saved your results for a later publication. The results themselves were hard to obtain, either because of extensive data analysis, lengthy computer simulations, or scraping data from a larger database. Later that year you realise that some of the results are corrupt, or you made a mistake in the analysis, or you had a wrong configuration. In this case you try to reproduce your own results. However, this is a tedious task, so you go back to the beginning and try to repeat every single step to get to your final result. Fine! But wait. Why not simply automate this process from the beginning? Yeah, sure, but how?


So I adopted the philosophy of the guys from the video above, which states that in order to deliver reproducible (data) science you need "A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.", and developed a similar cookiecutter project template aimed at scientists.

You can find the source on GitHub and simply install it for your next science project:

cookiecutter gh:mkrapp/cookiecutter-reproducible-science

This tool provides you with the basic structure of your research project. You can adjust it to your needs, and you need to fill it with your reproducible workflow. You can add a customized Makefile, your preferred command line tools, scripts, or model source code. You can also add raw data and processed data. You can document your work and also write your scientific paper within this structure.
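To give an idea of what "filling it with your reproducible workflow" could mean, here is a minimal sketch in Python of the rule a Makefile applies: rebuild a result only if it is missing or older than its inputs. This is not part of the template. The helper names (`is_stale`, `build`, `column_mean`) and the `data/raw` and `data/processed` folders in the usage example are made up for illustration.

```python
from pathlib import Path
from typing import Callable, Sequence


def is_stale(target: Path, sources: Sequence[Path]) -> bool:
    """Return True if the target is missing or older than any source."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime_ns
    return any(src.stat().st_mtime_ns > target_mtime for src in sources)


def build(
    target: Path,
    sources: Sequence[Path],
    recipe: Callable[[Sequence[Path], Path], None],
) -> bool:
    """Run recipe(sources, target) only when the target is stale.

    Returns True if the recipe ran, False if the target was up to date.
    """
    if not is_stale(target, sources):
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    recipe(sources, target)
    return True


def column_mean(sources: Sequence[Path], target: Path) -> None:
    """Hypothetical analysis step: average the numbers in the first source."""
    values = [float(token) for token in sources[0].read_text().split()]
    target.write_text(f"{sum(values) / len(values)}\n")
```

You would call it as `build(Path("data/processed/mean.txt"), [Path("data/raw/numbers.txt")], column_mean)`. If the raw file has not changed since the last run, nothing is recomputed. In a real project a Makefile does this job for you; the sketch only shows the idea behind it.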

Let's do more reproducible and transparent science!!!

PS: I'm also currently adapting one of my science projects, and I'm going to provide the final reproducible version later on (I need to finish it first).

Tags: design, productivity, reproducibility, github 2016/07/25
To design a Website

I really had to make up my mind to come up with a simple and sleek website. It took me several days to think about what my personal homepage should look like. Nowadays, every institute's home page provides a staff list where (usually) every staff member is asked to upload some content (see my personal PIK page, for example), but this kind of design and content is not what I was looking for. And I like to keep it simple!

So, I decided that my personal website should have a simple layout, at least the entry page. Therefore, I chose to use tabs that, when they pop up, show what the site is about: who I am, a short CV, and what I'm interested in. (It doesn't, by the way, as of now!) The main entry page should also link to this blog, which it does right now.

So far, so good. To allow some interactive behavior I included CSS overlays that pop up when you click on the respective tab. For a consistent look across the website, I also use the same fonts throughout, Josefin Slab and EB Garamond, both of which you can get from Google Fonts. I like the style of those fonts; Josefin Slab especially reminds me of the 1920s or 1930s. And EB Garamond is quite easy to read as well.

Tags: design, html, css 2014/12/08
About Mario Krapp
I'm a physicist by training and graduated in Earth System Science.

And I like coding. I've been working with complex computer models ever since my undergrad and I enjoy data exploration and data analysis to gain insights into the underlying principles.

Feel free to contact me.