Empirical Software Engineering using R

About the book

This book aims to discuss all of what is currently known about software engineering, based on an analysis of all publicly available software engineering data.
This aim is not as ambitious as it sounds because there is not a great deal of data publicly available. Until recently, researchers in software engineering concentrated on producing work that gave readers mathematical orgasms, rather than anything useful to industry based on experimental evidence.
As work progressed, it became obvious that the best way to organise the material was in two parts: the first covering software engineering, and the second covering the statistics used in the analysis of software engineering data.

Stuff to look at

version 0.9.0a: draft covering human cognitive characteristics, cognitive capitalism, ecosystems, projects and reliability, plus statistical techniques for analysing software engineering data.

View all figures; includes links to the original paper for the data and to the source code on GitHub.

Reporting issues using GitHub is good for general discussion and makes it hard for me to ignore them (which I might do with email).

All referenced papers.


The code and data are available on GitHub (around 370M).


Slides and stuff for the workshops based on the book.

Major changes

3 Apr 18 Reliability chapter released
26 Nov 17 Tweaks, plus updated with newly arrived data
27 Oct 17 Projects chapter released
17 Jul 17 Ecosystems chapter released
26 Mar 17 Cognitive capitalism chapter released
29 Jan 17 Human cognitive characteristics chapter released
17 Oct 16 Statistical analysis material released

Minor changes

18 Feb 17 Fixed citation hyperlinks and added, to every citation, the page number(s) on which it is referenced.
Please send any feedback to ESEUR "at" knosof dot co dot uk
