I’m writing a book on R for O’Reilly, and I need interesting datasets for the examples. Any data that you provide will get you a mention in the book and in the publicity material, so it’s a great opportunity to publicise your work or your organisation.
Datasets from any area or industry are suitable; the only constraint is that they can be analysed with a few pages of R code to give a result that might make a general reader go “ooh”. There’s a chapter on data cleaning, so even dirty data is suitable!
All the data will be provided in an R package to accompany the book, so you need to be willing to make it publicly available. I can help you anonymise the data, or strip out commercially sensitive parts if you require.
If you can provide anything, or you know someone who might be able to, then drop me an email at richierocks AT gmail DOT com. Thanks.
EDIT: There are some (quite) frequently asked questions already! Here are the answers; you can use your Jeopardy! skills to guess the questions.
1. The book is called “Learning R”, and it’s a fairly gentle introduction to the language, covering both how you program in R, and how you analyse data.
2. If you provide data, then yes, you can have a PDF of the pre-release version to make sure I haven’t done something silly with your dataset.