
Computational Epidemiology Course

To learn a programming language is to learn to deal with information and computation in its most general setting.

Problems in real life have a way of being messy and inconvenient; they don't always fit our expectations, and they don't always fit the expectations of our software.

Proposals to eliminate programming have surfaced repeatedly over the decades. Yet many people believe that any tool general enough to handle the proverbial next problem is likely to have a certain minimum level of complexity. With generality and power comes unavoidable complexity.

The right tool makes any job easier. We commonly use:

- Spreadsheet programs (beginning with VisiCalc, then Lotus 1-2-3, MS Excel, Gnumeric)
- Statistics packages (SAS, Stata, SPSS, Systat, R, S-Plus, etc.)
- Domain-specific visual languages (Madonna, software for cost-effectiveness)
- Database programs (Oracle, Access)
- Programming languages/Scripting languages (MATLAB, R, S-Plus, Perl, Python, Smalltalk, etc.)

In this class, we concentrate on the fifth of these, namely programming languages. In particular, we will work largely with R, a scripting language and statistics package.

The R project has a homepage, where you may obtain the software and manuals. The manual describes R as a dialect of S, a data analysis language originally developed at AT&T.

Although R differs from S in some ways, the classic books on S are nevertheless quite useful. Here are several books that are worth looking at:

- **The New S Language** by R. A. Becker, J. M. Chambers, and A. R. Wilks, Wadsworth and Brooks/Cole Computer Science Series, Pacific Grove, California, 1988. *This was written by the inventors of S to introduce the language; informally called the "blue book".*
- **Statistical Models in S**, ed. by J. M. Chambers and T. J. Hastie, Wadsworth and Brooks/Cole Computer Science Series, 1992. *Introduces statistical analysis and computing in S; sometimes called the "white book".*
- **Modern Applied Statistics with S-Plus** by W. N. Venables and B. D. Ripley, Springer Verlag, New York, 1994.
- **An Introduction to S and S-Plus** by Phil Spector, Duxbury Press, Belmont, California, 1994.
- **Modern Applied Biostatistical Methods using S-Plus** by Steve Selvin, Oxford University Press, New York, NY, 1998.
- **Data Analysis and Graphics using R** by J. Maindonald and J. Braun, Cambridge, 2003.

In the first semester, we will focus largely on the fundamentals of computation, using examples drawn from epidemiology and public health. In the second semester, we will look at more complex models involving stochastic simulation of disease transmission on networks, cost-effectiveness analysis, and demography.

In terms of administrative requirements, I'd like each of you to find an interesting problem in your own work or research to which we can apply computation. Each person will be asked to prepare a short two-page analysis of the problem of their choice, due at the end of the semester. For those taking the class for a letter grade, we will have short weekly problem sets. Our goal is to provide each student with maximum opportunity for challenge and feedback, in a flexible way that meets the needs of working professionals.

Outline.

- Literals: Numeric, Character, and Boolean, and simple operations. Function calls.
- Identifiers, Assignment, and Functions.
- Vectors and vectorized functions. Random numbers. Plots.
- More about vectors. Boolean comparisons. Filtering. Named elements.
- Tables. Reductions. Fold. Outer.
- Decision (If/Else). Sequential repetition. Simple needle reuse example.
- The for loop.
- The while loop. Lists.
- Data frames. The switch statement. Compound statements.
- Named arguments. Default values.
- The binomial distribution. Exact confidence intervals. Optimization and root finding.
- Quantiles. A simple simulation.
- Reading data files. Analysis of infectivity of HIV from the Young Men's Health Study.
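
To give a flavor of how several of the topics above fit together, here is a minimal sketch in R. The clinic names and counts are made up purely for illustration (they are not course data), and the particular choices, such as the 10% cutoff and 1000 simulated draws, are assumptions of this example rather than part of the syllabus.

```r
# Hypothetical data: positive tests and people tested at three clinics.
# Named elements let us refer to clinics by name rather than by position.
positives <- c(clinicA = 4, clinicB = 11, clinicC = 7)
tested    <- c(clinicA = 50, clinicB = 80, clinicC = 35)

# Vectorized arithmetic: one prevalence per clinic, with no explicit loop.
prevalence <- positives / tested

# Boolean comparison and filtering: keep clinics with prevalence above 10%.
high <- prevalence[prevalence > 0.10]

# Reduction: pooled prevalence across all clinics.
pooled <- sum(positives) / sum(tested)

# Exact (Clopper-Pearson) 95% confidence interval for the pooled proportion,
# using binom.test() from the standard stats package.
ci <- binom.test(sum(positives), sum(tested))$conf.int

# A simple simulation: 1000 binomial draws at the pooled prevalence,
# summarized by quantiles.
set.seed(1)
sims <- rbinom(1000, size = sum(tested), prob = pooled)
quantile(sims, c(0.025, 0.5, 0.975))
```

Typing `prevalence`, `high`, or `ci` at the R prompt prints each result, which is a convenient way to explore what every line does.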

All content © 2003 Mathepi.Com (except R and Rweb).
