We have now seen two mechanisms for sequential repetition: recursive function definitions and the counted (for) loop. Today we will learn a third important mechanism: the conditional (while) loop.
The while loop is used for repeating an action as long as a condition is true (that is, until the condition becomes false).
As an example, let's toss a coin until it comes up tails, and keep track of how long this takes.
> ntrials <- 1
Here, we begin the loop with the keyword while, followed by a boolean expression in parentheses. Then, there follows an action to be repeated, typically a compound expression in curly braces representing a sequence of actions. We'll talk more about the curly braces later, but you can see the pattern above. In the coin example you just saw, the boolean expression drew a random coin toss and tested to see whether it was a head. The body of the loop simply added one to the number of trials. Note that before we begin the loop, we initialized the variable ntrials to equal one.
So how does this work? The first thing that happens is that ntrials is set to one. Then, we enter the while loop. First, the loop condition is tested: we evaluate the boolean expression in the parentheses. In this case, we get true if we drew a head and false if we drew a tail on the coin toss. If we got a head, then the boolean expression is true and the loop body is entered. The statements in the body are evaluated; here, ntrials is set to its former value (1) plus one, so at the end of the loop body, ntrials is two. Control then returns to the top of the loop, where the condition is tested again with a fresh coin toss, and the cycle repeats for as long as we keep drawing heads. But if we got a tail, the boolean expression is false and the loop body is not entered; the loop ends and control resumes at the first statement after the loop. At the end of the loop, the variable ntrials will equal the number of times we tossed the coin.
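The steps just described can be sketched as follows. This is only a minimal sketch: we assume a fair coin and simulate a toss with runif(1) < 0.5, treating TRUE as a head. That choice is ours for illustration, and any other way of drawing a random head or tail would do as well.

```r
# Assumption: a fair coin, where runif(1) < 0.5 counts as a head.
ntrials <- 1
while (runif(1) < 0.5) {    # toss the coin; TRUE means we drew a head
  ntrials <- ntrials + 1    # a head, so count one more toss
}
ntrials                     # number of tosses, including the final tail
```

Run it several times: the result is usually small, but it varies from run to run.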
In this case, the variable ntrials will follow a geometric distribution, and since it is possible to toss a tail, the loop will eventually terminate with probability one. But of course in more complex problems, you might make a mistake, and the boolean expression might return true forever. The loop would then never terminate normally, and what happens next depends on your system.
> # Here is an infinite loop
> # Don't run this loop!
By the way, I slipped in a comment in the previous example. A comment is just a note to the reader to help document the code. Anything that follows a sharp sign (#) to the end of the line is a comment; the R system will simply ignore it. Any R program of any real usefulness will require comments, and it is your responsibility to keep the comments up to date and correct.
This is about all there is to the conditional loop. You could use a conditional loop instead of a counted loop:
> ii <- 1
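A minimal sketch of a conditional loop doing the job of a counted loop, assuming we simply want to print the numbers one through five (the upper bound is chosen only for illustration):

```r
ii <- 1                 # initialize the loop variable ourselves
while (ii <= 5) {       # test the condition before every pass
  print(ii)
  ii <- ii + 1          # advance the counter; forgetting this gives an infinite loop
}
```

This does the same work as for (ii in 1:5) print(ii), except that we must initialize and increment ii by hand, and ii equals six when the loop ends.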
R also provides a statement called break, which immediately exits a loop. Here is how it works:
> ii <- 1
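As a hedged sketch of break in action, assume again that we want to stop after five passes (the bound is ours, for illustration). The loop condition is always TRUE, so only the break statement can end the loop:

```r
ii <- 1
while (TRUE) {          # the condition alone would never end the loop
  if (ii > 5) {
    break               # leave the loop immediately
  }
  print(ii)
  ii <- ii + 1
}
```

When break is executed, the rest of the loop body is skipped and control passes to the first statement after the loop.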
Another useful thing to know about is the next statement. When you execute a next statement inside a loop body, you skip the rest of the loop body and jump to the beginning of the next cycle of the loop. We'll see examples of this later.
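For a quick feel for next, here is a small sketch that adds up the odd numbers from one to ten. The variable name odd.sum and the range are our own choices for illustration.

```r
odd.sum <- 0
for (ii in 1:10) {
  if (ii %% 2 == 0) {
    next                      # even number: skip the rest of the body
  }
  odd.sum <- odd.sum + ii     # reached only for odd ii
}
odd.sum                       # 1 + 3 + 5 + 7 + 9 = 25
```

Unlike break, next does not end the loop; it only abandons the current cycle.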
In the next example, we nest two loops. The outer loop is really a counted loop, using the loop variable ii. The inner loop terminates whenever the break command is executed, and this happens whenever jj exceeds four. But because the inner loop is located inside the outer loop, it will be executed again on each pass of the outer loop, until the outer loop terminates.
> ii <- 1
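A sketch of two nested loops of this kind might look like the following. We assume the outer loop runs three times; that bound, and the use of cat for printing, are ours for illustration.

```r
ii <- 1
while (ii <= 3) {             # outer loop: really a counted loop over ii
  jj <- 1
  while (TRUE) {              # inner loop: ends only through break
    if (jj > 4) {
      break                   # exits the inner loop only
    }
    cat("ii =", ii, " jj =", jj, "\n")
    jj <- jj + 1
  }
  ii <- ii + 1                # the outer loop carries on after the break
}
```

Note that break leaves only the innermost loop that contains it, so the outer loop still completes all of its passes.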
Nested loops are a common way to do something for all combinations of certain variables. Remember we called this the outer pattern. Let's imagine we have four states and five numbers, and we want to print all possible pairs. We can do this easily with nested loops:
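A minimal sketch of this pattern, assuming the four states are AZ, CA, NV, and NY and the five numbers are one through five (both choices are ours for illustration):

```r
states <- c("AZ", "CA", "NV", "NY")   # four states, chosen for illustration
nums <- 1:5                           # five numbers
npairs <- 0
for (st in states) {                  # outer loop: one pass per state
  for (nn in nums) {                  # inner loop: every number for that state
    cat(st, nn, "\n")
    npairs <- npairs + 1
  }
}
npairs                                # 4 * 5 = 20 pairs
```

The inner loop runs to completion for each value of the outer loop variable, which is why every combination appears exactly once.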
Next, we need to learn about the heterogeneous collection structures that R provides. Today we will discuss the list, and next time the data frame.
A list is quite similar to a vector, except that a list can contain objects of different types, whereas a vector cannot. The elements of a list are also accessed differently.
Let's create a simple list, using the constructor function list:
> new.list <- list("CA",2,TRUE,81.0)
> new.list
[[1]]
[1] "CA"

[[2]]
[1] 2

[[3]]
[1] TRUE

[[4]]
[1] 81

Note that some of what is printed looks very much like a vector of length one. For instance, we have seen things like this before: [1] "CA". Notice that we see four of these vectors of length one, and each begins with [1]. But each of the four is preceded by a different label: its item number in double brackets. As is often the case in R, the way the result is printed is a clue as to how to access the item.
In this case, we have created a list with four items in it, and we can access the items by numbers using double brackets:
> # continuing the example above...
> new.list[[1]]
[1] "CA"
> is.character(new.list[[1]])
[1] TRUE
> is.numeric(new.list[[2]])
[1] TRUE
Lists can contain longer vectors. These are not flattened out:
> another.list <- list(c(1,2,3),"CA",c(TRUE,TRUE))
> another.list[[1]]
[1] 1 2 3
> is.character(another.list[[1]])
[1] FALSE
> another.list[[2]]
[1] "CA"
The length function will tell you how many items there are in a list:
> another.list <- list("CA",1,2,TRUE,1:100)
> length(another.list)
[1] 5
Lists can nest inside each other also:
> another.list <- list("CA",2,list(1:10,"NY"))
> another.list
[[1]]
[1] "CA"

[[2]]
[1] 2

[[3]]
[[3]][[1]]
 [1]  1  2  3  4  5  6  7  8  9 10

[[3]][[2]]
[1] "NY"

Lists can even contain functions, unlike vectors:
> zz <- list("CA",sqrt)
> zz[[2]](4)
[1] 2
You can name items in a list as well:
> the.list <- list(state="CA",num=4)
> the.list[["state"]]
[1] "CA"
> the.list$state
[1] "CA"
> the.list$num
[1] 4
> second.list <- list(num=9,fn=sqrt)
> second.list$fn(25)
[1] 5
A common use for lists is to return more than one value from a function. The last expression evaluated in the body of a function is the function's value. What if you want to return a string and a number? You make sure the last expression evaluated is a list constructor:
> simple.example <- function() {
+   achar <- sample(c("AZ","CA","NV"),1,replace=TRUE)
+   anum <- rnorm(1)
+   list(achar,anum)
+ }
> zz <- simple.example()
> zz
[[1]]
[1] "CA"

[[2]]
[1] -0.239634

How about a wilder example? Let's pick a number at random from one to three, and return a random function, together with a random number:
> example.fn <- function() {
+   flist <- list(sqrt,exp,function(x){x^2})
+   ind <- sample(1:length(flist),1,replace=TRUE)
+   anum <- rnorm(1)
+   list(fn=flist[[ind]],num=anum)
+ }
> zz <- example.fn()
> zz
$fn
function (x) x^2

$num
[1] -0.7988386

Here is another way to write the function:
> example.fn <- function() {
+   flist <- list(sqrt,exp,function(x){x^2})
+   list(fn=flist[[sample(1:length(flist),1)]],num=rnorm(1))
+ }
> zz <- example.fn()
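Once the list comes back, its pieces can be pulled out by name and used like any other values. The sketch below repeats the definition so that it runs on its own; the argument 4 passed to the returned function is our own choice for illustration.

```r
example.fn <- function() {
  flist <- list(sqrt, exp, function(x) { x^2 })
  list(fn = flist[[sample(1:length(flist), 1)]], num = rnorm(1))
}

zz <- example.fn()
names(zz)           # "fn" "num"
is.function(zz$fn)  # TRUE: the first item really is a function
zz$fn(4)            # 2, exp(4), or 16, depending on which function was drawn
zz$num              # the random number that came back with it
```

Naming the items in the returned list means the caller does not have to remember their positions.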
With double brackets, you may not select more than one element of a list using a vector subscript. Inside double brackets, a vector subscript is interpreted hierarchically. So a subscript like c(1,2) refers to the second element of the first element in the list:
> zlist <- list(1:3,1:5,c("AZ","CA"),1:2)
> zlist[[c(1,2)]]
[1] 2
> zlist[[c(3,1)]]
[1] "AZ"
> zlist[[c(4,4)]]
Error in zlist[[c(4,4)]] : subscript out of bounds
Nor can you supply two separate subscripts inside the double brackets:
> zlist <- list(1:3,1:5,c("AZ","CA"),1:2)
> zlist[[1,2]]
Error in zlist[[1,2]] : incorrect number of subscripts
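To make the hierarchical reading concrete, the sketch below checks that a vector subscript in double brackets gives the same result as chaining two double-bracket subscripts. It also shows single brackets, which go a little beyond the examples above: on a list they return a shorter list and so can pick out several items at once. The name sub is ours for illustration.

```r
zlist <- list(1:3, 1:5, c("AZ", "CA"), 1:2)

# Hierarchical indexing: c(3,1) means "first element of the third item".
identical(zlist[[c(3, 1)]], zlist[[3]][[1]])   # TRUE

# Single brackets select several items and return them as a list.
sub <- zlist[c(1, 3)]
is.list(sub)      # TRUE
length(sub)       # 2
sub[[2]]          # "AZ" "CA"
```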
So R's lists can be used for many things. They can be used to package together multiple elements for return from a function. They can also be used to group together related elements to produce a composite data structure representing an object. And they can be used to create hierarchical data structures; we won't work with this sort of thing much this semester, but such structures can be indispensable.
Next time we will learn some of the built-in tools for working with lists, especially lapply for applying a function to every element of a list. We'll also learn about the data frame and a few other specialized data structures R provides. Finally, we'll talk about the switch statement, and this will conclude our overview of the basic structures of the language.