Writing working code—and debugging it

# Get started
set.seed(1234)
library(dplyr)

Two important error messages

> Error in [insert some code] : ! argument is of length zero

x <- c(1, 2, 3)

x[0]
numeric(0)
if (x[0]>2) print(x)
Error in `if (x[0] > 2) ...`:
! argument is of length zero

This means that some object you tried to work with was empty; it simply didn’t exist as you thought or didn’t contain anything. In the above case, the vector x does not have an “index 0” to subset, so x[0] returns an empty vector, the comparison x[0] > 2 is empty as well, and if() is left with nothing to evaluate (fun fact: in most other languages, the first position of a list-like object is position 0).
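One way to guard against this is to check the length of a value before handing it to if(). The sketch below is only an illustration; the names candidate and status are made up for this example.

```r
x <- c(1, 2, 3)

# Index 0 selects nothing, so the result is a zero-length vector
candidate <- x[0]
length(candidate)        # 0

# The comparison is zero-length too, which is what if() complains about
length(candidate > 2)    # 0

# Check the length first; && stops evaluating once the left side is FALSE
if (length(candidate) == 1 && candidate > 2) {
  status <- "greater than 2"
} else {
  status <- "nothing to compare"
}
status                   # "nothing to compare"
```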

> object of type ‘closure’ is not subsettable

printer <- data.frame(size = c(1, 2, 3), color = c("red", "orange", "blue"))
printer[1,] # works fine
  size color
1    1   red
print[1,] # here the print function is the object of type 'closure'
Error in `print[1, ]`:
! object of type 'closure' is not subsettable

You wanted to subset an object, but you accidentally tried to subset a function; in this case, the print() function.
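If you are unsure what a name refers to, a quick check along these lines can help before you subset it. This is a small sketch using base R only, with the printer data frame from above recreated so it runs on its own.

```r
printer <- data.frame(size = c(1, 2, 3), color = c("red", "orange", "blue"))

# print is a function, which R stores as a 'closure'
is.function(print)       # TRUE
typeof(print)            # "closure"

# printer is a data frame, so subsetting it with [ , ] is fine
is.function(printer)     # FALSE
is.data.frame(printer)   # TRUE
```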

First steps when encountering an error message

Check the error message carefully! Sometimes the error message tells you which file and line number the error occurred on.

Debugging and sanity checks - useful before errors even occur

Use the scientific method

Go through everything, line by line. For each segment of code, form a hypothesis about what the output should be for a given input, then run it and compare. This might quickly prove you wrong and expose a mismatch between what the code is meant to do and what it actually does.
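Hypotheses like these can be written down as checks with stopifnot(). The following is a minimal sketch; rescale01() is a hypothetical helper invented for this example, not something from a package.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1]
rescale01 <- function(v) {
  (v - min(v)) / (max(v) - min(v))
}

scaled <- rescale01(c(2, 4, 6))

# Hypotheses about the output, written down as checks
stopifnot(
  length(scaled) == 3,
  min(scaled) == 0,
  max(scaled) == 1,
  isTRUE(all.equal(scaled, c(0, 0.5, 1)))
)

# An input that breaks the hypothesis: constant values divide 0 by 0
rescale01(c(5, 5, 5))    # NaN NaN NaN
```

The last call does not raise an error, which is exactly why a sanity check is useful: the function silently returns NaN for constant input.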

Advanced functionality: when you see an error

The call tree

Here’s some code. The details of the functions are irrelevant, except that make_df() takes an input and passes something on to pass_on_df(), which (conditionally) calls the return_error() function. When you use make_df(), you might not realise all of this is happening behind the scenes.

make_df <- function(x) {
  df <- data.frame(y=x+10)
  pass_on_df(df)
}

pass_on_df <- function(x) {
  # ...
  if(x$y > 11) return_error(x$y)
}

return_error <- function(x) {
  stop(paste0(x, " not valid input"))
}

make_df(10)
Error in `return_error()`:
! 20 not valid input

When you see this error in RStudio, it looks as follows:

If you click “Show Traceback” on the right, you’ll see:

From bottom to top, you can see the order in which functions were called, until the error occurred. Now you know the error wasn’t raised by make_df() directly but further downstream: make_df() called pass_on_df(), which in turn called return_error(). This may not always give an obvious solution, but at least it can help you find out which package/function you should be Googling to understand the error message.
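Outside RStudio, calling traceback() in the console right after an error prints the same kind of call stack. The sketch below shows one way to capture the stack programmatically with base R's sys.calls(); the variables call_stack and fn_names are made up for this example, and the three functions are repeated so the block runs on its own.

```r
make_df <- function(x) {
  df <- data.frame(y = x + 10)
  pass_on_df(df)
}

pass_on_df <- function(x) {
  if (x$y > 11) return_error(x$y)
}

return_error <- function(x) {
  stop(paste0(x, " not valid input"))
}

call_stack <- NULL
error_message <- tryCatch(
  withCallingHandlers(
    make_df(10),
    error = function(e) {
      # The handler runs on top of the failing call, so sys.calls()
      # still lists every active call, outermost first
      call_stack <<- sys.calls()
    }
  ),
  error = function(e) conditionMessage(e)
)

# Keep only the function name of each call
fn_names <- vapply(call_stack, function(cl) deparse(cl[[1]])[1], character(1))

error_message    # "20 not valid input"
fn_names         # includes make_df, then pass_on_df, then return_error
```

The list also contains the error-handling machinery itself (tryCatch(), withCallingHandlers(), stop()), so expect more entries than just the three functions.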

Formal debugging tools

Below are a few closely related functions that can be useful for debugging. Sometimes you’ll see a button to “Rerun with Debug” right under the “Show Traceback” button we just discussed. Clicking it puts you inside the working environment (‘the scope’) of the functions so you can “see what they see”. That is, you see the data you put into the functions, and how they were manipulated at each step right until the error occurred. This can help reveal the cause.

browser()

Using this gives a similar experience; if you wrote a function yourself, you can write browser() anywhere within it to force a break in execution. It then allows you to inspect your function’s inner workings (and contents) up to and at that point.
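A minimal sketch of this, assuming a hypothetical function summarise_sizes() with an illustrative pause argument (neither is a convention from any package), so that the break only happens when you ask for it:

```r
# Hypothetical function; `pause` is an illustrative argument
summarise_sizes <- function(sizes, pause = FALSE) {
  doubled <- sizes * 2
  # When pause is TRUE, execution stops here and `sizes` and `doubled`
  # can be inspected at the Browse[1]> prompt
  if (pause) browser()
  sum(doubled)
}

summarise_sizes(c(1, 2, 3))                  # 12, runs straight through
# summarise_sizes(c(1, 2, 3), pause = TRUE)  # run interactively to step in
```

At the browser prompt, n runs the next line, c continues to the end of the function, and Q quits.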

debug()

This is useful for inspecting other people’s functions; it behaves as if a browser() statement had been added at the start of that function for you (undebug() removes it again).
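As a rough sketch, with add_ten() as a hypothetical stand-in for someone else’s function:

```r
# Hypothetical function standing in for someone else's code
add_ten <- function(x) {
  x + 10
}

debug(add_ten)         # flag the function: the next call opens the browser
isdebugged(add_ten)    # TRUE
undebug(add_ten)       # remove the flag again
isdebugged(add_ten)    # FALSE

add_ten(1)             # 11, runs normally now

# debugonce(add_ten) flags the function for a single call only
```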

breakpoints

In .R scripts, you can click to the left of the line numbers to add a small red dot. This is like inserting a browser() statement there.

trace() / untrace()

Like debug() but can be used to insert any other code of your choosing into a function.
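A small sketch of the idea, assuming it is run at the top level of a script or console. Both add_ten() and the trace_log environment are hypothetical names made up for this example; the inserted code simply records each input the function receives.

```r
# Hypothetical function to be traced
add_ten <- function(x) {
  x + 10
}

# Environment used to record what the traced function receives
trace_log <- new.env()
trace_log$seen <- c()

# Insert an expression that runs on entry to add_ten();
# print = FALSE suppresses the "Tracing ..." line on each call
trace("add_ten",
      tracer = quote(trace_log$seen <- c(trace_log$seen, x)),
      print = FALSE)

add_ten(1)
add_ten(5)
trace_log$seen         # the recorded inputs: 1 5

untrace("add_ten")     # restore the original function
```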

Warnings

Warnings are messages that don’t prevent your code from running. Treat these as you would errors, until you’re sure they’re harmless.

Because your code runs, they can be easy to miss, but usually they’re a package author’s way of letting you know you might not be using their package in the way it was intended.

They sometimes mean something horrible happened.
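To make sure a warning cannot slip by unnoticed, you can intercept it. The sketch below uses base R’s tryCatch(); the variable outcome is made up for this example.

```r
# as.numeric() would normally return 1 2 NA here and only warn about the NA;
# tryCatch() intercepts the warning and returns the handler's value instead
outcome <- tryCatch(
  as.numeric(c("1", "2", "three")),
  warning = function(w) paste("caught warning:", conditionMessage(w))
)
outcome

# options(warn = 2) turns every warning into an error for the whole session
```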

Preparing for errors

try() is handy when you know some code can cause an error but you don’t want it to break everything; you want to just skip over it in that case:

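log_ <- function(x) {
  # Try to take the log; silent = TRUE hides the error message
  result <- try(log(x), silent = TRUE)
  # If log() failed, fall back to returning the input unchanged
  if (inherits(result, "try-error")) x else result
}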
log_(2)
[1] 0.6931472
log_(1)
[1] 0
log_(0)
[1] -Inf
log_("a")
[1] "a"

This function takes any input and tries to take its log; if unsuccessful, it simply returns the input unchanged (yes, I know this is a dumb function).
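tryCatch() is a closely related base R function that lets you decide what happens when an error occurs. The sketch below is a hypothetical variant of the idea above, named safe_log() for this example, which reports the problem and returns NA instead of the input.

```r
# Hypothetical variant: return NA (with a message) when log() fails
safe_log <- function(x) {
  tryCatch(
    log(x),
    error = function(e) {
      message("log() failed: ", conditionMessage(e))
      NA_real_
    }
  )
}

safe_log(2)      # same value as log(2)
safe_log("a")    # NA, after a message explaining why
```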

When code runs forever

  • R code

If R never stops evaluating (e.g., stuck in an infinite loop), you can manually stop the process. This can be frustrating as you’re left with little idea of what went wrong.

  • Compiled code

If R crashes the moment you hit “Interrupt R”, the code that was running was probably written in a compiled language (C/C++). There’s no way out but to restart and get to debugging.
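For the pure R case, one defensive habit is to give loops an upper limit so they fail with an informative error instead of running forever. The loop below is a made-up illustration; the names and the cap of 1000 iterations are arbitrary choices.

```r
# Hypothetical loop: halve a value until it drops below a tolerance
value <- 100
tolerance <- 0.001
max_iterations <- 1000   # safety cap, an arbitrary choice for illustration
iteration <- 0

while (value > tolerance) {
  iteration <- iteration + 1
  if (iteration > max_iterations) {
    # Report where things stood instead of looping silently
    stop("No convergence after ", max_iterations,
         " iterations; value = ", value)
  }
  value <- value / 2
}

iteration                # 17
```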

More reading

Hadley Wickham’s tutorial on debugging