Day 31-34: Belief elicitation with a Shiny app

Hi. My name is Dani and it’s been a four days since I blogged about a new R stats package. So, um, if I’m going to meet my “100 packages in 100 days” target, I’m going to have to cover four packages in the one post. A little tricky if my goal here was to actually describe four packages in full, but fortunately I have something different in mind. I have a concrete research problem I need to solve, and I think I can do it with four packages.

This is a fairly long post, and I’m a bit too tired today to find fun pictures to go with it. So the short version is that I worked out how to build this shiny app for a belief elicitation task.

The problem

The problem is a variation on the one I mentioned in my Stan post earlier this month. It’s essentially an elicitation task. My colleague has a set of 9 categories that all fall (roughly) on a single continuous dimension, and we want to elicit people’s beliefs about the probability that a new observation falls within that category. The goal in this project is not to study whether people are any good at providing probability judgments (generally we’re not); rather, the hope is to sidestep as many of the traditional cognitive and decision making biases and elicit something akin to a subjective probability distribution defined over these 9 categories. One approach we are considering is to construct a simple web application that provides people with some simple controls they can play with to manipulate a probability distribution until they think it represents their beliefs. I’m not entirely sure whether this is the best elicitation method, but first things first: lets see what we can create!

The packages

  • shiny. Obviously! If I’m going to create a web application using R, where else would I start?
  • shinyjs. Very handy set of tools for adding useful bits of client-side javascript into a Shiny app
  • sn. Functions to provide a more flexible family of distributions.
  • rdrop2. Have the Shiny app upload data files to my Dropbox account.

Background

I happily admit that I’m still learning how to use shiny. I learned the basics of it earlier this year in order to convince myself that - at least in principle - it would be possible to implement a psychology experiment within a shiny app, which led to this mimimal “blankshiny” template. In that app, I assumed that each trial would display text within a content area, the user would manipulate a two-button slider, and then click the action button to submit their response. Upon doing so the shiny server would record the state of the slider and update the text for the next trial. My goal here is to allow interactivity within a trial: replace the text area with a plot, and update the state of the plot whenever the sliders are adjusted. So here goes…

Step 1: Structure of the application

The way I ended up structuring the application was to break it into four scripts. First there’s the app.R script which loads the packages, sources the other three scripts, and then starts the application running:

# load packages
library(shiny)   # the shiny package
library(rdrop2)  # needed to save data to dropbox
library(sn)      # need to draw the figure
library(shinyjs) # shinyjs is handy

# structure of the application
source("makeUI.R")       # construct the user interface
source("makeServer.R")   # construct the server function
source("flowControl.R")  # how does the experiment progress (and get saved)?

# create the shiny app
shinyApp(ui, server)

The division of responsibilities for the three scripts is as follows:

  • makeUI.R defines the user interface from the participant’s perspective. It lays out the page, specifies what objects should exist etc.
  • makeServer.R defines the server-side logic “within” a trial. It draws graphs and responds whenever the participant takes action
  • flowControl.R handles all the other server-side stuff. It handles transitions between trials, records data and then writes the data to file.

Step 2: What goes on the page?

Okay, let’s get started on the makeUI.R script. The first thing I need to include in this file is a set of R variables corresponding to the object that the webpage is going to have to display. It’s a simple app, and it only has three objects:

  • myContent is the area where the app will draw plots, so that’s defined using the plotOutput function
  • myResponse is the slider bar that the user can play with, so I define it with sliderInput (specifying two values to ensure that I get a two-button slider)
  • myAction is the button participants click to end the trial, and because I want it to keep track of the number of times it’s been clicked (thereby serving as a trial counter) I define it using the actionButton function.

So I get this as the first part of the makeUI.R script:

myContent <- plotOutput( 
  outputId = "content",
  width = "100%",
  height = "400px"
  )

myResponse <- sliderInput(
  inputId = "response", 
  label = "", 
  min = -4, 
  max = 4, 
  step = .25,
  ticks = FALSE,  
  value = c(-1, 1),
  width = "90%"
  )

myAction <- actionButton(
  inputId = "action", 
  label = "submit"
  )

Note that each of these has either an inputId or an outputId value, used by the server to control or respond to these objects. For instance, the slider has inputId = "response", so on the server side the slider is input$response. Similarly, the content area has outputId = "content" so the server refers to this object as output$content.

Step 3: Organising the controls on the page

A shiny app needs to specify the user interface, by indicating how these objects are to be organised on the page. So the second half of the makeUI.R script defines the ui object that tells the shiny app how these objects will be laid out on the page.

At the time I built my first shiny app I didn’t really understand what it was doing, because shiny uses bootstrap templates to construct the web page itself, and I’d never used bootstrap before. Since then, however, I’ve rewritten my lab website using bootstrap, so it now seems a little less like magic. As it happens though, the app that I need is really simple from a user interface point of view. It just has one column with three objects stacked vertically: the myContent object (which will be the plot), the myResponse object (the slider) and the myAction object (the submit response button).

So we define the user interface variable ui as follows:

ui <- fluidPage( 
  fluidRow( # the page has one row...
    column( # ... and the row has one column
      
      # information about the column
      width = 8,         # specify the column width (between 1 and 12)
      align = "center",  # how to align content inside the column
      
      # study-specific objects (see the "study.R" file)
      myContent,  # content area
      myResponse, # response area
      myAction    # action button
    
    )
  )
)

Okay, well, not quite. It’s a psychology experiment so there’s going to be a sequence of trials, at which point it will end, and I’m going to want to hide all the user interface elements and display a simple “thank you” message. For now, I’m just going to do that client side, so I’ll use the shinyjs package to insert an intially-hidden thank you message that can be revealed later. So my user interface object is actually defined like so:

ui <- fluidPage( 
  fluidRow( # the page has one row...
    column( # ... and the row has one column
      
      # information about the column
      width = 8,         # specify the column width (between 1 and 12)
      align = "center",  # how to align content inside the column
      
      # study-specific objects (see the "study.R" file)
      myContent,  # content area
      myResponse, # response area
      myAction,   # action button
      
      # hidden thank you message
      hidden(p(id="thankYou", style="font-size:200%", "All done! Thanks")),
      
      # force the ui to use shinyjs
      shinyjs::useShinyjs()
      
    )
  )
)

Step 4: Server logic

In my original app, I wrote the server logic on the assumption that the server would only ever have to respond when the user clicks the action button (i.e., submits the response). That makes sense for many psychology experiments where the stimulus is a static object, I’m going to have to make some changes. Previously, I’d defined the server like this:

server <- function(input, output) {
  
  # read the stimulus materials
  study <- myStimuli()
  
  # how to handle the end of the current trial?
  observeEvent(
    eventExpr = input$action, # when user clicks the action button
    handlerExpr = endTrial(input, study) # end this trial
  )
  
  # how to generate the next trial?
  output$content <- renderText(
    expr = processContent( study[[input$action+1]] )
  )
  
}

This server logic has three parts:

  • When the server is defined it reads all the stimulus materials for the study by calling the the myStimuli() function, which is expected to return a list such that study[[i]] contains all the information relevant to the i-th trial.

  • It defines only one kind of reactivity, defined by the call to observeEvent: whenever the user clicks on the action button (eventExpr = input$action) the trial ends, and the server responds by calling the endTrial function, which is defined as part fo the flowControl.R script

  • It generates the output$content (i.e., writes to the myContent part of the UI) by calling the processContent function, using information from the current trial to do so. This works because the value of an action button starts at 0 and increments by 1 every time it is clicked, which makes it ideally suited to keep track of the trial number. In other words, study[[input$action+1]] contains the information relevant to the next trial

That server logic is fine for the most part, but it’s not going to work perfectly for the application I want. In the current form, output$content only depends on input$action, so the server won’t be triggered to do anything unless the user clicks the action button. To fix that, and make sure the server updates the content whenever the slider state changes, I’ll add a dependency on input$response:

server <- function(input, output) {
  
  # read the stimulus materials
  study <- myStimuli()
  
  # how to handle the end of the current trial?
  observeEvent(
    eventExpr = input$action, # when user clicks the action button
    handlerExpr = endTrial(input, study) # end this trial
  )
  
  # how to generate the next trial?
  output$content <- renderPlot(
    expr = processContent( study[[input$action+1]], input$response) 
  )
}

I’ve also switched from renderText to renderPlot when specifying how the server should render the content, because for this app I want to draw pictures rather than write text

So, to complete the server I need to define three functions:

  • myStimuli() defines the stimuli
  • processContent() tells the server how to update the plot
  • endTrial() tells the server how to handle an end-of-trial event

Step 5: Defining the stimuli

At the moment I’m not too interested in what the stimuli will be, so all I’m going to do is return just a dummy list,

myStimuli <- function() {
  stimuli <- list(
    "Trial 1 of 3",
    "Trial 2 of 3",
    "Trial 3 of 3"
  )
  return(stimuli)
}

and the only thing we’re going to use that for is to write the title to the plot!

Step 6: Telling the server what plot to draw

Next I need to define processContent, which is the function that the server will call every time it needs to redraw the figure. It takes information about the stimulus (used as the plot title), the params vector that contains information about the state of the slider bar, and a dummy plot variable that specifies whether it’s actually supposed to draw the plot…

processContent <- function(stimulus="", params=c(0,1), plot=TRUE) {

  # parameters
  xi <- mean(params[1:2])  # location
  omega <- (params[2] - params[1])/2  # scale
  
  # don't let the variance be zero
  if(omega == 0) omega <- .1
  
  # use a binned, truncated skew-t distribution with heavy 
  # tails (nu = .5) to map the slider state onto a distribution
  x <- -4.5:4.5
  nbins <- length(x)-1
  p <- pst(x, xi, omega, alpha = 0, nu = .5)
  p <- p[-1] - p[-(nbins+1)]
  p <- p/sum(p)
  x <- x[-1] - .5
  
  # if requested draw a barplot
  if(plot){
    par(mar=c(3,0,3,0))
    barplot(p, ylim=c(0,.8), axes = FALSE, col = "black",
            names.arg =  LETTERS[1:nbins], cex.names = 2, 
            main = stimulus)
  }
  
  # return the probabilities of each bin
  return(p)
  
}

I’ve used the pst function in the sn package to calculate the cumulative distribution function for the skew-t distribution. It’s a little overkill, since I’m currently not using the skewness parameter for anything, but at some point I’d like to extend the UI to allow that.

In any case, that completes the makeServer.R function! Only one to go!

Step 7: Flow control

The experiment that I’ve defined here is very simple. All it does is run three trials, getting the user to do something with the slider in each case, and then ends. So the flow control for the study is pretty simple. All we have to do is record the state of the slider at the end of each trial, reset the slider, and display the next stimulus. If we’ve reached the end of the experiment, we need to hide all the controls and say thank you to the user:

# ---- handle end of trial events (incl. store data) ----

# initialise the data set
responses <- data.frame() 

# handler for the end-of-trial event
endTrial <- function(input, study) {
  
  # trial number
  trialNum <- input$action
  
  # get the data for this trial
  trialData <- getTrialData(input, study[[trialNum]]) 
  
  # update data storage (this is not an efficient mechanism!)
  if(trialNum == 1) {
    responses <<- trialData
  } else {
    responses <<- rbind(responses, trialData)
  }
  
  # end the experiment if needed
  if(trialNum == length(study)) {
    
    # tidy up
    hide("response")
    hide("action")
    hide("content")
    shinyjs::show("thankYou")
    
    # save data 
    saveData(method = saveMethod)
    
  } else {
    
    # before each trial, use shinyjs to reset the response object
    reset("response") 
    
  }
}

For this to work, we need two more functions:

  • getTrialData() should return a data frame containing information about the current trial
  • saveData() should write data to file

Step 8: Specifying what to record about the trial

getTrialData <- function(input, stimulus) {
  
  trialData <- data.frame(
    trial = input$action,      # the trial number
    question = stimulus,       # what question was posed? 
    lower = input$response[1], # response (lower value)
    upper = input$response[2]  # response (upper value)
  )
  
  # get the actual bin probs
  p <- processContent(params = input$response, plot = FALSE)
  nbins <- length(p)
  varnames <- paste0("p",1:nbins)
  trialData[varnames] <- p
  
  return( trialData )
}

Step 9: Saving data

Back when I started playing around with shiny, I discovered the rdrop2 package that is super-handy for allowing your app to upload the data files to dropbox. This post is getting long, so here’s the code I used to save the data:

# ---- information needed for saving data ----

saveDir <- "path/to/storage/location"
saveMethod <- "dropbox"


# ---- saving data  ----

saveData <- function(method="none") {
  if(method == "dropbox") saveDropboxData(saveDir) 
  if(method == "local") saveLocalData(saveDir) 
}

saveLocalData <- function(outputDir){
  fileName <- makeFileName()                 # file name
  filePath <- file.path(outputDir, fileName) # path to file
  writeData(filePath)                        # write file
}

saveDropboxData <- function(outputDir) {
  fileName <- makeFileName()                 # file name
  filePath <- file.path(tempdir(), fileName) # temporary directory
  writeData(filePath)                        # write to temp directory
  drop_upload(filePath, path = outputDir)    # upload to dropbox
}

makeFileName <- function() {
  timeStampString <- format(Sys.time(), "%Y-%m-%d-%H-%M-%S") # get system time
  fileName <- paste0("data-", timeStampString, ".csv")       # timestamped file name 
  return(fileName)
}

writeData <- function(filePath) {
  write.csv(
    x = responses,     # the data
    file=filePath,     # where to write it
    row.names = FALSE  # don't add row names to the csv
  )
}

Publishing

At this point I have my four scripts: app.R, makeUI.R, makeServer.R and flowControl.R. From here it’s straightforward. I source the app.R file, click on “Publish” to upload the app to http://www.shinyapps.io/, and we’re done!

Yay!

Issues

  • The app isn’t very responsive. That is probably not a problem for a simple experiment (we can explain to people that they need to wait a second for the distribution to redraw) but it’s not ideal
  • I worry about scalability. If we run this as a study on Amazon Mechanical Turk then we’ll have the problem of many users hitting the site at the same time. I don’t know what’s happening server side, but I have a horrible feeling I’m going to need to learn something about load balancing (which I’ve never had to do previously because my studies are almost exclusively client-side and Google App Engine handles that stuff seamlessly anyway)
  • How do I get rid of those annoying numeric labels from the slider bar? They don’t really map onto anything meaningful and I think they’re distracting
  • Is there a way of defining a three-point slider bar in Shiny? I played around with having the skewness on a separate one-point slider but it’s counterintuitive. I feel like it makes more sense to have three points on the one slider and then map those locations onto a distribution.
  • How much does my decision to restrict it to unimodal heavy-tailed distributions matter? In general it’s probably dubious but in the research context unimodality is probably fine, and at the moment I don’t want to induce any weirdness by having some of the bars drop to near-zero in the plot. I think people might find that odd. But that’s purely an intuition, and I don’t really know this literature well enough to feel confident in my choices.
Avatar
Danielle Navarro
Associate Professor of Cognitive Science

Related