Last time we introduced Expected Value theory and modeled how it might behave on the survey. Today you are going to explore your own data (assuming you did the survey) as well as data from other students who have taken this class.

To get the data you need to download the file

riskyChoiceData_2020.mat

from D2L.

Once you've downloaded the data make sure you move it into the directory you are writing your scripts in for this week (i.e. the Week_07 directory in your NSCS folder on Dropbox).

Then, from a script, you can load the data like this

clear
load riskyChoiceData_2020

This loads a bunch of variables into Matlab. Let's take a look at what we've got using the "whos" command

whos

We've got 5 variables here

- QUS - a "cell" array of strings with the text of each question in the survey
- P - a vector of 12 probabilities for the risky option in each question
- V - a vector of the 12 winning amounts for the risky option in each question
- rsk - a matrix of people's responses to each of the 12 questions, with one row per participant and one column per question. 1 denotes a risky choice, 0 denotes a safe choice.
- BR - a vector containing the blink rates of all the people in the data set in blinks per minute

Note: Because I am making this example before this year's class completed the survey, I have fewer participants in my data set (146) than you will have.
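If you want to check how many participants are in your copy of the data, you can ask for the number of rows of the choice matrix (this assumes you've already run the load command above):

size(rsk, 1)   % number of rows = number of participants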

We can explore these variables more in the Command Window. Let's start with the text of the questions, which we can get by typing ...

QUS'

Each line here corresponds to the text of the risky option in each question (e.g. question 1 had a risky option of 50% chance to win $20). Remember that the safe option in this survey was always $10 for sure.

If you want to look at the text of just one question, say question 4, you can type

QUS{4}

Note that you have to use curly brackets { and } with cell arrays.

We're not actually going to do any more with QUS - it's mainly in here just to reorient you to the survey and in case you want to refer to the questions themselves.
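If you do want to print all the questions with their numbers at once, a quick loop over the cell array works. Here's a minimal sketch using a couple of made-up question strings so it runs on its own; with the real data you would loop over QUS itself:

% stand-in cell array (with the real data, use QUS instead)
QUS_demo = {'50% chance to win $20', '58% chance to win $11.93'};

for i = 1:length(QUS_demo)
    % curly brackets pull the string out of the cell
    fprintf('Question %d: %s\n', i, QUS_demo{i});
end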

Next let's take a closer look at P and V ...

P

V

P is a vector of 12 numbers telling you the probability of winning for the risky option in each question. So

P(4)

ans = 0.5800

consistent with a 58% chance on question 4.

V is a vector of 12 numbers telling you the value of winning for the risky option. So

V(4)

ans = 11.9300

consistent with a winning payout of $11.93 on question 4.

Using P and V we can compute the Expected Value of the risky option for each question. One way to do this is directly with a for loop

for i = 1:length(P)
    EV_risky(i) = P(i) * V(i);
end

Or you could do it using element-wise multiplication

EV_risky = P.*V;

Or you could reuse your EVtheory_survey function from last week. If you want to do this be sure to copy and paste the function into your directory ...

for i = 1:length(P)
    [EV_safe(i), EV_risky(i), choice(i)] = EVtheory_survey(10, P(i), V(i));
end

This approach also gives you the choice that pure EV theory would make for "free."
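Since pure EV theory just picks whichever option has the higher Expected Value, you can also get its choices without the function, using a vectorized comparison. A minimal sketch with two stand-in questions so it runs on its own (with the real data, skip the first two lines and use the loaded P and V; note that how exact ties, where EV_risky is exactly 10, get broken may differ from your EVtheory_survey function):

P = [0.5; 0.58];                % stand-in probabilities (use the loaded P)
V = [20; 11.93];                % stand-in winning amounts (use the loaded V)

EV_risky = P .* V;              % EV of the risky option for each question
EV_safe = 10 * ones(size(P));   % the safe option is always $10 for sure
choice = EV_risky > EV_safe;    % 1 where EV theory picks the risky option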

Let's see what those Expected Values are

EV_risky'

Note: I used the transpose here to transform EV_risky from a row vector into a column vector for easier viewing.

To get an even better idea about EV_risky let's plot it as a function of question number

clf;
plot(EV_risky, '.', 'markersize', 50)
xlabel('question number')
ylabel('EV_risky', 'interpreter', 'none')
set(gca, 'fontsize', 18)

This reveals the design of the experiment. I started with an "easy" question

QUS{1}

Which has simple numbers and where the Expected Value is just 10. Then I used more complicated questions like

QUS{2}

These questions were designed such that EV_risky would go from about 5 to about 15 in steps of 1. Hence the (very near) linear increase in EV_risky with question number from question 2 to 12 in the plot.
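You can check this design for yourself with diff, which returns the differences between consecutive entries. Here's a sketch using a stand-in EV_risky built with the same design (question 1 "easy" at 10, then a ramp from 5 to 15); with the real data, use the EV_risky you computed above, where the steps will come out as approximately, not exactly, 1:

% stand-in with the same design (with the real data, use your EV_risky)
EV_risky = [10, 5:15];     % question 1 is "easy", then the ramp

diff(EV_risky(2:end))      % steps between questions 2 and 12: all 1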

Why did I design the experiment like this? Well, I wanted a range of differences between EV_risky and EV_safe (which remember is always 10). This will allow us to compute the choice curve for people just like we computed the choice curve for the model last week. However, before we can compute the choice curve, we need to take a closer look at the choice data ...

The actual choices for each person on each question are in the matrix rsk. For me this matrix has size 146 x 12, meaning that there are 146 rows (1 per subject) and 12 columns (1 per question).

Note you will have more subjects in your data set than I do because it will also include this year's participants.

So if I look at rsk(103, 10) I get the choice of subject number 103 on question 10

rsk(103,10)

ans = 1

A value of 1 indicates that this person chose the risky option on this question.

If we want to look at all the choices made by participant 103 we can write

rsk(103,:)

which gives us a row vector of length 12.

Note the special use of the ":" here: it says "take the whole of row 103."

If we want to look at an entire column (i.e. all the responses to one question, say question 5) we can write

rsk(:,5)

which gives us a long column vector containing one entry per participant.
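Because the entries are 0s and 1s, the mean of a column is the proportion of participants who chose the risky option on that question, which is exactly the kind of quantity we'll want for the choice curve. A minimal sketch with a tiny stand-in matrix so it runs on its own (with the real data, use rsk):

rsk_demo = [1 0; 1 1; 0 0];    % stand-in: 3 participants x 2 questions

mean(rsk_demo(:,1))            % proportion risky on question 1 (here 0.6667)

% taking the mean along dimension 1 (down the rows) gives the
% proportion for every question at once
mean(rsk_demo, 1)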

Finally, we can view the whole matrix by just typing in

rsk

However, as nice as it is to look at individual raw data points (and I always recommend doing this when you have data of your own, for example in your project for this class), it would be nice if we could visualize things a bit better. One way to do this is to make an image of the matrix like this ...

imagesc(rsk)
xlabel('question number')
ylabel('participant number')
set(gca, 'fontsize', 18)

In this plot we get to see the whole matrix at once. Question number is on the x-axis and subject number is on the y-axis. 1s (risky choices) are yellow and 0s (safe choices) are blue. There's definitely some structure here - there's more yellow on the right of the plot and more blue towards the left. We'll explore this in more detail in a moment, but first a ghost story ...

It turns out that Matlab is haunted by the ghost of a child. You can see this for yourself by typing imagesc without any input ...

clf;
imagesc

In fact it's even spookier than that and there are actually all sorts of creepy dogs, people, and objects hidden in this image at different scales ... here's a gif I made of some of them ...