\chapter{Implementation Details}

The game is implemented in JavaScript, an object-oriented scripting language standardized as ECMAScript. Despite its name, JavaScript is not based on Java. It runs in the client-side web browser, and because it runs locally it keeps the player interface responsive and the web pages dynamic. We now look at some of the functions in the actual code.

\begin{itemize}
\item populateConfig
All the configuration values (e.g., number of causes, days until symptoms show) that the player chose on the initial screen are put into the `Config' module. These parameter values are displayed continuously throughout the game. {\em show screenshot with those parameter values}
\item generateQuestions
This function creates the main ``survey'' table, each of whose entries is a question mark, a y, or an n. The generateQuestions function chooses random causes from all possible causes and sets the symptoms column equal to the OR of these chosen causes. It also generates the data for each subject's exposure to all causes and for when the subject starts showing the symptoms. Finally, it chooses random positions, other than in the symptoms column, in which to put the desired percentage of `?'. {\em show screenshot with the various entries pointed out}
\item displayTable
The Main module described in chapter 4 is handled in this function. The displayTable function displays the main playing matrix (the ``survey table''). It then calls displayStat for the statistics module. An on-click event is attached to each `?' and bound to the updateStat function. If only one cause is involved, this function displays a correlation row instead of a two-dimensional correlation table. If there is more than one cause, it passes control to the correlation module to display the correlation table.
\item showCorrelationTable
This function generates a table in the `correlation' module showing the correlation of any two causes (ORed together) with the symptoms, based only on the entries in the cause columns that are y or n. The two-dimensional table is a square matrix with all possible causes along both the rows and the columns. {\em show screenshot with the correlation table highlighted and a single entry explained}
\item displayStat
\begin{itemize}
\item Bootstrap
Bootstrapping estimates a property of a whole population by repeatedly resampling a small portion of that population. There are many kinds of bootstrapping, but for the purposes of this game we use sampling with replacement.
\begin{figure}[ht] \centering \fbox{ \begin{minipage}{13 cm} \includegraphics[width=1.0\textwidth]{bootstrap.png} \caption{Bootstrapping} \end{minipage} } \end{figure}
Take the example of tap water in the game scenario of the figure. At this point 2 pairs are revealed that show the relationship between Tap water and the symptom. Those four data points (two per pair) give us the ``measured correlation.'' But we need to infer the relationship that would hold once all 5 pairs are revealed (i.e., once all questions about Tap water are answered). We therefore choose 5 pairs `with replacement' from the 2 available pairs and find the correlation of this new column. We repeat this procedure a fixed number of times for every column (in our case, 1000 times) and so obtain 1000 correlations.
\item Confidence Interval (involves merge sort)
We use the 1000 correlations found by bootstrapping to determine the range of correlations that are statistically consistent with the data. We sort the 1000 values and take the middle 95\% of them. The measured correlation will normally lie inside this interval. The narrower the confidence interval, the better the chance that the measured correlation is close to the true correlation.
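The bootstrap and confidence-interval steps described above can be sketched in JavaScript as follows. This is a minimal sketch under stated assumptions, not the game's actual code: the function names correlation, makeRng and bootstrapInterval are hypothetical, y/n answers are coded as 1/0, the correlation is assumed to be the Pearson coefficient, a resample whose column is constant is assigned a correlation of 0, and the built-in array sort stands in for the merge sort mentioned above.

```javascript
// Hypothetical helper names; the game's real code is not shown in this chapter.

// Pearson correlation of two equal-length arrays of 0/1 values.
// Assumption: returns 0 when either column is constant (correlation undefined).
function correlation(xs, ys) {
  const n = xs.length;
  const mean = (a) => a.reduce((s, v) => s + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - mx) * (ys[i] - my);
    sxx += (xs[i] - mx) ** 2;
    syy += (ys[i] - my) ** 2;
  }
  if (sxx === 0 || syy === 0) return 0;
  return sxy / Math.sqrt(sxx * syy);
}

// Small seeded linear congruential generator so that runs are repeatable.
// Returns a function producing numbers in [0, 1).
function makeRng(seed) {
  let state = seed >>> 0;
  return function () {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// pairs:  revealed [cause, symptom] pairs, coded as 0/1
// total:  number of subjects in the column (5 in the tap-water example)
// rounds: number of bootstrap repetitions (1000 in the text)
function bootstrapInterval(pairs, total, rounds, rng) {
  const correlations = [];
  for (let r = 0; r < rounds; r++) {
    const causes = [];
    const symptoms = [];
    // Draw `total` pairs with replacement from the revealed pairs.
    for (let i = 0; i < total; i++) {
      const pick = pairs[Math.floor(rng() * pairs.length)];
      causes.push(pick[0]);
      symptoms.push(pick[1]);
    }
    correlations.push(correlation(causes, symptoms));
  }
  // Sort ascending and keep the middle 95% of the values.
  correlations.sort((a, b) => a - b);
  const cut = Math.floor(rounds * 0.025);
  return { low: correlations[cut], high: correlations[rounds - 1 - cut] };
}
```

For the tap-water example, a call such as bootstrapInterval([[1, 1], [0, 0]], 5, 1000, makeRng(1)) would draw 5 pairs from the 2 revealed pairs 1000 times and return the bounds of the middle 95\% of the resulting correlations. The two pairs in this call are illustrative values, not data read from the figure.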
\end{itemize}
\item updateStat
Whenever the player clicks on a question mark in the survey table, this function is called in the background. It displays the answer and recalculates both the single and (when appropriate) the pairwise correlations. The function also advances the time value and checks whether a day has passed. If so, it changes any symptom values that may now be revealed (when the delay factor is greater than 0). Once these symptoms have changed, it recalculates all the statistics.
\item guess
This function is called when the player clicks on the name of a cause. It checks whether the chosen cause is in the `correctcause' array. If so, it turns that button green; otherwise the button turns red. In either case, if the game is not over (not all causes have been found), it updates the symptom values if necessary to reflect the day that has passed. When symptom values are updated, this function also updates the correlations.
\end{itemize}

\chapter{User Tests}

We conducted user tests with various users. {\em How many?} A few {\em How many?} of them had a technical background such as Computer Science, a few {\em How many?} worked outside the mathematical sciences, and a few {\em How many?} were statisticians or mathematicians. Each user played 5 games of increasing difficulty.

The first level had only one cause and 0\% of `?' with 0 delay. This was to give an idea of what needs to be done in the game. Only about 5 users {\em Out of how many?} were able to understand what needed to be done and how to go about finding the cause.

The second level again had one cause with 0 delay, but this time 70\% of the entries in the survey table were question marks. By this time everybody knew what needed to be done.
This level introduced the concept of asking `questions'. Almost all {\em How many?} took a guess without even asking a question, and almost all of them guessed correctly. Those who did not learned that a correlation based on few data points can be misleading.

In the third level, we introduced the delay factor and taught users how it might affect the statistics. We had to explain the delay concept here because 50\% of the users thought that a subject might be exposed to a cause within this delay.

In the fourth level, we set the delay to 0 again but had 2 causes to find. Users got the idea of `OR'ing the causes, but only 2 of them actually found the strategy of using the `n's in the symptoms column. We gave the others some hints.

In the fifth level, {\em What were the conditions?} users could play without guidance and reported enjoying the game a lot. Some users {\em How many?} guessed a cause without any strategy. Others {\em How many?} tried to rule out cause pairs based on the column `OR'. Their reactions and thinking process were recorded as mp3 audio. {\em It would be great if the mp3s could be transcribed into the appendix. Each user's comments associated with each level. Of course, no names}

After playing, each of them was asked the following questions:
\begin{itemize}
\item How did you decide the cause?
\item At what point of the day do you generally guess a cause?
\item At what value of the statistics shown do you decide the cause?
\item How do you use the confidence interval?
\item When you are given 2 causes, do you use the bottom table?
\item Do you use the p-value?
\item What improvements can be made?
\item Feedback: what do you think about the game?
\end{itemize}

Our goal was to test what learning had taken place through playing the game, and in particular whether a user could understand the concept of a p-value or a confidence interval. Many users did not use either the p-value or the confidence interval. They did, however, use the correlation.
We have since removed the p-value, because it seemed so confusing in the context of a column with few answers. One user asked us to change the game so that the p-value and the confidence interval are used more. One user suggested longer buttons for the question marks, while another suggested adding a mouse animation when hovering over a question mark.

Nobody liked the apparently theoretical nature of the help file. Users suggested adding snapshots and videos to explain what is going on. {\em we should do this}

Users had suggestions regarding the graphical interface as well. Many did not like the frames, or having to scroll up and down every now and then to see a certain entry. They also suggested a pop-up when the mouse hovers over certain places. Some suggested a bound on how badly a user can do, e.g., not allowing a user to take more than 4 days to find the answer. Some also suggested that we display a timer that starts ticking as soon as the game starts. That way, even if a user does not ask any questions, there will be a sense of urgency.

After testing, we also sensed a need to tell players which combination of settings is easiest. Thus, along with the `start game' button, it is better to have buttons like `Level 0', `Level 1' and so on, at least up to `Level 5'. {\em we should do this}

\chapter{Conclusion}

\section{Evaluation}

The game aims to teach statistics, and the user tests suggest that it does so reasonably well. Players use the displayed statistics to play the game and make useful guesses based on them.

\section{Related Work}

\begin{figure}[ht] \centering \fbox{ \begin{minipage}{13 cm} \includegraphics[width=1.0\textwidth]{woods.png} \caption{Related Games : Woods} \end{minipage} } \end{figure}

There have been quite a few games that try to explain statistical concepts through simple animation.
For example, the University of Reading has developed a very simple animated Excel-based game to explain concepts such as sampling with ratio and the concept of variation. These games also introduce ideas of estimation and the role of standard errors in estimation. The figure above shows one such example, called Woods. In it, every plot has a certain number of small and big trees. The player is supposed to choose a few plots that together have the same ratio of small to big trees as all the plots combined. There are similar games teaching different statistical concepts. For example, the game Tomato helps in understanding the issues involved in experimental design, while the game Mice teaches the design of a multi-stage survey. {\em These need to be explained more with screen shots. Need to compare with our game.}

\section{Future Work}

This work takes statistical games in a new direction. The game generates its own data and then lets the user play with that data. It tries to make the concepts of correlation and bootstrapping active. Of course, any concept can be taken and placed in a real-life situation to make it easier to understand. As for this specific game, the user interface needs to be made more intuitive and more fun.