BIOS601 AGENDA: Thursday September 05, 2013

[updated August 19, 2013]

Discussion of computing and statistical inference issues in the assignment on sampling of locations on Earth's surface

Answers to be handed in for Q1, Q2, Q3, Q4 and Q5.

The first (general) computing issue is, if need be, to get up to speed in R. See the R links on the main course page. If you run into problems, let JH know as soon as possible.

A statistical/computing issue might be how to come up with a way to randomly sample locations on the surface of a sphere, using latitude and longitude co-ordinates. See the notes at the bottom of the file containing the 2 R functions inside the Oceanography link (on the height of the land and the depth of the ocean) inside the resources for surveys. JH thinks of the problem by visualizing the segments of a peeled orange!
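One common way to do this is sketched below; it is not necessarily how the two R functions in the Oceanography link do it. The sketch is in Python (the same idea carries over to R's runif and asin), and the function name sample_location and the seed are illustrative choices. Longitude is drawn uniformly, which picks an orange segment at random. Latitude cannot be drawn uniformly, because each segment narrows towards the poles; drawing sin(latitude) uniformly on [-1, 1] gives every equal-area band the same chance.

```python
import math
import random

def sample_location(rng):
    """Return one (latitude, longitude) pair, in degrees, uniform over the sphere's surface."""
    # Longitude: every 'orange segment' is equally likely.
    lon = rng.uniform(-180.0, 180.0)
    # Latitude: sin(latitude) is uniform on [-1, 1], so bands of equal area are equally likely.
    lat = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))
    return lat, lon

rng = random.Random(601)  # arbitrary seed, for reproducibility only
points = [sample_location(rng) for _ in range(20000)]

# Sanity check: half of the sphere's area lies between 30S and 30N, since sin(30 degrees) = 0.5.
frac_tropics = sum(abs(lat) <= 30.0 for lat, _ in points) / len(points)
print(round(frac_tropics, 3))
```

Drawing latitude uniformly between -90 and 90 instead would put too many points near the poles, which is the mistake the orange-segment picture is meant to prevent.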

Remarks: The statistical issues raised by this assignment include the distinction between standard deviation and standard error; the concept of a margin of error; when it is appropriate to use the Normal (Gaussian) approximation to the binomial distribution; the (often under-appreciated) centrality of the Central Limit Theorem (CLT) in applied statistical work, not just for the sampling distribution of a sample proportion, but also for that of a sample mean.
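As a small illustration of the first three of these ideas, the sketch below computes the standard deviation, standard error and margin of error for a sample proportion. The data (142 of 200 sampled locations falling on water) are made up for the example, the function name proportion_summary is hypothetical, and the "at least 5 in each category" check is only one of several rules of thumb for the Normal approximation.

```python
import math

def proportion_summary(successes, n, z=1.96):
    """Summarise a sample proportion; z = 1.96 corresponds to a 95% margin of error."""
    p_hat = successes / n
    # SD: variability of a single 0/1 observation; it does not shrink as n grows.
    sd = math.sqrt(p_hat * (1 - p_hat))
    # SE: variability of the sample proportion itself; it shrinks like 1/sqrt(n).
    se = sd / math.sqrt(n)
    # Margin of error: relies on the Normal (CLT) approximation to the binomial.
    moe = z * se
    # Rule of thumb (an assumption; some texts use 10 instead of 5).
    normal_ok = min(successes, n - successes) >= 5
    return {"p_hat": p_hat, "sd": sd, "se": se, "moe": moe, "normal_ok": normal_ok}

# Hypothetical sample: 142 of 200 random locations fall on water.
summary = proportion_summary(successes=142, n=200)
print(summary)
```

With these made-up numbers the SD is about 0.45 while the SE is only about 0.03, giving a margin of error of roughly 6 percentage points.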

Discussion of issues in the Assignment on measurement

Q1 and Q2 (measuring 'Readability'): answers need not be handed in; just think about the issues. If there is time, we might discuss and do some 'measuring' in class.

Q3, Q4, Q5, Q6, Q7, Q8: Answers to be handed in.

Q9, Q10: from last year; answers need not be handed in. If there's time, we will think about what the answers might have looked like.

Remarks: this topic of measurement is probably new for you, as it was for JH when he began in cancer clinical trials in 1973, and oncologists (cancer doctors) were judging responses of advanced cancer to chemotherapy by measuring tumours by 'palpation'.
Just because (random) measurement errors tend to cancel out in averages doesn't mean that errors in measurement can be ignored. For example, how comfortable would you be in measuring how much physical activity JH does by having him wear a 'step-counter' for a randomly selected week of the year, and using that 1-week measurement as an 'x' in a multiple or logistic or Cox regression? See slides 7 and 8 from part of JH's "Scientific reasoning, statistical thinking, measurement issues, and use of graphics: examples from research on children" at Royal Children's Hospital in Melbourne, earlier this year (pdf).

Some of the terminology will be new to you, and so (as you will discover in your simulations of how well you can estimate the conversion factors between degrees F and degrees C) will some of the consequences of measurement error. The "animation (in R) of effects of errors in X on slope of Y on X" might be of interest, as might the Java applet accompanying "Random measurement error and regression dilution bias". These consequences are rarely touched on, let alone emphasized, in theoretical courses on regression, where all 'x' values are assumed to be measured without error! Welcome to the REAL world.
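A minimal sketch of such a simulation follows; it is not the assignment's own setup. Temperatures in degrees C are converted exactly to degrees F (F = 32 + 1.8 C), random error is then added to the recorded C values, and F is regressed on the error-laden C. The sample size, the spread of true temperatures (SD 10) and the size of the measurement error (SD 5) are all assumed values chosen for illustration.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(2013)    # arbitrary seed, for reproducibility only
n = 5000                     # assumed sample size
sd_true, sd_err = 10.0, 5.0  # assumed SDs of true temperatures and of the errors in X

true_c = [rng.gauss(15.0, sd_true) for _ in range(n)]
true_f = [32.0 + 1.8 * c for c in true_c]             # exact conversion, no error in Y
obs_c = [c + rng.gauss(0.0, sd_err) for c in true_c]  # C recorded with random error

slope_clean = ols_slope(true_c, true_f)  # recovers the conversion factor 1.8
slope_noisy = ols_slope(obs_c, true_f)   # attenuated ('diluted') towards zero

# Theory: the slope is multiplied by var(true X) / (var(true X) + var(error)).
expected_noisy = 1.8 * sd_true**2 / (sd_true**2 + sd_err**2)
print(slope_clean, slope_noisy, expected_noisy)
```

Under these assumptions the attenuation factor is 100 / 125 = 0.8, so the fitted slope should land near 1.44 rather than 1.8, even though the errors in X average out to zero.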

For this exercise, and the topics it addresses, the most relevant portions of the 'surveys' resources are 'Measurement: Reliability and Validity' and 'Effects of Measurement Error'.