JULY 19, 2023

I HAVE MOVED THE CONTENTS OF THIS SITE TO A NEW DOMAIN-NAME.

I WON'T BE UPDATING THIS OLD ONE ANY MORE.
I AM LEAVING THE PAGES 'AS IS' FOR NOW, BUT
I WILL BE GRADUALLY REMOVING THE LARGE FILES.

SO BEST TO VISIT THE MATERIAL AT ITS NEW DOMAIN-NAME:

                http://jhanley.biostat.mcgill.ca/software

J Hanley

Software, etc.



case-base

R package [2017] to create a 'case-base' dataset that allows

Fitting of Smooth-in-Time Prognostic Risk Functions via Logistic Regression

(described by Hanley JA and Miettinen OS in The International Journal of Biostatistics, Vol 5, Issue 1, 2009,

and in these two presentations: U.Guelph and ISCB).

This zip file
includes

-- (older, 2009) R code and the dataset analyzed in the 2nd edition of the textbook by Collett;

-- SHEP dataset, and simpler R code that uses the 2017 R package, to reproduce the results given in the Hanley-Miettinen article;

-- article "Profile-specific survival estimates: Making reports of clinical trials more patient-relevant" (Julien and Hanley, Clinical Trials 2008; 5: 107-115) that uses SAS, R and Stata to produce 'not-smooth-in-time' hazard functions and profile-specific prognostic probabilities.



Excel spreadsheet and R code to animate a 'statistical hammock' illustrating the effect of collinearity (correlation) on fitted regression coefficients.
 
LINKS:   Excel spreadsheet     R code



R code to extract the underlying data from Kaplan-Meier and Nelson-Aalen curves, along with some worked examples.
 
LINK



Applet (Flash) to illustrate different fitting methods and different model assumptions for a very small dataset with 2 datapoints and 1 parameter.

One has 2 independent observations from the (no-intercept) model

          E[y|x] = mu_{y|x} = beta times x.

The y's might represent the total numbers of typographical errors on x randomly sampled pages of a large document, and the data might be y=2 errors in total in a sample of x=1 page, and y=8 errors in total in a separate sample of x=2 pages. The beta in the model represents the mean number of errors per page of the document.
Or the y's might represent the total weight of x randomly sampled pages of a document, and the data might be y=2 units of weight in total for a sample of x=1 page, and y=8 units for a separate sample of x=2 pages. The beta in the model represents the mean weight per page of the document.

We gave this `estimation of beta' problem, (x,y)=(1,2) & (2,8), to several statisticians and epidemiologists, and to several grade 6 students, and they gave us a variety of estimates, such as beta_hat = 3.6/page, 3.33/page, and 3.45/page! See why by clicking at various locations to try out various slopes:

    applet
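
Two of the estimates quoted above can be reproduced by hand. The sketch below is only an illustration, not the applet's code: it assumes a weighted least-squares fit through the origin with weights 1/x^power, where the function name wls_slope and the parameter power are made up for this example. With power=0 it is ordinary least squares; with power=1 it reduces to the ratio of totals, which is also the maximum likelihood estimate if the y's are Poisson with mean beta times x. The criterion behind the 3.45 estimate is not stated on this page, so it is not reproduced here.

```python
# The two observations: x = pages sampled, y = total for those pages.
xs = [1, 2]
ys = [2, 8]


def wls_slope(xs, ys, power):
    """Slope of a no-intercept line fitted by weighted least squares.

    Weights are 1 / x**power (an assumption made for this illustration):
    power=0 -> ordinary least squares, power=1 -> ratio of totals.
    """
    weights = [1.0 / x ** power for x in xs]
    numerator = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    denominator = sum(w * x * x for w, x in zip(weights, xs))
    return numerator / denominator


beta_ls = wls_slope(xs, ys, power=0)     # sum(x*y) / sum(x^2) = 18/5
beta_ratio = wls_slope(xs, ys, power=1)  # sum(y) / sum(x) = 10/3

print(f"least squares:   {beta_ls:.2f} per page")
print(f"ratio of totals: {beta_ratio:.2f} per page")
```

The point of the exercise is that each answer is 'best' under a different assumption about how the variance of y changes with x.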



Computer code to simulate datasets with measurement error and look at sampling distribution of parameter estimates

    temperatures measured with error   SAS     R

Animation (in R) of effects of errors in X on slope of Y on X

    R code for 'animation' package     Completed animation(.pdf)

Article in 'Research methods & reporting' series. J Hutcheon, A Chiolero and J Hanley. BMJ 2010;340:1402-1406, c2289, doi: 10.1136/bmj.c2289 (published 23 June 2010)

    Random measurement error and regression dilution bias (.pdf)
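
The flattening of the slope by random error in X can be seen in a few lines of simulation. This is a minimal sketch, not the SAS or R code linked above: the true slope, the standard deviations, the sample size and the seed are all arbitrary values chosen for illustration. Under these assumptions the slope of Y on the error-prone X should be close to the true slope multiplied by var(X) / (var(X) + var(error)).

```python
import random

random.seed(1)

# All values below are assumptions chosen for illustration only.
TRUE_SLOPE = 2.0   # slope of Y on the true X
SD_X = 1.0         # SD of the true X values
SD_ERR = 1.0       # SD of the random measurement error added to X
N = 20000          # number of simulated observations


def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs (with intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


x_true = [random.gauss(0.0, SD_X) for _ in range(N)]
y = [TRUE_SLOPE * x + random.gauss(0.0, 1.0) for x in x_true]
x_obs = [x + random.gauss(0.0, SD_ERR) for x in x_true]  # X measured with error

slope_true_x = ls_slope(x_true, y)
slope_obs_x = ls_slope(x_obs, y)
attenuation = SD_X ** 2 / (SD_X ** 2 + SD_ERR ** 2)

print(f"slope using true X:      {slope_true_x:.2f}")
print(f"slope using observed X:  {slope_obs_x:.2f}")
print(f"true slope x attenuation: {TRUE_SLOPE * attenuation:.2f}")
```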



Links between Poisson, Gamma and Chi-sq distributions -- Fisher1935.

One must travel 7500 Km by 4-wheel jeep, over very rough terrain, with no possibility of repairing a tire that becomes ruptured.

Suppose one starts with 14 intact tires (the 4, plus 10 spares).

On average, tires rupture at the rate of 1 per 5,000 tire-Kms (the mean interval between ruptures is 5,000 tire-Kms). Ruptures occur independently of tire position and of the distance already driven with the tire (i.e., the sources of failure are purely external). Ignore the possibility of multiple failures from a single source, e.g. a short bad section of the trail.

Suppose instead that one starts with 10 intact tires (the 4, plus 6 spares).

    R code     Java Applet
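
A minimal sketch of the calculation, assuming the question is the probability of completing the trip (the page does not state it explicitly). With 4 tires on the ground for 7,500 Km, the exposure is 30,000 tire-Kms, so the expected number of ruptures is 6. The trip succeeds if the number of ruptures does not exceed the number of spares. The same probability is the chance that the waiting 'time' to rupture number (spares + 1), a Gamma variable, exceeds 6 units of 5,000 tire-Kms; twice that Gamma variable is Chi-sq with 2(spares + 1) degrees of freedom. The function names are made up for this example, and this is not the R code linked above.

```python
import math


def poisson_cdf(k, mu):
    """P(N <= k) for N ~ Poisson(mu)."""
    return sum(math.exp(-mu) * mu ** j / math.factorial(j) for j in range(k + 1))


def gamma_cdf(shape, x, steps=2000):
    """P(T <= x) for T ~ Gamma(integer shape, rate 1), by Simpson's rule."""
    def pdf(t):
        return t ** (shape - 1) * math.exp(-t) / math.factorial(shape - 1)

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3


tire_kms = 4 * 7500             # exposure: 4 tires in use over 7,500 Km
expected = tire_kms / 5000      # expected number of ruptures = 6

p_10_spares = poisson_cdf(10, expected)  # 14 tires: up to 10 ruptures tolerated
p_6_spares = poisson_cdf(6, expected)    # 10 tires: up to 6 ruptures tolerated

# Same answer via the Gamma waiting time to the 11th rupture.
p_10_via_gamma = 1.0 - gamma_cdf(11, expected)

print(f"P(complete trip | 10 spares) = {p_10_spares:.4f}")
print(f"P(complete trip |  6 spares) = {p_6_spares:.4f}")
print(f"10 spares, via Gamma tail    = {p_10_via_gamma:.4f}")
```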



SAS code [including link to dataset] for bootstrap standard error of estimate of First Principal Component: Appendix to article "Creating non-parametric bootstrap samples using Poisson frequencies" in Computer Methods and Programs in Biomedicine. 2006 Jul;83(1):57-62; authors: James A. Hanley and Brenda MacGibbon.
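
The idea in the article's title can be sketched as follows. This is an illustration under stated assumptions, not the SAS code from the appendix: instead of drawing n observations with replacement, each observation is given an independent Poisson(mean 1) frequency and the statistic is recomputed using those frequencies as weights. To keep the sketch short the statistic here is a simple mean rather than a first principal component, and the data, the seed and the names poisson1 and weighted_mean are made up for this example.

```python
import math
import random
import statistics

random.seed(2006)


def poisson1():
    """Draw one Poisson(mean 1) count by Knuth's multiplication method."""
    limit = math.exp(-1.0)
    count, product = 0, random.random()
    while product > limit:
        count += 1
        product *= random.random()
    return count


def weighted_mean(values, freqs):
    """Mean of values, each counted freqs[i] times."""
    return sum(f * v for f, v in zip(freqs, values)) / sum(freqs)


# Simulated data, for illustration only.
data = [random.gauss(50.0, 10.0) for _ in range(200)]

B = 2000  # number of bootstrap replicates (arbitrary choice)
estimates = []
for _ in range(B):
    freqs = [poisson1() for _ in data]  # one Poisson frequency per observation
    if sum(freqs) == 0:                 # practically impossible, but guard anyway
        continue
    estimates.append(weighted_mean(data, freqs))

boot_se = statistics.stdev(estimates)
usual_se = statistics.stdev(data) / math.sqrt(len(data))

print(f"Poisson-frequency bootstrap SE of the mean: {boot_se:.3f}")
print(f"usual s / sqrt(n):                          {usual_se:.3f}")
```

Because the frequencies are generated independently for each observation, this form of resampling is easy to code in a single pass through the data.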



SAS implementation of the 'placement' or 'U-statistics' method described in Hanley JA and Hajian-Tilaki KO. Sampling Variability of Nonparametric Estimates of the Areas under Receiver Operating Characteristic Curves: An Update. Academic Radiology, 1997 4:49-58.     SAS Program


Article by Hanley and Hajian-Tilaki. Sampling Variability of Nonparametric Estimates of the Areas under Receiver Operating Characteristic Curves: An Update. Academic Radiology, 1997 4:49-58.     pdf
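
A minimal sketch of the 'placement' calculation, assuming the standard U-statistics form: each diseased case is scored by the fraction of non-diseased cases it outranks, each non-diseased case by the fraction of diseased cases that outrank it, and the variance of the area is built from the sample variances of those two sets of placements. This is not a transcription of the SAS program, the ratings are made up, and the function names are hypothetical.

```python
import math


def psi(x, y):
    """1 if the diseased rating x exceeds the non-diseased rating y, 0.5 if tied."""
    if x > y:
        return 1.0
    return 0.5 if x == y else 0.0


def auc_with_placements(diseased, nondiseased):
    """Nonparametric AUC and its estimated variance from placement values."""
    m, n = len(diseased), len(nondiseased)
    # Placement of each diseased case among the non-diseased, and vice versa.
    v10 = [sum(psi(x, y) for y in nondiseased) / n for x in diseased]
    v01 = [sum(psi(x, y) for x in diseased) / m for y in nondiseased]
    auc = sum(v10) / m  # equals sum(v01) / n
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    return auc, s10 / m + s01 / n


# Made-up ratings, for illustration only.
diseased = [3, 4, 5]
nondiseased = [1, 2, 3.5]

auc, var_auc = auc_with_placements(diseased, nondiseased)
print(f"AUC = {auc:.3f}, SE = {math.sqrt(var_auc):.3f}")
```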




Article by Hanley and McNeil "The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve." Radiology 1982: 143: 29-36.     pdf

Article by Hanley and McNeil "A Method of Comparing the Areas under ROC curves derived from same cases." Radiology 1983: 148: 839-843.     pdf

Appendix to Hanley and McNeil "A Method of Comparing the Areas under ROC curves derived from same cases." Radiology 1983: 148: 839-843.     pdf


Article by McNeil and Hanley "Statistical Approaches to the Analysis of ROC curves." Medical Decision Making 1984: 4(2): 136-149.     pdf



Updated: March 29, 2018