This technical report summarizes work done to develop asymptotic tests for location, variance, and equivalence of gene expression between two groups of microarrays in order to find an alternative to permutation tests based on the pairwise distance matrix. Given additive errors, the location test is not asymptotically normal without further assumptions. The variance test and equivalence test are asymptotically normal.
R code for implementing survival analysis of longitudinally collected gene expression data. Methods are described in the linked paper by Rajicic N, Finkelstein DM, and Schoenfeld DA, submitted to Bioinformatics, May 2006.
This poster, prepared for the MGH Clinical Research Day 2005, discusses some important considerations regarding the preparation of deidentified datasets. It is a requirement of most NIH-sponsored trials that data be provided for distribution to qualified investigators. Studies involving human subjects face the challenge of preparing a dataset that is both useful and protects patient confidentiality. This is a new area, with few established guidelines; the poster looks at some specific issues that have arisen in the context of dataset preparations done for ARDS Network studies.
This paper proposes an approach to adjust analyses of family studies for complex ascertainment schemes in which the sampling depends on the disease history of the entire family. This approach extends that of Tosteson et al. (1991) to handle these types of sampling schemes.
We conducted an exploratory analysis of co-aggregation of cancers in individuals and families, using sibships from over 18,000 families who had been recruited to the registry of the NCI-sponsored multi-institutional Cancer Genetics Network. We found statistically significant familial co-aggregation of lung cancer with pancreatic (p<0.0001), prostate (p<0.001), and colorectal cancers (p=0.003). In addition, we found significant familial co-aggregation of pancreatic and colorectal cancers (p=0.022), and of hematopoietic and (non-ovarian) gynecologic cancers (p=0.01).
micaParalize is a set of functions that allow a user to easily run normal Matlab functions on a multiprocessor machine by dividing the workload among the several processors. The program is used for simulations, bootstraps, and function maximization problems in biostatistics.
micaParalize has been superseded by biopara, which is on the software page.
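The idea of dividing a bootstrap among several processors can be illustrated with a minimal sketch. It is written in Python rather than Matlab and does not reproduce the interface of micaParalize or biopara. The names split_work, bootstrap_chunk, and parallel_bootstrap, and the sample data, are hypothetical.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor


def split_work(n_tasks, n_workers):
    """Split n_tasks into n_workers chunk sizes that differ by at most one."""
    base, extra = divmod(n_tasks, n_workers)
    return [base + (1 if i < extra else 0) for i in range(n_workers)]


def bootstrap_chunk(args):
    """Run one worker's share of bootstrap replicates of the sample mean."""
    data, n_reps, seed = args
    rng = random.Random(seed)  # separate, reproducible stream per chunk
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_reps)]


def parallel_bootstrap(data, n_boot, n_workers, seed=0,
                       executor_cls=ProcessPoolExecutor):
    """Distribute n_boot bootstrap replicates over n_workers workers."""
    sizes = split_work(n_boot, n_workers)
    jobs = [(data, size, seed + i) for i, size in enumerate(sizes)]
    with executor_cls(max_workers=n_workers) as pool:
        chunks = pool.map(bootstrap_chunk, jobs)
        # Flatten the per-worker lists into one list of replicates.
        return [m for chunk in chunks for m in chunk]


if __name__ == "__main__":
    # Hypothetical sample, for illustration only.
    sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
    means = sorted(parallel_bootstrap(sample, n_boot=2000, n_workers=4))
    # Approximate percentile interval from the sorted replicates.
    print("95% bootstrap interval for the mean:", means[49], means[1949])
```

Each worker receives its own seed so that the replicates are reproducible and the chunks do not repeat one another. The same split-and-collect pattern applies to simulations, where each chunk would run a share of the simulated datasets.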
1s_logrank.xls computes the one-sample log-rank test and confidence intervals for the standardized mortality ratio (SMR), estimates survivorship in the matched standard population, and visually compares survivorship of the sample with that of the standard population, as described in the paper and instructions (both included in the zip file). The paper was published as a commentary in the Journal of the National Cancer Institute, Vol. 95, No. 19, Oct 1 2003, pp. 1434–1439.
"Analysis of Failure Time Data From Screening Studies With Missing Observations" was delivered by Dianne Finkelstein, Ph.D., at the 2003 Joint Statistical Meetings in San Francisco.
depcen.exe is a program for estimating survival probabilities and probabilities of attending visits, as described in the paper "Analysis of Failure Time Data with Dependent Interval Censoring" (Finkelstein D.M., Goggins W.B., and Schoenfeld D.A., Biometrics 2002 58:298-304). The program was implemented in Matlab and runs as a batch job from a DOS command prompt. The time-to-blood-shedding data from the paper are also included. "interval_censr_data.zip" contains the data in .dat format and the .sas file required for setup. When using these data, please reference the article cited above.
Gen.m is an m-file (Matlab/Octave) for computing sequential boundaries, as described in the paper "A Simple Algorithm for Designing Group Sequential Clinical Trials" (Schoenfeld, Biometrics 57, 972-974; September 2001). If you have Matlab, download gen.m. gen.m can also be run under Octave, a free m-file interpreter, which can be downloaded from the URL below. In addition, sequential.zip contains a compiled version of gen.m that runs on the command line.