Cox Regression Modeling and Analysis of Decompression Sickness Data
DECOMPRESSION SICKNESS (DCS) DATA acquired experimentally in hypobaric chambers by the scientists
at Johnson Space Center consist of a fairly large number of records and have been
statistically analyzed by the JSC Environmental Physiology Group. Among other approaches,
the survival analysis technique has been used for modeling these data. It shows that the
DCS time, when transformed into logarithmic scale, appears to follow a normal or a
logistic model. The model-fits have been evaluated in terms of their likelihood values
resulting from the use of DCS data, as well as certain other statistical criteria [1].
In the last two years, UHCL researchers have analyzed a subset of these data, consisting of 1322 observations of which 1154 are censored. An observation is termed censored if a test is terminated before the DCS symptom would occur. The model-fits were further evaluated for the lognormal and log-logistic distributions using the most recently developed chi-square goodness-of-fit tests.
INSIDE THE CAPSULE: UH PI Raj S. Chhikara (right) enters a simulator with NASA post-doctoral fellow Kallappa Koti (left) in their studies on decompression sickness (DCS). Data are acquired experimentally in hypobaric chambers. A large number of records have been statistically analyzed by the JSC Environmental Physiology Group.
These results were reported in the 1996-97 ISSO Annual Report. Although the two distributions fit the data better than other statistical distributions, the model-fits were still inadequate. A major reason for the poor fits was that the variability in DCS time responses caused by variation in the environmental physiology was not accounted for. This analysis was then followed by linearly regressing the DCS time, expressed in log scale, on the possible covariates: exercise, both the pressure and the time at the test altitude, and nitrogen pressure (N2) in the 360-minute half-time compartment. The linear regression modeling improved the model-fits for the two distributions; nevertheless, the model inadequacy was still statistically significant.
An alternative modeling approach uses the Cox regression model, which is based on the concept of proportional hazards. A preliminary analysis showed that a proportional hazards model led to a better data fit than previously achieved. This application of the Cox regression model to the DCS data, however, requires analysis of residuals and development of an appropriate goodness-of-fit test.
In this report, we discuss the Cox regression modeling and analysis of DCS data. Residual analyses are made and model-fits are further evaluated.
Cox's Regression Model-Fits
Data modeled and analyzed here contained 1322 observations. An initial analysis identified
observation number 236 as an outlier and highly influential; thus it was deleted.
Accordingly, our data analysis is based on 1321 observations of which 1154 are
right-censored. Besides the DCS times, data are also available on a number of
physiological and related variables. Some of the variables expected to affect DCS are the
following:
Although data on some other variables also exist, these are the only covariates on which measurements are available for all test subjects. The main objective is to estimate the risk of incidence of DCS and to model its occurrence time. One also needs to assess the effect of a covariate on the DCS onset time.
Let z be a row vector of k measured covariates and b be a column vector of k regression parameters. Let T denote the response variable, DCSTIME. The Cox regression model is specified by the hazard function
$$\lambda(t; z) = \lambda_0(t)\, e^{zb},$$
where $\lambda_0$ is an arbitrary and unspecified baseline hazard function. The problem is to estimate b and $\lambda_0$. We use the SAS PHREG procedure [2] to estimate these parameters.
The survival function is given by
$$S(t; z) = \exp\left(-e^{zb} \int_0^t \lambda_0(u)\, du\right).$$
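As a rough illustration of why the baseline hazard can be left unspecified when estimating b, the following Python sketch evaluates the Cox partial log-likelihood, in which the baseline hazard cancels out of every term. It assumes no tied event times; the function name and the toy data are hypothetical and are not taken from the DCS data set. The sketch only evaluates the criterion at given values of b and does not attempt the numerical maximization that a procedure such as PHREG performs.

```python
import math

def cox_partial_loglik(times, events, covariates, b):
    """Cox partial log-likelihood, assuming no tied event times.

    times      : observed times (event or censoring)
    events     : 1 if the event was observed, 0 if right-censored
    covariates : list of covariate rows z_i
    b          : list of regression parameters
    """
    # Linear predictor z_i b for every subject.
    eta = [sum(z_j * b_j for z_j, b_j in zip(z, b)) for z in covariates]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored subjects enter only through the risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += eta[i] - math.log(risk_sum)
    return loglik

# Hypothetical toy data: four subjects, one binary covariate.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 0, 1, 0]
toy_covariates = [[1.0], [0.0], [1.0], [0.0]]

loglik_at_zero = cox_partial_loglik(toy_times, toy_events, toy_covariates, [0.0])
loglik_at_one = cox_partial_loglik(toy_times, toy_events, toy_covariates, [1.0])
print(loglik_at_zero, loglik_at_one)
```

In this toy example both events occur in subjects with covariate value 1, so the partial log-likelihood is larger at b = 1 than at b = 0.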
See Lawless [3] for further details about proportional hazards models. The fitted Cox regression model includes P2, PN2360, and EXERCISE as covariates. Results are given in Table 1.
Table 1.
| Variable | DF | Parameter Estimate | Standard Error | Wald Chi-Square | p-Value | Risk Ratio |
|---|---|---|---|---|---|---|
| P2 | 1 | -1.2784 | 0.09258 | 190.67567 | 0.0001 | 0.278 |
| PN2360 | 1 | 0.660331 | 0.05483 | 145.06204 | 0.0001 | 1.935 |
| EXERCISE | 1 | 0.93088 | 0.23760 | 15.323420 | 0.0001 | 2.535 |
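The Wald chi-square and risk ratio columns are functions of the estimate and its standard error: the Wald statistic is the squared ratio of the estimate to its standard error, and the risk ratio is the exponential of the estimate. A minimal Python sketch, using the P2 row of Table 1 as input (the function name is an assumption made for illustration):

```python
import math

def wald_and_risk_ratio(estimate, std_error):
    """Return (Wald chi-square, risk ratio) for one Cox regression coefficient."""
    wald = (estimate / std_error) ** 2   # 1-df Wald chi-square statistic
    risk_ratio = math.exp(estimate)      # hazard multiplier per unit increase
    return wald, risk_ratio

# P2 row of Table 1: parameter estimate and standard error.
wald_p2, risk_ratio_p2 = wald_and_risk_ratio(-1.2784, 0.09258)
print(round(wald_p2, 2), round(risk_ratio_p2, 3))
```

A risk ratio below 1, as for P2, means that the estimated hazard of DCS decreases as the covariate increases, with the other covariates held fixed.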
The null hypothesis that all regression coefficients are zero is rejected. However, Cox regression models with (i) P2 and EXERCISE as the only two covariates, (ii) PN2360 and EXERCISE as the only two covariates, or (iii) EXERCISE as the only covariate are not found to be significant.
Residual Analysis and Diagnostics
Because of its flexibility and interpretability, the Cox's regression model is often used
to analyze failure time data. In the presence of a high percentage of censored
observations, as presently is the case, the underlying proportional hazards assumption may
be violated. Consequently, one needs to conduct a careful examination of residuals. The
martingale residual M(t) is the difference, over [0,t], between the
observed number of events and the expected number obtained under the model, and can be interpreted as excess
failures. Martingale residuals are not symmetrically distributed, even when the fitted
model is correct. The deviance residual is a normalized transform of the martingale
residual. These residuals are much more symmetrically distributed about zero. Observations
with large deviance residuals are poorly predicted by the model. The Schoenfeld residual
is defined as the covariate value for the process that failed minus its expected value.
Since the Schoenfeld residuals are, in principle, independent of time, a plot that shows a
non-random pattern against time is evidence that the proportional hazards assumption is violated.
The Schoenfeld residuals in Figs. 1(a) and 1(b) are
fairly randomly scattered. The graph for the EXERCISE residuals in Fig. 1(c) is not very informative, which is typical of
graphs for dichotomous covariates. We conclude from the residual analysis that the
proportional hazards assumption holds.
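The definition of the Schoenfeld residual given above can be sketched in Python for a single covariate. At each event time, the expected covariate value is the average over the subjects still at risk, weighted by exp(zb). The sketch assumes no tied event times; the function name and toy data are hypothetical.

```python
import math

def schoenfeld_residuals(times, events, z, b):
    """Schoenfeld residuals for one covariate, assuming no tied event times.

    Returns a list of (event time, residual) pairs, one per uncensored subject.
    """
    weights = [math.exp(b * z_j) for z_j in z]
    residuals = []
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # residuals are defined only at observed event times
        at_risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        total_weight = sum(weights[j] for j in at_risk)
        expected_z = sum(weights[j] * z[j] for j in at_risk) / total_weight
        residuals.append((t_i, z[i] - expected_z))
    return residuals

# Hypothetical toy data with a dichotomous covariate.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 0, 1, 0]
toy_z = [1.0, 0.0, 1.0, 0.0]
toy_residuals = schoenfeld_residuals(toy_times, toy_events, toy_z, b=0.0)
print(toy_residuals)
```

For a dichotomous covariate such as EXERCISE, each residual equals either 1 or 0 minus a weighted proportion, so the plotted points fall into two bands, which is one reason such plots are hard to read.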
The martingale residuals shown in Fig. 2 are slightly skewed. This might be attributed to the single failure outcome feature of our Cox model. The estimated mode, median, and mean of the martingale residuals are -0.217, -0.0426, and 0, respectively, and the estimated measure of skewness is 0.793. The points (1.8, -1.9) and (3, -2.3) in Fig. 2 seem to be extremely isolated. Otherwise, we see no indication of a lack of fit of the model to individual observations.
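For a single-failure outcome, the martingale residual of a subject is the event indicator minus the estimated cumulative hazard at that subject's observed time. The Python sketch below assumes the Breslow estimator of the cumulative baseline hazard, which is a choice made here for illustration; the source does not state which estimator was used. The function name, the linear predictor values, and the toy data are hypothetical.

```python
import math

def martingale_residuals(times, events, eta):
    """Martingale residuals using the Breslow cumulative baseline hazard.

    eta[i] is the linear predictor z_i b for subject i.
    """
    n = len(times)
    risk_scores = [math.exp(e) for e in eta]
    residuals = []
    for i in range(n):
        # Breslow estimate of the cumulative baseline hazard at times[i].
        cum_baseline = 0.0
        for k in range(n):
            if events[k] == 1 and times[k] <= times[i]:
                at_risk = sum(risk_scores[j] for j in range(n) if times[j] >= times[k])
                cum_baseline += 1.0 / at_risk
        # Observed events minus expected events under the model.
        residuals.append(events[i] - cum_baseline * risk_scores[i])
    return residuals

toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 0, 1, 0]
toy_eta = [0.5, 0.0, 0.5, 0.0]  # hypothetical linear predictors
toy_martingale = martingale_residuals(toy_times, toy_events, toy_eta)
print(toy_martingale)
```

With this estimator the residuals sum to zero, in line with the mean of 0 reported above. Each residual is bounded above by 1 but has no lower bound, which is why the distribution is not symmetric.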
Deviance residuals are plotted in Fig. 3. Clearly, there is a separation between two groups of observations: the elongated cluster of points in the lower portion of the graph consists of the censored observations, while the points in the upper portion of the graph are the uncensored observations. Three of the residuals exceed 3, which is large enough to warrant concern. Recent experience, however, has shown that deviance residuals do not work well in evaluating model-fits.
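The deviance residual d can be computed from the martingale residual M and the event indicator δ with the standard transform d = sign(M) sqrt(-2 [M + δ log(δ - M)]), where the logarithmic term is taken as zero for censored observations. A minimal Python sketch (the function name and input values are hypothetical):

```python
import math

def deviance_residual(martingale, event):
    """Deviance residual from a martingale residual and an event indicator (0 or 1)."""
    if event == 1:
        # log(1 - M), computed with log1p for accuracy when M is small.
        inner = martingale + math.log1p(-martingale)
    else:
        inner = martingale  # the log term vanishes for censored observations
    # inner is non-positive in exact arithmetic; guard against rounding error.
    magnitude = math.sqrt(max(0.0, -2.0 * inner))
    return math.copysign(magnitude, martingale)

d_censored = deviance_residual(-0.5, 0)
d_event = deviance_residual(0.5, 1)
print(d_censored, d_event)
```

Because the martingale residual of a censored observation is never positive, its deviance residual is never positive either, which is consistent with the censored observations forming the lower cluster in the plot.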
Estimated Survival Function
The mean values of P2 and PN2360 are 5.31 and 8.961, respectively, for that part of the
data for which DCSTIME is less than 5.5. The estimated survivor function S(t;z)
based on these inputs is plotted in Figs. 4(a) and 4(b)
for EXERCISE equal to 0 and 1, respectively.
With P2 and PN2360 held fixed, subjects who do not exercise are more likely to survive longer than those who exercise. Among the subjects who exercise, those at "high" P2 and PN2360 are more likely to survive longer than those at "low" P2 and PN2360. This is also true for those who do not exercise.
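Under a proportional hazards model, two subjects who differ only in EXERCISE have survival functions related by S(t; EXERCISE = 1) = S(t; EXERCISE = 0) raised to the power exp(b), where b is the EXERCISE coefficient. The Python sketch below applies this relation with the EXERCISE estimate from Table 1. The survival value of 0.90 for a non-exercising subject is hypothetical and chosen only for illustration, and the function name is an assumption.

```python
import math

B_EXERCISE = 0.93088  # EXERCISE parameter estimate from Table 1

def survival_with_exercise(survival_without_exercise, b_exercise=B_EXERCISE):
    """Survival probability for EXERCISE = 1, given the value for EXERCISE = 0
    at the same time point and the same P2 and PN2360."""
    hazard_ratio = math.exp(b_exercise)
    return survival_without_exercise ** hazard_ratio

s_rest = 0.90  # hypothetical survival probability without exercise
s_exercise = survival_with_exercise(s_rest)
print(round(s_exercise, 3))
```

Since the hazard ratio exceeds 1 and survival probabilities lie between 0 and 1, the exercising subject always has the lower survival probability at any given time, which matches the ordering of the two estimated curves described above.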
The model-fit and residual analyses discussed above indicate that Cox's proportional hazards model is satisfactory for analyzing these data.
CHAMBER SIMULATOR: Michael R. Powell stands at the controls of the NASA decompression testing chamber where the investigative team of Raj S. Chhikara (UHCL) and UHCL post-doctoral fellow Kallappa Koti (background) conduct regression modeling and analysis of decompression sickness data.
Footnotes
1. J. Conkin, K. V. Kumar, M. R. Powell, P. P. Foster, and J. M. Waligora.
"A Probabilistic Model of Hypobaric Decompression Sickness Based on 66 Chamber
Tests." Aviation, Space, and Environmental Medicine 67 (1996): 176-83.
2. SAS Institute. SAS/STAT Software: Changes and Enhancements (Release
6.11). Cary, NC, 1996.
3. J. F. Lawless. Statistical Models and Methods for Lifetime Data. New York:
John Wiley & Sons, 1982.
References
Chhikara, R. S., F. M. Spears, and K. M. Koti. "Statistical Modeling of Altitude
Decompression Sickness Onset Time," Technical Report, ISSO, University of
Houston (1997).
Collett, D. Modeling Survival Data in Medical Research. London: Chapman & Hall,
1994.
SAS Institute. SAS/STAT User's Guide: Vol. 2, GLM-VARCOMP (Version 6), Cary, NC.,
1990.
Investigative Team
UHCL PI: Raj S. Chhikara, Ph.D., Professor, Statistics
UHCL Co-PI: Floyd M. Spears, Ph.D., Associate Professor, Statistics
JSC PI: Michael R. Powell, Space Biomedical Research Institute
UHCL Post-Doctoral Fellow: Kallappa Koti, Ph.D., Statistics
ISSO -- Institute for Space Systems Operations
1997-1998 Annual Report