Dose Selection/Optimization
Published:
Some thoughts about dose selection for toxicology studies.
Goal
Most of my work as a statistician at Corteva Agriscience involves analyzing data from toxicity experiments. Chemical safety assessment encompasses a qualitative description of toxic properties as well as a quantification of exposure and toxic response. The chemicals under study include, but are not limited to, the active ingredients of fungicides and pesticides. I have worked on two main scientific questions related to these toxicants: 1) at what rate does a chemical/toxicant enter the body, and what happens to it once it is there, which is essentially a toxicokinetics problem; and 2) what is the potential for biological, chemical or physical stressors to affect ecosystems, which is essentially an ecotoxicology problem. More specifically, addressing the first question requires statistical analysis of toxicant concentration over time and across dose levels, and addressing the second requires statistical analysis of various ecological endpoints. From a statistical perspective, however, both questions come down to analyzing the dose-response curve and finding the optimal dose (for either safety assessment or effectiveness).
For toxicokinetics, I found this nice webpage explaining it. For ecotoxicology, consider the example of a pesticide that potentially affects birds. That is, we need to know whether and how the toxicant affects the birds’ general health and, more specifically, their reproductive system. We are interested in endpoints such as mortality, weight, eggs laid, eggs hatched, hatchling mortality, hatchling weight, hen weight gain/loss, etc.
Study Design
Toxicity experiments must follow regulatory guidelines. Still, how to use a limited number of experimental subjects in the most efficient way is a very important question. Moreover, when randomized experiments are not possible, for example for questions involving rare or endangered species, how can we conduct an efficient statistical analysis? As I have not conducted any independent research on questions arising from design, I will not discuss experimental design in this blog. For observational studies, I believe causal inference will be a very powerful tool for providing insight.
Dose-response curves
The dose-response relationship is typically non-linear, which causes some trouble. In non-linear modeling, the first obstacle is how to estimate the model parameters. Many numerical algorithms are now available, but for a non-linear model convergence is not guaranteed and the final estimates may be sensitive to the initial values. This is usually not a big concern in practice, as we can visually assess the raw data to pick reasonable starting values and/or check the results against multiple starting values. A trickier question is which non-linear model to use and how many parameters it needs. In practice we do not know the truth, and in my simulation studies I found that the correct model does not necessarily outperform the others. A key goal of modeling dose-response curves is to approximate and visualize the true relationship, which means that as long as the approximation error stays within a reasonable bound, the analysis is in good shape. As for the number of parameters, my opinion is to try a simple model first and then compare it with more complex ones. In other words, in practice we are looking for models that explain the data well and give us valuable information back. It is important to recognize this and to integrate the information provided by multiple models.
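To make the point about starting values concrete, here is a minimal sketch of a multi-start fit. Everything in it is an assumption for illustration: the three-parameter log-logistic curve, the simple derivative-free compass search (standing in for whatever optimizer one would really use), the function names, and the data, which are made up. It is not the procedure used in any actual study.

```python
import math

def log_logistic(dose, top, slope, log_ed50):
    """Three-parameter log-logistic curve falling from `top` toward 0."""
    z = slope * (math.log(dose) - log_ed50)
    z = max(min(z, 700.0), -700.0)  # keep exp() from overflowing
    return top / (1.0 + math.exp(z))

def rss(params, doses, responses):
    """Residual sum of squares of the curve evaluated at `params`."""
    return sum((y - log_logistic(x, *params)) ** 2
               for x, y in zip(doses, responses))

def compass_search(objective, start, step=1.0, tol=1e-6, max_iter=5000):
    """Derivative-free minimizer: try +/- step on each coordinate and
    halve the step whenever no coordinate move improves the objective."""
    point, best = list(start), objective(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                candidate = point[:]
                candidate[i] += delta
                value = objective(candidate)
                if value < best:
                    point, best, improved = candidate, value, True
        if not improved:
            step /= 2.0
    return point, best

def multi_start_fit(doses, responses, starts):
    """Fit from every starting value; return the best fit and all fits."""
    objective = lambda p: rss(p, doses, responses)
    fits = [compass_search(objective, s) for s in starts]
    return min(fits, key=lambda fit: fit[1]), fits

# Hypothetical data: positive doses and a response that declines with dose.
doses = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
responses = [99.0, 98.5, 92.0, 67.0, 27.0, 6.5, 1.0]

# Starting values (top, slope, log ED50), e.g. read off a plot of the raw data.
starts = [(100.0, 1.0, 0.0), (90.0, 2.0, 2.0), (50.0, 5.0, 4.0)]

(best_params, best_rss), all_fits = multi_start_fit(doses, responses, starts)
top, slope, log_ed50 = best_params
print(f"best fit: top={top:.2f}, slope={slope:.2f}, ED50={math.exp(log_ed50):.2f}")
for start, (_, fit_rss) in zip(starts, all_fits):
    print(f"start={start} -> RSS={fit_rss:.3f}")
```

If the fits from different starting values end at clearly different residual sums of squares, that is a sign the estimates are sensitive to initialization and the raw data deserve another look.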
Caveat: extrapolation
Another question I have come across in my work is extrapolation. For example, a certain response may change very little across the tested doses. In this case, it is challenging to identify the dose level that corresponds to a given percent change in the response, because that dose may lie outside the tested range. Nevertheless, it is worth fitting some non-linear models first to get a feel for the data. Besides statistical modeling, summary statistics are both intuitive and informative. It is always a good strategy to look at the summary statistics before the analysis; many times they let us anticipate extrapolation, so that we can be more careful when modeling the non-linearity.
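As a small illustration of how summary statistics can flag extrapolation, the sketch below compares a target effect size with the largest change seen in the dose-group means. The helper names, the choice of percent change relative to control as the yardstick, and the egg counts are all hypothetical.

```python
from statistics import mean, stdev

def dose_group_summary(data):
    """Return {dose: (n, mean, sd)} for a mapping dose -> list of responses."""
    return {dose: (len(values), mean(values), stdev(values))
            for dose, values in sorted(data.items())}

def max_percent_change(summary, control_dose=0.0):
    """Largest absolute percent change in group mean relative to control."""
    control_mean = summary[control_dose][1]
    return max(abs(m - control_mean) / abs(control_mean) * 100.0
               for dose, (_, m, _) in summary.items()
               if dose != control_dose)

def needs_extrapolation(summary, target_percent, control_dose=0.0):
    """True when the target effect exceeds anything seen in the group means."""
    return target_percent > max_percent_change(summary, control_dose)

# Hypothetical data: eggs laid per hen at four dose levels (0 = control).
eggs_laid = {
    0.0: [52, 49, 55, 51],
    10.0: [50, 53, 48, 51],
    30.0: [49, 47, 52, 50],
    100.0: [46, 48, 44, 47],
}

summary = dose_group_summary(eggs_laid)
for dose, (n, m, sd) in summary.items():
    print(f"dose={dose:6.1f}  n={n}  mean={m:.2f}  sd={sd:.2f}")

print(f"largest observed change: {max_percent_change(summary):.1f}%")
print("50% effect needs extrapolation:", needs_extrapolation(summary, 50.0))
```

Here the group means never move far from control, so a dose corresponding to a 50% change could only come from extending a fitted curve beyond the tested range, and any such estimate should be read with caution.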
Model Averaging
As mentioned in the discussion of dose-response curves, there are different candidate models for capturing the dose-response relationship. One go-to method is to use AIC- or BIC-type criteria to pick a single winning model. However, again, the winning model may not be the correct model, and the “losing” models may still carry valuable information about the data. It is not worth throwing away all the other models, especially when some of them fit the data well. An integrative alternative is to leverage all the models together to learn the underlying dose-response curve. Model averaging has several benefits. First, it can reduce uncertainty (though it may introduce bias). Second, given that no single model is the correct one, model averaging gives more robust curves: when the weights are non-negative and sum to one, the approximation error of the averaged curve is no larger than that of the worst candidate.
Of course, there are some limitations as well. It is not known how to determine the correct weight for each model, nor how to choose the set of candidate models most efficiently. In addition, asymptotic inference for model-averaged estimates is challenging.
My current remedies for these limitations are to select good candidate models manually for the averaging and to use bootstrapping for inference. These remedies have worked so far but lack theoretical justification.
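One common (but by no means the only) way to turn AIC values into averaging weights is Akaike weights, where each model gets a weight proportional to exp(-ΔAIC/2). The sketch below assumes that weighting scheme purely for illustration; the AIC values and per-model predictions are made up.

```python
import math

def akaike_weights(aic_values):
    """Weights proportional to exp(-0.5 * (AIC - min AIC)), summing to one."""
    best = min(aic_values)
    raw = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(raw)
    return [r / total for r in raw]

def model_average(predictions, weights):
    """Weighted average of per-model predictions made at the same doses."""
    n_points = len(predictions[0])
    return [sum(w * pred[i] for w, pred in zip(weights, predictions))
            for i in range(n_points)]

# Hypothetical AIC values for three candidate dose-response models.
aic_values = [100.0, 102.0, 110.0]

# Hypothetical predicted responses from each model at the same four doses.
predictions = [
    [98.0, 80.0, 45.0, 12.0],
    [97.0, 84.0, 50.0, 15.0],
    [99.0, 75.0, 40.0, 20.0],
]

weights = akaike_weights(aic_values)
averaged = model_average(predictions, weights)

for aic, w in zip(aic_values, weights):
    print(f"AIC={aic:.1f}  weight={w:.3f}")
print("model-averaged curve:", [round(v, 2) for v in averaged])
```

Because the weights are non-negative and sum to one, each averaged prediction lies between the smallest and largest candidate prediction at that dose.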
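A bootstrap of this kind can be sketched as follows. This is a generic percentile bootstrap that resamples within dose groups and accepts any estimator as a function; the simple percent-change estimator stands in for a model-averaged estimate to keep the example short. The function names and data are hypothetical.

```python
import math
import random
from statistics import mean

def resample_within_groups(data, rng):
    """Draw a bootstrap sample with replacement inside each dose group."""
    return {dose: [rng.choice(values) for _ in values]
            for dose, values in data.items()}

def percentile_interval(data, estimator, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for `estimator` applied to `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = sorted(estimator(resample_within_groups(data, rng))
                        for _ in range(n_boot))
    alpha = (1.0 - level) / 2.0
    lower = replicates[math.floor(alpha * (n_boot - 1))]
    upper = replicates[math.ceil((1.0 - alpha) * (n_boot - 1))]
    return lower, upper

def percent_change_at_top_dose(data):
    """Placeholder estimator: percent change of the top-dose mean vs control.
    In practice this would be the model-averaged quantity of interest."""
    control = mean(data[0.0])
    top = mean(data[max(data)])
    return (top - control) / control * 100.0

# Hypothetical data: eggs laid per hen at four dose levels (0 = control).
eggs_laid = {
    0.0: [52, 49, 55, 51],
    10.0: [50, 53, 48, 51],
    30.0: [49, 47, 52, 50],
    100.0: [46, 48, 44, 47],
}

point = percent_change_at_top_dose(eggs_laid)
lower, upper = percentile_interval(eggs_laid, percent_change_at_top_dose)
print(f"estimate={point:.1f}%, 95% bootstrap interval=({lower:.1f}%, {upper:.1f}%)")
```

Passing the estimator as a function means the whole pipeline, including model fitting and weighting, can be rerun on each bootstrap sample, so the interval reflects the uncertainty from those steps as well.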
Covariate Adjustment?
- $Y=f(U)+X\beta$?
- $Y=f(U,X)$?
- $Y=f(U, X, U\times X)$?
Dose (written $U$ above, with $X$ denoting the covariates) is treated as a fixed factor in an ANOVA-type linear model and as a continuous variable in a non-linear regression model, so whether and how to adjust for covariates is well worth discussing.
Acknowledgement
I greatly appreciate the opportunity to work as a statistician at Corteva Agriscience and the superior mentorship of my manager, Xiaoyi Sopko. I would not have learned any of this without her. I also want to thank Dr. John Green for his help throughout my research, especially his work “Statistical Analysis of Ecotoxicity Studies”, and my colleagues at Corteva.