In oncology research, and in biostatistics in general, data lies at the center of the work. Data is the information gathered to answer a specific research question. The data one collects typically describes the subject, the intervention, or the outcome and is either quantitative or qualitative in nature. One often summarizes the data from a population or sample with a measure of the center of the distribution. One also needs to know how spread out the observations are, which is what measures of dispersion describe. Visualization is just as important for understanding data: seeing the data represented in graph form can provide new understanding and perspective, as well as further detail for analysis. This chapter explores some common types of graphs, such as histograms, box plots, and scatter plots.
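As an illustration of summarizing a sample by its center and its dispersion, the following is a minimal sketch using only the Python standard library. The tumor diameters and the `summarize` helper are hypothetical and appear nowhere in the chapter; the quartiles use the default method of `statistics.quantiles`, and other software may compute them slightly differently.

```python
import statistics

# Hypothetical tumor diameters in millimetres (illustrative values only).
tumor_sizes_mm = [12, 15, 18, 21, 22, 25, 30, 41]


def summarize(values):
    """Return common measures of center and dispersion for a numeric sample."""
    # Quartile cut points; the default 'exclusive' method is one of several conventions.
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),         # center: arithmetic mean
        "median": statistics.median(values),     # center: middle value
        "sample_sd": statistics.stdev(values),   # dispersion: sample standard deviation
        "range": max(values) - min(values),      # dispersion: max minus min
        "iqr": q3 - q1,                          # dispersion: interquartile range
    }


summary = summarize(tumor_sizes_mm)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The mean and median here differ because the largest value pulls the mean upward, which is the kind of skew a histogram or box plot would also make visible.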
The purpose of research and data analysis is to study and draw conclusions about a population. Examples of populations in oncology include patients with stage III lung cancer, patients with metastatic breast cancer, or patients who receive a new chemotherapeutic agent or radiotherapy with a novel technique. Because it is often inconvenient, impractical, or impossible to study an entire population, a sample typically has to be chosen to represent the population. This representative sample is the group that will be studied to make determinations about the entire population. If the sample appropriately typifies the population, conclusions drawn from the sample may be generalized to the population at large. The most scientifically appropriate sample is a simple random sample. Other sampling methods include systematic sampling, stratified sampling, probability-proportional-to-size sampling, cluster sampling, quota sampling, minimax sampling, accidental sampling, line-intercept sampling, panel sampling, snowball sampling, and theoretic sampling. Some of these are probability sampling methods, while others, such as quota, accidental, and snowball sampling, are not.
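To make the difference between two of these sampling methods concrete, here is a minimal sketch of simple random sampling and proportionate stratified sampling. The patient registry, the function names, and the 20% sampling fraction are all assumptions made for illustration, not material from the chapter.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k units without replacement; every unit is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction from each stratum (proportionate stratified sampling)."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical registry: 25 stage III patients and 75 stage I-II patients.
patients = [{"id": i, "stage": "III" if i % 4 == 0 else "I-II"} for i in range(100)]

srs = simple_random_sample(patients, 20, seed=0)
strat = stratified_sample(patients, lambda p: p["stage"], 0.2, seed=0)

print(sum(p["stage"] == "III" for p in srs), "stage III patients in the simple random sample")
print(sum(p["stage"] == "III" for p in strat), "stage III patients in the stratified sample")
```

The simple random sample may over- or under-represent stage III patients by chance, whereas the stratified sample fixes their share at the population proportion.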
Survival analysis is the cornerstone of oncologic biostatistics. It uses time-to-event data and is commonly used to compare the time to an event between groups. Kaplan–Meier curves are the most basic analysis of time-to-event data. They estimate the fraction of patients who remain alive (or event-free) at a specific time after intervention or treatment. The median survival on a Kaplan–Meier curve is the time point at which the estimated probability of surviving is 50%. The log-rank test is used to compare time-to-event data between two groups, and it should be used when some observations have been censored. If no censored observations are present in the data, the Wilcoxon rank-sum test can be used instead. The Cox proportional hazards model is a survival model that analyzes time-to-event data while also accounting for the effect of covariates. This chapter also describes the hazard ratio.
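The Kaplan–Meier estimate and the median survival described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the follow-up times and event indicators are invented, the function names are hypothetical, and subjects censored at a given time are counted as still at risk at that time (the usual convention). A dedicated survival analysis package would normally be used in practice.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored subjects both leave the risk set
    return curve


def median_survival(curve):
    """First time at which the estimated survival falls to 50% or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up in months; 0 marks a censored observation.
times = [2, 3, 3, 5, 8, 8, 12, 15]
events = [1, 1, 0, 1, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.3f}")
print("median survival:", median_survival(curve))
```

If the curve never reaches 50%, the median is not reached and `median_survival` returns `None`.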
An understanding of biostatistics is necessary for reading and comprehending published literature, for performing retrospective research, and for designing and analyzing prospective clinical trials. Biostatistical concepts are also tested on oncology board exams. This book is organized into four sections covering 13 chapters. Section I begins with the basic foundations of biostatistics that are tested on board exams, such as summarizing and graphing data, sampling, and statistical estimation. Section II expands on these basics to include the concepts used in retrospective study design, analysis, and interpretation. It discusses hypothesis testing, correlation, regression, categorical data analysis, survival analysis methods, and noninferiority analysis. Section III focuses on prospective clinical trials, guiding readers in their understanding of published clinical trials and in the design and analysis of novel clinical trials. It describes cohort studies, case-control studies, cross-sectional studies, matched studies, analysis of studies, and sample size. The final section presents self-study multiple-choice questions with answers and rationales.
This chapter provides a guide to the most appropriate statistical test to use for a given analysis based on the type and number of independent and dependent variables. The variables can be qualitative or quantitative. The statistical tests presented are the independent t-test, paired t-test, ANOVA, simple linear regression, correlation, multiple linear regression, chi-square test, Fisher's exact test, logistic regression, log-rank test, and Cox proportional hazards regression.
Biostatistics is the application of statistics to the biologic sciences. Biostatistics is used commonly in medicine, particularly as it relates to medical research. Data collection, data analysis, and interpretation of results are all important components of biostatistics. The basic biostatistical concepts outlined in the book are tested on the medical oncology, hematology, and radiation oncology specialty board examinations because understanding these concepts is critical to practicing oncology. Comprehension of biostatistics is essential to reading and analyzing oncologic publications and to designing, performing, and analyzing retrospective and prospective clinical research.
In biostatistics, researchers use data to address questions. The questions are posed as hypotheses, and the data is analyzed to test these hypotheses in a process known as hypothesis testing. Hypothesis testing begins with formation of the null hypothesis (H0). Statistical conclusions are based on likelihood or chance, and with chance comes the potential for error. For hypothesis testing, there are two main types of errors: Type I error and Type II error. Decisions in hypothesis testing are based on the likelihood that the data is the result of random chance versus the likelihood that the data is the result of a relationship between the phenomena being investigated. A p-value is calculated to quantify this: it is the probability of obtaining data at least as extreme as those observed if the null hypothesis were true. Other statistical concepts discussed are the t-test, Wilcoxon rank-sum test, Wilcoxon signed-rank test, analysis of variance, binomial proportions, sensitivity and specificity, negative predictive value, positive predictive value, positive likelihood ratio, and negative likelihood ratio.
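The diagnostic test measures listed above all follow from the four cells of a 2×2 table. The sketch below uses their standard definitions; the counts and the function name are hypothetical, and the code assumes no denominator is zero (for example, specificity below 1 for the positive likelihood ratio).

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute test-performance measures from a 2x2 table.

    tp -- diseased, test positive       fp -- not diseased, test positive
    fn -- diseased, test negative       tn -- not diseased, test negative
    Assumes every denominator below is nonzero.
    """
    sensitivity = tp / (tp + fn)   # P(test positive | disease)
    specificity = tn / (tn + fp)   # P(test negative | no disease)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # P(disease | test positive)
        "npv": tn / (tn + fn),     # P(no disease | test negative)
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical counts: 100 patients with disease, 200 without.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the disease is in the group tested.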
Correlation and regression are used to describe relationships between variables. A correlation examines the linear relationship between two quantitative variables. A correlation coefficient measures the extent to which two variables tend to change together. The Pearson correlation coefficient is most sensitive to linear trends. The Spearman rank correlation measures the strength and direction of the monotonic relationship between ranked variables. Medical researchers most often use regression as a summary of data. Regression mathematically relates one variable to one or more other variables. With a simple linear regression, values of one independent variable are used to predict values of a continuous dependent variable. If there is more than one independent variable, the regression is no longer simple and is described as multiple linear regression. Logistic regression is a regression with a binary dependent variable; it measures the relationship between one or more independent variables and that dichotomous outcome.
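The contrast between the Pearson and Spearman coefficients, and the fit of a simple linear regression, can be illustrated with a short sketch. The data are invented to be monotonic but not linear, and all function names are hypothetical; the formulas are the standard least-squares and rank-based definitions, with tied values sharing their average rank.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def simple_linear_regression(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


# Hypothetical data: response rises with dose, but along a curve rather than a line.
dose = [1, 2, 3, 4, 5]
response = [1, 4, 9, 16, 25]

r_pearson = pearson(dose, response)
r_spearman = spearman(dose, response)
slope, intercept = simple_linear_regression(dose, response)
print(f"Pearson r = {r_pearson:.3f}, Spearman rho = {r_spearman:.3f}")
print(f"fitted line: response = {intercept:.1f} + {slope:.1f} * dose")
```

Because the relationship is perfectly monotonic, the Spearman coefficient equals 1, while the Pearson coefficient falls slightly short of 1 because the points do not lie on a straight line.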