*As part of our Journal Club summaries, our JC Chairs (Drs. Lisa Calder and Ian Stiell @EMO_Daddy) have been tasked with explaining epidemiological concepts so that everyone in our department can analyze the literature and appraise articles on their own. For this blog post we have gathered all the "Epi Lessons" as they relate to "Therapy Articles". More to follow in the coming weeks.*

__Absolute risk reduction__

*By: Dr. Ian Stiell November 2012*
This is a very simple but important concept for interpreting the results of an intervention clinical trial. ARR tells us the difference in outcome proportion or percent between the control group and the intervention group. In the HES trial, Table 2 shows us that the primary outcome of death occurred in 18.0% of HES cases and in 17.0% of Saline cases. Hence the absolute difference was -1.0% [17.0 – 18.0 = -1.0] because HES did worse. Relative risk reduction is a little more complicated and we will do that another time.
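Using the HES trial percentages quoted above, the ARR arithmetic can be sketched in a few lines of Python (the function name is illustrative):

```python
def absolute_risk_reduction(control_rate, intervention_rate):
    """ARR = control event rate minus intervention event rate.
    Positive values favour the intervention; negative values favour control."""
    return control_rate - intervention_rate

# HES trial: death in 18.0% of HES cases vs 17.0% of saline (control) cases
arr = absolute_risk_reduction(0.170, 0.180)
print(f"ARR = {arr:+.1%}")  # -1.0%: HES did worse than saline
```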

__Adjustment Analyses in Randomized Trials__

*By Drs. Ian Stiell and Lisa Calder April 2013*
Clinical trials depend upon random allocation to ensure good balance of important baseline characteristics and thus allow a fair comparison between study arms, usually by unadjusted statistical analyses. Commonly, additional secondary analyses will adjust for all important measured covariates using multivariate techniques such as logistic regression. These usually confirm the findings of the primary analyses; when the secondary results differ, they can only be considered hypothesis generating.

__Adjustment of Confidence Intervals for Interim Analyses__

*By: Dr. C Vaillancourt*

Interim analyses are commonly planned in large studies. They can be useful, for example, to ensure patient safety (by detecting a significant benefit or harm early on), or as a cost-saving measure, stopping the study early if there is little statistical probability of finding a difference between groups later on. There are caveats to interim analyses. It is possible for a perceived “futile” study to become significant if allowed to continue (lack of power to find a difference early on). It is also possible to find a significant difference early on by chance alone. Because the likelihood of finding a difference by chance alone increases every time you analyze the data, there are strategies to account for this (such as the O’Brien-Fleming approach), most of which involve raising the bar for statistical significance (i.e. requiring a p-value much smaller than 0.05, or using >95% CIs).
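As a rough illustration of why repeated looks require a stricter threshold: if each of k looks were an independent test at alpha = 0.05 (a simplifying assumption; real interim looks are correlated, so the true inflation is smaller, though still substantial), the chance of at least one false positive grows quickly:

```python
def worst_case_alpha(alpha_per_look, looks):
    """Upper-bound probability of at least one false-positive result across
    repeated analyses, treating each look as an independent test at the
    same alpha. Real interim looks on accumulating data are correlated,
    so actual inflation is smaller, but still well above the nominal alpha."""
    return 1 - (1 - alpha_per_look) ** looks

print(worst_case_alpha(0.05, 5))  # ~0.226, far above the nominal 0.05
```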

__Allocation concealment__

*By: Dr. Lisa Calder December 2012*
Allocation concealment is an important principle in RCT design as it helps ensure that study personnel and clinicians are unaware of how a study intervention or control is assigned. Historically, there have been instances where study personnel or clinicians have attempted to “guess” treatment allocation to ensure their patient gets assigned the “right” study group based on their own clinical biases. The robustness of an RCT is enhanced by clear reporting of how allocation was concealed, and further still if the adequacy of concealment was evaluated.

__Blinding of Treatment Allocation__

*By: Dr. Lisa Calder January 2013*

Open label trials may be promoted as pragmatic trials, but a lack of blinding to treatment allocation is a fundamental threat to internal validity. Blinding reduces ascertainment bias (the likelihood of differential assessment of outcome). It is not always possible to undertake blinding in an RCT, especially for surgical or procedural interventions. Drug studies can and must be blinded, so readers should be very skeptical when this has not been done. For other studies, ask yourself if it was possible to blind and whether determination of outcome was free of bias.

__Clinical Diversity, Methodological Diversity and Statistical Heterogeneity__
A meta-analysis may attempt to address a compelling clinical dilemma. But one of the key questions to ask when appraising meta-analyses is whether the pooling of the included studies is appropriate. Clinical diversity (or clinical heterogeneity) reflects differences between study populations, the intervention, co-interventions and/or outcomes when pooling studies in meta-analysis. Methodological diversity (or methodological heterogeneity) is variability in the study designs used and/or risks of biases present. These are distinct from statistical heterogeneity, which assesses the variability in the intervention effects being assessed in the included studies. This is a consequence of clinical and/or methodological diversity. Statistical heterogeneity can be determined by visually assessing the forest plot, or by calculating the I² statistic or Cochran’s Q. A meta-analysis should be done only when the studies are relatively homogeneous for participants, interventions and outcomes to provide a meaningful summary. Always ask yourself if the meta-analysis is combining apples with apples (good) or creating a fruit salad (bad).
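The I² calculation itself is simple arithmetic; a sketch with hypothetical numbers:

```python
def i_squared(q, num_studies):
    """I-squared: percentage of total variability attributable to
    heterogeneity rather than chance.
    I^2 = max(0, (Q - df) / Q) * 100, where df = number of studies - 1."""
    df = num_studies - 1
    return max(0.0, (q - df) / q) * 100

print(i_squared(20.0, 5))  # 80.0 -> substantial statistical heterogeneity
print(i_squared(3.0, 5))   # 0.0  -> Q below its degrees of freedom
```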

__Clinically Important Outcomes__

*By: Dr. Ian Stiell*
Outcomes may include survival, clinical events (e.g. strokes or myocardial infarction), patient-reported outcomes (e.g. symptoms, quality of life), process outcomes (length of stay, intubation, imaging), adverse events, and economic outcomes (e.g. cost and resource use). Ideally the most important relevant outcome will be the primary outcome of the study, e.g. mortality for cardiac arrest, pain relief for analgesic studies. Be cautious of studies where processes are the primary outcome because these are of little interest to patients or their families. For example, it would be of little consolation to a family to hear that their loved one was intubated on the first pass but subsequently died.

__Concealment versus Blinding__

**By: Dr. Ian Stiell March 2015**
These clinical trial terms have different meanings but are often confused.

**Concealment** refers to the process whereby the treatment allocation is made unknown or concealed **prior to patient randomization**. This helps prevent selection bias by ensuring that health providers and research staff are not tempted to include or exclude cases according to their views on the allocated treatment.

**Blinding** refers to the methods employed **after randomization** to ensure that patients, health care providers, and research staff cannot determine whether the patient is receiving the study or the control treatment. This reduces ascertainment bias (the likelihood of differential assessment of outcome).

__Contamination in Randomized Trials__

*By: Dr. Ian Stiell May 2014*
This is a type of bias where there is a mixing of treatments between study groups such that the impact of the intervention is difficult to determine. It is most likely to occur in non-drug trials where the intervention cannot be blinded and relies upon physician involvement, e.g. choosing a treatment protocol. One increasingly popular solution to this problem is to randomize by ‘cluster’, e.g. by hospital site, rather than by patient.

__Cluster Randomized Controlled Trials__

*By Dr. Ian Stiell May 2012*
A cluster randomized trial is a trial in which individuals are randomized in groups (i.e. the group is randomized, not the individual); for example, all patients treated by a particular EMS service or at a particular hospital. Reasons for performing cluster randomized trials vary. Sometimes the intervention can only be administered to the group, for example an addition to the water supply; sometimes the motivation is to avoid contamination amongst health care providers; sometimes the design is simply more convenient or economical. Such trials are often appropriate when the intervention is a psychomotor task (e.g. CPR) but not when the intervention is a drug. Specific sample size and data analytic approaches are required.

__Determining Safety of a Therapeutic Agent__

*By Dr. Lisa Calder May 2013*
Many therapeutic interventions can have rare but important and sometimes fatal adverse effects. The only way to determine the safety of interventions in this case is to conduct large population studies, often via administrative databases (hence phase 4 of clinical trials designed to evaluate drugs). The critical reader will be wary of small RCTs that purport to have demonstrated safety as an outcome.

__Equivalence or Non-Inferiority Trials__

*By Dr. Ian Stiell May 2012*
Most RCTs aim to determine whether one intervention is superior to another (superiority trials). Often a non-significant test of superiority is wrongly interpreted as proof of no difference between the two treatments. By contrast, equivalence trials aim to determine whether one (typically new) intervention is therapeutically similar to another (usually existing) treatment. A non-inferiority trial seeks to determine whether a new treatment is no worse than a reference treatment. Because proof of exact equality is impossible, a pre-stated margin of non-inferiority (delta) for the treatment effect in a primary patient outcome must be defined a priori. Equivalence trials are very similar, except that equivalence is defined as being within a pre-stated two-sided margin of treatment effect. True (two-sided) equivalence therapeutic trials are rare.

__Explanatory versus Pragmatic Clinical Trials__

*By Drs. Ian Stiell & Lisa Calder March 2012*
Trials of healthcare interventions are often described as either explanatory or pragmatic. Explanatory trials generally measure efficacy - the benefit a treatment produces under ideal conditions, often using carefully defined subjects in a research clinic. Pragmatic trials measure effectiveness - the benefit the treatment produces in routine clinical practice. Pragmatic trials generally reflect the reality of how the intervention will perform in everyday care. For more, see http://www.bmj.com/content/316/7127/285.full

__Flow Diagram__

*By Dr. Ian Stiell December 2011*
Investigators and editors developed the CONSORT Statement (revised 2010; http://www.consort-statement.org/home/) to improve the reporting of randomized controlled trials (RCTs) by means of a checklist and flow diagram. The flow diagram is intended to depict the passage of participants through an RCT and shows numbers and explanations from four stages of a trial (enrollment, intervention allocation, follow-up, and analysis). The diagram explicitly shows the number of participants, for each intervention group, included in the primary data analysis.

__Geometric Mean__

*By: Dr. Christian Vaillancourt*

A geometric mean is a type of mean or average which can be useful when combining items measured on different scales. It is obtained by calculating the square root of the product of 2 numbers rather than dividing their sum by 2. If we take the example of a survey where one answer is “3” on a 5-point Likert scale and another answer is “7” on a 10-point Likert scale, the usual/arithmetic “mean answer” would be (3+7)/2 = 5, whereas the geometric mean would be √(3×7) ≈ 4.58, which is slightly less influenced by the magnitude of the answer measured on the 10-point Likert scale.
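The survey example works out as follows; the geometric mean generalizes to the n-th root of the product of n positive values:

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

arithmetic = (3 + 7) / 2            # 5.0
geometric = geometric_mean([3, 7])  # sqrt(21) ~= 4.58
print(arithmetic, round(geometric, 2))
```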

__Hazard Ratio (HR)__

*By Dr. Ian Stiell February 2013*
The hazard ratio is akin to relative risk but is used for survival analyses such as Cox proportional hazards regression. It is most often used to describe the outcome of therapeutic trials where the question is to what extent treatment can shorten the duration of an illness. The hazard ratio is an estimate of the ratio of the hazard rate in the treated versus the control group. For example, if there are two groups, group 1 and group 2, an HR of 4.5 means that the hazard (instantaneous event rate, e.g. of relapse) in group 2 is 4.5 times that in group 1.
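A proper HR comes from a Cox model, but when hazards are roughly constant over follow-up, the crude incidence-rate ratio conveys the idea; the numbers below are hypothetical:

```python
def crude_hazard_ratio(events_tx, persontime_tx, events_ctl, persontime_ctl):
    """Crude incidence-rate ratio: an approximation of the hazard ratio
    when hazards are roughly constant over follow-up. Real trials estimate
    the HR with a Cox proportional hazards model instead."""
    return (events_tx / persontime_tx) / (events_ctl / persontime_ctl)

# hypothetical: 10 relapses over 200 patient-months on treatment,
# 20 relapses over 200 patient-months on control
print(crude_hazard_ratio(10, 200, 20, 200))  # 0.5: treatment halves the hazard
```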

__Intention-to-treat (ITT) Analyses__

*By: Drs. Ian Stiell & Lisa Calder January 2015*
Intention-to-treat (ITT) analyses are widely recommended as the preferred approach to the analysis of most clinical trials. The basic intention-to-treat principle is that participants in trials should be analysed in the groups to which they were randomized, regardless of whether they received or adhered to the allocated intervention. This particularly becomes a problem when patients are lost to follow-up and no outcome values are available. Authors must clearly indicate how many patients have such values missing. The alternative approach is a per-protocol analysis, which includes only patients for whom the protocol was followed.

__Interim Analyses and Stopping Rules__

*By: Dr. Ian Stiell*

In clinical trials, an interim analysis is one that is conducted before data collection has been completed to determine if there are safety issues or if the study should be stopped early. These interim analyses are evaluated by an independent Data Safety Monitoring Board that is at arm’s length from the investigators. The DSMB has the authority to recommend early termination if the study intervention is clearly better than control (for benefit) or if there is so little difference between groups that full enrolment will not show a difference (for futility).

**Statistical stopping rules** should be used to adjust the interim P-values to a much more severe level, e.g. <0.001 instead of <0.05, using methods described by Pocock and O'Brien & Fleming, among others.

__Describing the Strength of Study Results Using “Levels of Evidence”__

*By: Dr. Christian Vaillancourt*
Different methods of classifying levels of evidence have been proposed, most of them relying on the study design, precision, or endpoints (e.g. survival with good neurological outcome). The Oxford classification is one commonly quoted, where, for example, Level 1a is a meta-analysis of RCTs; Level 1b is an individual high-quality RCT; Level 2 includes cohort studies and low-quality RCTs; Level 3 includes case-control studies; Level 4 includes case series and low-quality cohort or case-control studies; and Level 5 is expert opinion.

__Loss to Follow-up__

*By: Dr. Lisa Calder December 2012*
Evaluating loss to follow-up is a helpful tool when assessing RCT validity as it provides the reader with a sense of the integrity of the estimated difference in primary outcome. As a general rule of thumb, any loss to follow-up greater than 20% should lead the reader to become concerned about resulting bias of the main result. The reader should ask themselves: if all the patients who were lost to follow-up had the worst possible outcome, to what degree would this influence the statistical and clinical significance of the main result?

__Minimal Clinically Important Difference in Clinical Trials__

*By Dr. Ian Stiell September 2014*
The sample size of a clinical trial must be adequately powered to show a minimal clinically important difference (MCID) between the intervention and control arms. MCID is the absolute difference in outcome proportions that would have to be shown by the study intervention for clinicians to accept the new treatment as better. In an effort to keep sample size low, investigators sometimes estimate an MCID much larger than is reasonable or use an outcome that is not the most important, e.g. 4-hour survival rather than survival to discharge.

__Minimization__

*By: Dr. Christian Vaillancourt*
The purpose of randomization is to minimize imbalance between groups. Sometimes we know certain factors are likely to influence outcomes and ought to be equally distributed (e.g. male vs female). One strategy to ensure this stratifies participants according to important factors, then uses separate randomization lists for each sub-group. This becomes impractical when a large number of factors need to be taken into account. Minimization calculates the imbalance between groups that would result from a particular assignment, and uses a strategy favoring assignment to the group that would minimize this imbalance between comparison groups.
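A deterministic two-arm sketch of the idea (Pocock & Simon's actual algorithm adds a random element so assignments are not fully predictable; the factor names and counts below are hypothetical):

```python
def minimization_assign(counts, patient):
    """Assign a new patient to arm "A" or "B" to minimize the summed
    absolute difference in level counts across the patient's factors.
    counts = {"A": {factor: {level: n}}, "B": {factor: {level: n}}}
    patient = {factor: level}. Deterministic sketch only: Pocock & Simon
    layer a random element on top of this choice."""
    def imbalance(arm):
        total = 0
        for factor, level in patient.items():
            a = counts["A"][factor].get(level, 0) + (arm == "A")
            b = counts["B"][factor].get(level, 0) + (arm == "B")
            total += abs(a - b)
        return total
    return "A" if imbalance("A") <= imbalance("B") else "B"

# hypothetical trial state: arm A is short on males, arms balanced on age
counts = {
    "A": {"sex": {"M": 3, "F": 7}, "age>65": {"yes": 5, "no": 5}},
    "B": {"sex": {"M": 6, "F": 4}, "age>65": {"yes": 5, "no": 5}},
}
print(minimization_assign(counts, {"sex": "M", "age>65": "yes"}))  # "A"
```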

__Modified Intention-to-treat (M-ITT) Analyses__

*By: Dr. Venkatesh Thiruganasambandamoorthy*
Intention-to-treat (ITT) analyses are widely recommended as the preferred approach to the analysis of most clinical trials. The basic intention-to-treat principle is that participants in trials should be analysed in the groups to which they were randomized, regardless of whether they received or adhered to the allocated intervention, crossed over to other treatments, or were withdrawn from the study. Post-randomization exclusions may be acceptable when patients are inappropriately randomized into a clinical trial or when pre-randomization information on patients' eligibility status is not available at the time of randomization. Such an approach is known as “modified intention-to-treat” analysis and must be pre-specified in the protocol. M-ITT is most likely to be seen in RCTs of critical situations, e.g. cardiac arrest.

__Multiple Arm Clinical Trials__

*By Dr. Ian Stiell April 2013*
Multiple-arm randomized trials can be more complex in their design, data analysis, and result reporting than two-arm trials. In an RCT with three arms, there are seven theoretically possible comparisons so it is important that the investigators define a priori which comparisons are of primary interest and whether they will assess global differences between all arms and/or will assess pair-wise differences of 2 arms at a time.

__Multiple Comparisons and Statistical Significance__

*By: Dr. Christian Vaillancourt*
It is not uncommon for a manuscript to report several secondary outcomes. The number of secondary comparisons is directly proportional to the chance that one of them will end up being statistically significant by chance alone. To account for this, statisticians should make it proportionally more difficult to find such a statistical difference. The Bonferroni correction suggests that the level of significance (alpha error, 0.05) should be divided by the number of comparisons made, i.e. 0.05/5 comparisons = new alpha of 0.01.
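The correction itself is one line; this mirrors the 5-comparison example above:

```python
def bonferroni_alpha(alpha, num_comparisons):
    """Per-comparison significance threshold under the Bonferroni correction."""
    return alpha / num_comparisons

print(bonferroni_alpha(0.05, 5))  # 0.01: each of 5 tests must reach p < 0.01
```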

__Non-inferiority Trials__

*By: Dr. Lisa Calder January 2013*
Non-inferiority trials are distinct from superiority trials in that they are designed to determine whether a given intervention is non-inferior by a pre-specified margin compared to a control. This is not the same as equivalence, and a key section of the methods to examine is the sample size calculation where the non-inferiority margin is specified. Ideally, researchers explain how this margin was determined (based on previous placebo-controlled trials, consensus of experts). The critical reader will ask themselves if they feel this margin is truly clinically significant.
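The non-inferiority decision reduces to comparing a confidence interval bound against the pre-specified margin; the margin and CI values below are hypothetical:

```python
def non_inferior(ci_lower_diff, margin):
    """Declare non-inferiority when the lower CI bound for the difference
    (new minus standard, where higher is better) lies above -margin, i.e.
    the data exclude the new treatment being worse by more than the margin."""
    return ci_lower_diff > -margin

# hypothetical: survival difference (new - standard) 95% CI lower bound vs 5% margin
print(non_inferior(-0.035, 0.05))  # True: -3.5% is inside a 5% margin
print(non_inferior(-0.062, 0.05))  # False: cannot rule out worse-than-margin
```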

__Number Needed to Treat (NNT)__

*By: Dr. Ian Stiell May 2015*

The NNT concept was created by Canadian clinical epidemiologist Dr. Andreas Laupacis in 1988 to quantify the benefit of a new intervention. NNT is the average number of patients who need to be treated to prevent one additional bad outcome (e.g. the number of patients that need to be treated for one to benefit compared with a control in a clinical trial). It is easily calculated as the inverse of the absolute risk reduction (1/ARR). The higher the NNT, the less effective the treatment.

An Assessment of Clinically Useful Measures of the Consequences of Treatment

__Phases of a clinical trial__

*By: Dr. Ian Stiell November 2012*
Clinical trials involving new drugs are classified into four phases, with Health Canada and the FDA generally requiring a drug to have passed through Phase 3 before general approval.

**Phase 1** trials test the treatment in a small group of healthy people (20-80) to evaluate its safety, dosage range, and side effects.

**Phase 2** trials give the treatment to patients in larger numbers (100-300) to evaluate effectiveness and safety.

**Phase 3** trials give the treatment to large groups of patients (1,000-3,000) to confirm its effectiveness, monitor side effects, and compare it to commonly used treatments.

**Phase 4** trials are post-marketing studies to determine additional information about side effects and risks. [**Phase 2a** trials focus on proving the hypothesized mechanism of action while the larger **Phase 2b** trials seek to determine the optimum dose.]

__Post-Randomization Exclusions__

*By: Dr. Ian Stiell May 2014*
It is widely accepted that the primary analysis of data in a randomized clinical trial should compare patients according to the group to which they were randomly allocated, regardless of patients' compliance, crossover to other treatments, or withdrawal from the study. Such an analysis is referred to as an intention to treat or an “as randomized” analysis. Exclusions, however, may be acceptable when patients are inappropriately randomized into a clinical trial or when pre-randomization information on patients' eligibility status is not available at the time of randomization. Such an approach is known as “modified intention-to-treat” analysis and is most likely to be seen in RCTs of critical situations, e.g. cardiac arrest.

__Pre-specified and Post-hoc Subgroup Analyses__

*By: Dr. Ian Stiell May 2014*
Subgroup analyses involve splitting all the participant data into smaller subsets of subjects (e.g. males and females), so as to make comparisons between them. A pre-specified subgroup analysis is one that is planned and documented before any examination of the data, preferably in the study protocol. Post-hoc analyses are those planned only after examination of the results. Such analyses are of particular concern because it is often unclear how many were undertaken and whether some were motivated by inspection of the data (data-dredging). However, both pre-specified and post-hoc subgroup analyses are subject to inflated false positive rates arising from multiple testing. Subgroup analyses are often under-powered and are best used to generate new hypotheses that can be tested in future trials.

__Precision in RCTs__

*By Dr. Lisa Calder October 2012*
When assessing precision of estimates for RCTs, 95% confidence intervals are most helpful. Precision can be considered the range in which the best estimates of a true value approximate the true value. Interquartile ranges for medians tell you the spread of the data for the sample used in the study, but do not give you an estimate of the probability that the true value falls within the range obtained.
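For a proportion, a simple Wald 95% confidence interval illustrates what precision means here (a sketch; Wilson or exact intervals behave better for small samples or extreme proportions, and the numbers are hypothetical):

```python
from statistics import NormalDist

def wald_ci(p, n, level=0.95):
    """Wald confidence interval for a proportion: p +/- z * sqrt(p(1-p)/n).
    A simple sketch; Wilson or exact (Clopper-Pearson) intervals are
    preferred for small n or extreme p."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # ~1.96 for a 95% interval
    half_width = z * (p * (1 - p) / n) ** 0.5
    return p - half_width, p + half_width

# hypothetical: 20 of 100 patients had the outcome
lo, hi = wald_ci(0.20, 100)
print(f"{lo:.3f} to {hi:.3f}")  # roughly 0.122 to 0.278
```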

__Propensity Score Matching__

*By: Dr. Ian Stiell September 2012*
In the statistical analysis of observational data, **propensity score matching** is one of a family of multivariate statistical techniques that attempt to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predict receiving the treatment. Compared to the gold standard of a randomized controlled trial, any observational analysis and interpretation of the usefulness of an intervention must be viewed with a large degree of healthy skepticism.

__Random sampling vs Randomization__

*By Dr. Lisa Calder June 2015*
While in epidemiology we frequently use the word “random”, there is sometimes confusion about its application. When a clinical population is randomly sampled, the goal is to ensure a representative sample. If you pre-select a sample of patients using inclusion and exclusion criteria and randomize these to a given intervention and control, you do not necessarily have a representative sample. Selection bias can still occur pre-randomization.

__Randomization by Pocock minimization algorithm__

*By: Dr. Venkatesh Thiruganasambandamoorthy February 2017*
A random allocation of patients to treatment and control groups (by sealed envelopes etc.) generally leads to balanced groups but can lead to differences between the groups on some aspects (more males or obese patients in one arm than the other). If important factors are identified (e.g. sex, obesity), then the patients could be stratified based on sex and BMI. Using block randomization, a list is created for a block of x patients to be equally assigned to the study arms based on the important factors identified. If there are a large number of important factors, then the block randomization becomes extremely complex. The Pocock and Simon adaptive stratified sampling algorithm can be used to calculate the imbalance between the groups based on each factor and add an additional random element to the assignment of the next patient.

__Randomization Procedures__

*By Dr. Lisa Calder March 2013*
When reviewing a randomized trial, it is critical to determine how the randomization was conducted, as not all randomization schemes are created equal. Proper randomization uses either computer-generated randomization or random number tables. Pseudo-randomization includes studies where patients are allocated based on alternating days of the week or date of birth, for example. The reader can verify that randomization was conducted appropriately by examining Table 1 of participant characteristics and determining whether the groups appear to be balanced.

__RCT Sample Size Calculation__

*By Dr. Lisa Calder October 2012*

When calculating a sample size for a randomized controlled trial, a key step is to determine the MCID: the minimal clinically important difference. By powering your trial towards this difference, not only will you seek a statistically significant difference in effect but also a clinically significant one. It is important as a critical appraiser to evaluate whether you agree that the MCID is truly clinically significant.

__Sample Size in Clinical Trials__

*By Dr. Ian Stiell June 2013*
All intervention studies should indicate how the sample size was estimated, including the desired alpha error (usually 0.05), power (usually 80-90%), and expected outcome rate in the control group. Most important is a statement of the minimal clinically important difference (MCID) that would have to be shown by the study intervention for clinicians to accept the new treatment as better. In an effort to keep sample size low, investigators sometimes estimate an MCID much larger than is reasonable.
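The standard two-proportion sample size formula puts these ingredients together (the control rate and MCID below are hypothetical):

```python
import math
from statistics import NormalDist

def n_per_group(p_control, p_intervention, alpha=0.05, power=0.80):
    """Sample size per arm for comparing two proportions:
    n = (z_{a/2}*sqrt(2*pbar*qbar) + z_b*sqrt(p1*q1 + p2*q2))^2 / (p1 - p2)^2"""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    pbar = (p_control + p_intervention) / 2
    num = (z_a * math.sqrt(2 * pbar * (1 - pbar))
           + z_b * math.sqrt(p_control * (1 - p_control)
                             + p_intervention * (1 - p_intervention))) ** 2
    return math.ceil(num / (p_control - p_intervention) ** 2)

# hypothetical: 15% mortality in controls, MCID of 5% absolute (15% vs 10%)
print(n_per_group(0.15, 0.10))  # roughly 686 patients per arm
```

Note how halving the MCID roughly quadruples the required sample size, which is exactly why an unrealistically large MCID makes a trial look cheaper than it should be.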

__Selection Bias and Randomization__

*By Dr. Lisa Calder April 2012*
Even though a clinical trial is randomized, this does not mean it cannot be subject to selection bias. Always look at the study flow figure (generally Figure 1) to determine how many eligible patients were not included, then assess whether the authors reasonably explain why these eligible but excluded patients were not systematically different from those who were randomized.

__Stratification of Randomization by Timing of Enrolment__

*By: Lisa Calder April 2014*
Block randomization offers the benefit of ensuring overall balance of groups when you have multiple centers or clinically defined subgroups. Another approach is to randomize by the timing of enrolment when this could influence the outcome. In this study, the authors stratified their enrolment to account for early and later enrolments given that sepsis is a time-sensitive condition. The sensitivity analysis for these strata reassures the reader that the overall observed effect was not influenced by timing of enrolment.

__Survival Analyses__

*By Dr. Ian Stiell February 2013*
Survival analyses are used in clinical trials that follow patients over time for primary outcomes such as death, relapse, adverse drug reaction, or development of a new disease. The follow-up time may range from hours to years, and a different set of statistical procedures is employed to analyze the data. Terms frequently seen in papers with survival analyses include Cox proportional hazards model, hazard ratio, and Kaplan-Meier curve.
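A minimal Kaplan-Meier (product-limit) estimator makes the mechanics concrete; real analyses use a statistics package, and the data below are hypothetical:

```python
def kaplan_meier(observations):
    """Product-limit survival estimate from (time, event) pairs, where
    event is 1 for the outcome and 0 for censoring. At each event time,
    survival is multiplied by (1 - deaths / number still at risk).
    Returns [(time, survival)] at each event time."""
    observations = sorted(observations)
    at_risk = len(observations)
    survival, curve, i = 1.0, [], 0
    while i < len(observations):
        t = observations[i][0]
        deaths = removed = 0
        while i < len(observations) and observations[i][0] == t:
            deaths += observations[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # both events and censored patients leave the risk set
    return curve

# hypothetical follow-up: events at months 2 and 9, censoring at month 5
print(kaplan_meier([(2, 1), (5, 0), (9, 1)]))  # [(2, 0.666...), (9, 0.0)]
```

The censored patient at month 5 shrinks the risk set without dropping the curve, which is the whole point of the method.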

__Surrogate endpoints__

*By Dr. Venkatesh Thiruganasambandamoorthy*
Surrogate endpoints can be used as a measure of effect for specific treatments and might correlate with clinical outcomes. In the RE-VERSE AD (idarucizumab for dabigatran reversal) study, the investigators used dilute thrombin time and ecarin clotting time as surrogate endpoints for reversal of dabigatran action by the study drug idarucizumab. The actual clinical endpoint of restoration of hemostasis was a secondary outcome. This was a small study; we need a large clinical study with a control group to confirm that clinical outcomes among patients treated with the reversal agent were better. Be wary of studies using surrogate outcomes when clinical outcomes could have been used.

__Use of Continuous Data as Primary Outcome__

*By Dr. Ian Stiell June 2013*
Beware of studies that compare the effectiveness of interventions by using continuous data outcomes, such as pain scales (1-100), oxygen saturation values, and minutes to pain relief. These kinds of data can produce statistically significant differences between groups with relatively small sample sizes but often give you little information about clinical importance. Far better, and almost always the norm, are outcome measures given as proportions or percentages, such as the % of patients who achieve a 20-point improvement in pain, an oxygen saturation of 90%, pain relief in less than 2 hours, or survival.

__Validation of Measurement Tools__

*By: Dr. Lisa Calder June 2014*
When investigators state they used a validated tool, this means that the tool has been evaluated to determine whether it accurately measures what it aims to measure. Two important components of validity include face validity (experts endorse that the tool is logically designed to measure a given construct) and content validity (the tool comprehensively includes all possible dimensions of a given construct). These are distinct from reliability, which indicates that the tool consistently measures a given construct (usually by more than one user, e.g. inter-rater reliability).
