tutorial: Meta-Analysis: Quick Introduction and Python Pipeline

By Julien Hernandez Lallement, lun. 20 avril 2020, in category Tutorial

meta-analysis, neuroscience, python

Meta-Analysis: Quick Introduction and Python Pipeline

Background:

In research, we typically publish our results in the form of articles, where findings are reported together with statistical analysis, that provide a way to assess the strength and plausibility of observed effects.

While statistical analysis is critical to any research paper, concluding on one single study might not be the best approach.

For instance, one might have a too small sample, while still drawing conclusions on results. Small samples limit your ability (in statistical terms, you are lacking power) to have meaningful conclusions about the effects you are observing. Normally (unfortunately, not typically), one runs a priori power analysis using pre collected data. That is: "from what we know about the process we want to investigate, what effect size would we expect?". Using historical data helps you making an informed guess about the expected outcome, which in turns allows you to calculate the required sample size. The math behind this is not too complicated, but I won't be going in detail here. Some open source software like G-Power allow you to make that kind of computation.

Alternatively, you might be looking at a very particular set of data, mostly created by outliers (normally, that would be ruled you with large enough statistical power), so your observed truth would not be the truth of the overall population from which you believe your sample to be taken.

Alternatively, you might as well be doing something wrong with your statistics. I myself remember running ANOVAs, non parametric tests and correlations that required some set of assumptions to be fullfiled, without having checked these assumptions before hand. This is, I think, a problem of how researchers are (or at least how I was) taught to use statistics... but that's for another post!

Meta-analytical approach

Now, when multiple papers investigating a similar process are published, one useful thing to do is to review them together, and see whether trends are emerging in the data collected independently in each paper. These so-called systematic reviews can be quite useful to outline unexplored areas of a process, which in turn would guide researchers to collect the necessary data to populate that missing part of the distribution.

Another approach, maybe one step further, would be to perform a meta-analysis. This differs from a systematic review in that a meta-analysis does not simply reviews the current knowledge of a given process, but also quantifies the strength of statistical effects, and provides an overall estimation.

Having worked in the field of Neuroscience for quite a bit, together with a colleague of mine, we collected data on animal empathy, to quantify the strength of effects reported across the literature.

Here, I intend to report a small part of that paper to share some code that might be useful to people currently working on meta-analytic approaches.

Methods:

Data typically takes the shape of distributions, which have some parameters, like a mean and a standard deviation. You can do all of stuff on these distributions, like comparing them to some value (one sample testing, t-test or Wilcoxon, if parametric assumptions are met ;) ) or compare them to some other distribution (paired testing, t-test or Mann-Whitney), among other things.

That's very useful, because you can tell whether these distributions are significantly different from one another (although this frequentist approach as some downsides as well, but that's for another post on Bayesian statistics).

Statistical significance is the probability that the observed difference between two distributions is due to chance. If the p-value is larger than a threshold arbitrarily chosen (i.e., alpha level, typically equal to 0.05), it is assumed that observed differences can be explained by random noise in the data.

Problem is, with a sufficiently large sample, statistical tests will very often find a significant difference, sometimes for even lower alpha levels. These small differences that emerge as significant are often useless for research results, and no meaningful conclusion can be drawned from them. The fact that p-values are dependent not only on effect sizes, but also on sample size makes it crutial to report other measures. This is why authors working with statistical comparisons should not only report p-values, but also differences in magnitude between the distributions. In the words of Jacob Cohen, "Statistical significance is the least interesting thing about the results. You should describe the results in terms of measures of magnitude –not just, does a treatment affect people, but how much does it affect them.".

A commonly cited example illustrating this problem is a Physicians Health Study, that investigated whether aspirin could help preventing myocardial infarction (MI). The researchers tested around 22.000 subjects, and found that aspirin was associated with a reduction in myocardial infarction, with a significance level < .00001. In other words, very very significant (the kind of value that makes PhD students' head drool...). The study was terminated early than scheduled due to the conclusive findings, resulting in a recommendation of aspirin to treat this problem. Problem was, the effect size was very small (r2 = 0.001), suggesting that actual between group differences were very small and negligeable. The recommendation to use aspirin has since been changed, but for quite some time, this study fueled a treatment that was not very suited to the problem. All due to a focus on p-value...

Conclusion "The primary product of a research inquiry is one or more measures of effect size, not P values.". Gene V. Glass

Extraction of effect sizes

Effect sizes (ES) are a way to boil down the strenght of an observed effect. Typicall, your data will take the shape of distributions, that have parameters such as the mean and the standard deviation (I assume the readers know about these concepts). Since the data I use in this post was based on these parameters, I describe below the analysis pipeline for data preparation.

Assume your first study used two different independent groups, each one with a mean (M) and a standard deviation (SD). It is easy to convert that data in a so-called standardized effect size, given by the simple equation below:

\begin{equation*} \left (ES \right) = \left( \ \frac{M1 - M2}{\sqrt{\frac{(SD1^2*n1)+(SD2^2*n2)}{(n1+n2)}}} \right) (1) \end{equation*}

where M, SD and n are the mean, standard deviation and sample size of each distribution 1 & 2, respectively.

Note that very often, standard error of the mean (SEM) is provided in the graphical representation of data. You can then convert to the SD by using the following formula:

\begin{equation*} \left (SD \right) = \left(SEM * \sqrt{n}\right) (2) \end{equation*}

Assume that the second study you screened used the same group at two different time points (e.g., before and after treatment). Here, you are be dealing with a within-subject design instead of a between-subject one. In this case, you could compute the ES using the following equation:

\begin{equation*} \left (ES \right) = \left( \ \frac{M1_{t+i} - M_t}{\sqrt{SD^2_{t+i}+SD2^2_t}} \right) (3) \end{equation*}

where Mt is the mean initial measurement (usually baseline), M(t+i) is the measurement at a second time point, SDt is the standard deviation of the distribution during the initial measurement, SD(t+i) is the standard deviation of the distribution at the second measurement point, and N represents the sample size of the group

Standardization of effect sizes

In order to bring all measures to the same metric and to ease interpretation, the effect sizes can be transformed to correlation coefficient, called r, following this equation:

\begin{equation*} \left (r \right) = \left( \ \frac{ES}{\sqrt{ES^2+4}} \right) (4) \end{equation*}

In some cases, the relevant statistics are actually provided in the paper, in which case, the following equations can be used:

\begin{equation*} \left (t-statistics; r \right) = \left( \frac{t^2}{\sqrt{t^2+df}} \right) (5) \end{equation*}\begin{equation*} \left (F-statistics; r \right) = \left( \sqrt{\frac{F}{F+df}} \right) (6) \end{equation*}

In some cases (typically in old publications), no statistical information is available nor graphical representation of data, and only p-values are reported. In this case, you could use z-scores if you have p-values reported, using the following equation:

\begin{equation*} \left (r \right) = \left( \frac{z}{\sqrt{N}} \right) (7) \end{equation*}

where z is the z-score value and N is the sample size. You can extract z-scores from z-scpre tables like this one. Remember though, p-values are dependent on group size and should be evaluated cautiously.

Combining effect sizes and comparisons

At some point, we get in troubles because the value of r becomes increasingly skewed as it gets further from 0. To compensate that effect, we can normalize the effect sizes using Fisher transformation described here, following this equation:

\begin{equation*} \left (Z_r \right) = \left (0.5 * Ln (\frac{1+r}{1-r}\right) (8) \end{equation*}

where r is the effect size computed through the methods described above. By convention, $Z_r$ can be converted back to r for ease of interpretation.

If you are dealing with low sample sizes, which can create some bias in your analysis (< 20 or 10 in each group, see Nakagawa & Cuthill, 2007, we computed the unbiased $Z_r$ (zru) value using the equation proposed by Hedges & Olkin, 1985:

\begin{equation*} \left (Z_{ru} \right) = \left (Z_r * (1-\frac{3}{4*(n1+n2)}-9)\right) (9) \end{equation*}

Random and Fix Effect Models

When conducting meta-analytic approaches, it is necessary to use either a fixed effect or a random effects statistical model. A fixed effect model assumes that all effect sizes are measuring the same effect, whereas a random effects model takes into account potential variance in the between-studies effect. Since the chosen model affects the interpretation of the summary estimates, one can test which model to use by conducting a heterogeneity test, that generates the Q-statistic described in eq. 14. The Q value is a measure of the dispersion of the effect sizes. This measure follows the chi square distribution with k-1 degrees of freedom, where k is the total number of effect sizes. This model assumes that the variance of each effect size ($v_i$, eq. 10) is composed of variance due to intrinsic sampling errors ($v_0$, eq. 11 & 12), plus other sources of randomly distributed variability ($v_r$, eq 12). To estimate these values, you can use the formulas 10 through 15 thoroughly described by (Lipsey & Wilson, 2001; Nakagawa & Cuthill, 2007):

\begin{equation*} \left (v_i \right) = \left (v_0 + v_r \right) (10) \end{equation*}
\begin{equation*} \left (v_0 \right) = \left (SE^2 \right) (11) \end{equation*}
\begin{equation*} \left (SE \right) = \left (\frac{1}{\sqrt{n-3}}\right) (12) \end{equation*}
\begin{equation*} \left (v_r \right) = \left (\frac{Q-(k-1)}{(\sum w_i - (\sum w_i^2/\sum w_i)} \right) (13) \end{equation*}
\begin{equation*} \left (Q \right) = \left (\sum w_i * Z_{ru_i}^2 - \frac{(\sum w_i * Z_{ru_i})^2}{\sum w_i} \right) (14) \end{equation*}
\begin{equation*} \left (w_i \right) = \left (\frac{1}{SE^2} \right) (15) \end{equation*}

To complement the Q statistic, one can provide a I^2 statistic using eq. 16, which measures the percent of variance between studies, which is due to true heterogeneity rather than chance.

\begin{equation*} \left (I^2 \right) = \left (\frac{Q-df}{Q}\right) (16) \end{equation*}

where Q is calculated using eq. 14 and df is the number of effect sizes minus one, with higher percent values indicating higher heterogeneity. Typically, the I^2 will be expressed in percent, and can have a p-value associated to it, which, if significant, would indicate a substantial amount of heterogeneity and giving further support for a random model effects.

Calculating confidence intervals for effect sizes

For each variable and its different levels, one can calculated the mean effect size, 95% confidence intervals (CI) and zscore value using eqs. 16, 17, 18 and 19. See Nakagawa & Cuthill, 2007 for more information on this topic, and note that there are other proposed methods to compare these values.

\begin{equation*} \left (\overline{Z_ru} \right) = \left (\frac{(\sum w_i * Z_{ru_i})}{(\sum w_i}\right) (17) \end{equation*}
\begin{equation*} \left (\ 95 \%CI\right) = \left (\overline{Z_ru} \pm 1.96*(SE_{Z_{ru}})\right) (18) \end{equation*}
\begin{equation*} \left (\ SE_{\overline{Z_{ru}}}\right) = \left (\sqrt{\frac{\overline{Z_ru}}{\overline{SE_Zru}}}\right) (19) \end{equation*}
\begin{equation*} \left (\overline{SE}_Zru\right) = \left (\sqrt{\frac{1}{\sum w_i}}\right) (20) \end{equation*}

Additional side computation for meta-analysis

There are other analysis that one might want to do when running a meta-analysis. In particular, file drawer analysis (that allows to estimate how many more effect size one might require to abolish any overall meta-analytic effect found) and funnel plots (that allow to assess publication bias) could be of interest. I won't be talking about these, but you could look at JASP if you think of running some of these on your data.

Dataset:

In [1]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math

On our project, we were interested to find which brain regions showed activation when individuals witnessed emotions in others (so-called emotional contagion, i.e., first building block of empathy, see here. I did this work together with Dr. Maria Carrillo, check out some of her work here.
The data was collected manually in most of the cases, because researchers are not always good at reporting all the necessary statistics to extract effect sizes. That mean that we had to go to the papers, with a ruler and a pencil, and measure the actual distribution's means and standard deviations. Fun at first, very fast very annoying... There are some good softwares that help you collect the data from pdf formats though, like WebPlotDigitizer.

Within a year of regular meetings and data compilation, we managed to screen and extract brain activation data (among other data, not discussed here) in approximately 150 research papers (of a total of 3000 papers screened).

In [2]:
# Set path
# Ubuntu
work_dir = "/home/julien/Documents/Projects/datasets/meta_analysis"
# Windows
#work_dir = "C:\\Users\hernandez\Dropbox (Personal)\Python\meta_analysis"
os.chdir(work_dir)
# Verify the path
cwd = os.getcwd() 
# Print the current directory 
print("Current working directory is:", cwd) 
Current working directory is: /home/julien/Documents/Projects/datasets/meta_analysis
In [3]:
files = os.listdir(cwd)
df = pd.read_excel(io = files[files.index("Effect_size_pooled_c-fos_v2.xlsx")], 
                   sheet_name ='c-fos_only',
                   header     = 1,
                   index_col  = None)

# lower case to avoid later conflicts
df.columns = [c.lower() for c in df.columns]
# display columns
df.shape
Out[3]:
(144, 66)

What we have is a series of data features for each independent experimental group that we screened in the literature.
I will hard code the specie from which the data was collected, which will be our main spliting variable in this analysis. Indeed, we ended up with a relatively small dataset, which is not enough to look at how the other variables might have influenced brain activity.

In [4]:
# Create dictionnary of the species encoded
specie_dict ={0 : 'rat', 1 : 'mouse'}
# Map the binary coding onto the dictionnary
df['specie_str'] = df['specie '].map(specie_dict)

Then, I select the data of interest:

In [5]:
# Select data of interest
DOI = df[['study id','specie ', 
          'specie_str', 
          'structure / tissue', 
          'structure_renamed',              
          'standardized es',    
          'standardized es se', 
          'weight factor', 
          'n']]
In [9]:
DOI.head(10)
Out[9]:
study id specie specie_str structure / tissue structure_renamed standardized es standardized es se weight factor n
0 2 0 rat ACC ACC 0.927872 0.235702 18 21
1 2 0 rat LA Amygd_LA 0.927872 0.235702 18 21
2 2 0 rat HPC HPC 0.300000 0.235702 18 21
3 2 0 rat ACC ACC 0.678397 0.258199 15 18
4 2 0 rat HPC HPC 0.089124 0.408248 6 9
5 77 0 rat LA Amygd_LA 0.768121 0.447214 5 8
6 77 0 rat CeM Amygd_CeM 0.308701 0.447214 5 8
7 77 0 rat CeL Amygd_CeL 0.344366 0.447214 5 8
8 77 0 rat PrL PrL 0.455106 0.447214 5 8
9 77 0 rat IL IL -0.372895 0.447214 5 8

Below, a description of the columns:
study_id: single identifier of the study from which the data was extracted
specie: binary informing on the specie from which the brain activity was recorded
specie_str: binary string information informing on the specie from which the brain activity was recorded
structure / tissue: Brain structure from which the activity was recorded
structure_renamed: Renamed structure
standardized es: standardized effect size
standardized es se: standard error of the mean attached to the effect size
weight factor: weight facto
n: number of animals from which the data was collected

Now we are interested in computing the effect size, per specie and brain structure.
Let's look at how many data points we would have per category

In [11]:
(DOI
.groupby(['specie_str', 'structure_renamed'])
.count()
.sort_values('n', ascending=False)
)
Out[11]:
study id specie structure / tissue standardized es standardized es se weight factor n
specie_str structure_renamed
rat Amygd_LA 14 14 14 14 14 14 14
Amygd_CeA 8 8 8 8 8 8 8
PAG 8 8 8 8 8 8 8
IL 7 7 7 7 7 7 7
PVN 7 7 7 7 7 7 7
Amygd_BLA 6 6 6 6 6 6 6
Amygd_BA 6 6 6 6 6 6 6
PrL 6 6 6 6 6 6 6
Hypothalamus 6 6 6 6 6 6 6
Amygd_MeA 5 5 5 5 5 5 5
Amygd_CoA 5 5 5 5 5 5 5
Amygd_CeM 5 5 5 5 5 5 5
ACC 5 5 5 5 5 5 5
NaCC 4 4 4 4 4 4 4
mouse Amygd_BLA 4 4 4 4 4 4 4
rat INS 3 3 3 3 3 3 3
mouse HPC 3 3 3 3 3 3 3
Amygd_LA 3 3 3 3 3 3 3
rat Amygd_CeL 3 3 3 3 3 3 3
BNST 3 3 3 3 3 3 3
BMA 3 3 3 3 3 3 3
PFC 3 3 3 3 3 3 3
HPC 2 2 2 2 2 2 2
Septum 2 2 2 2 2 2 2
mouse Amygd_CeL 2 2 2 2 2 2 2
Amygd_CeM 2 2 2 2 2 2 2
Amygd_CoA 2 2 2 2 2 2 2
Amygd_MeA 2 2 2 2 2 2 2
IL 2 2 2 2 2 2 2
PrL 2 2 2 2 2 2 2
Septum 1 1 1 1 1 1 1
rat LPO 1 1 1 1 1 1 1
mouse Amygd_CeA 1 1 1 1 1 1 1
Hypothalamus 1 1 1 1 1 1 1
BNST 1 1 1 1 1 1 1
INS 1 1 1 1 1 1 1
NaCC 1 1 1 1 1 1 1
PAG 1 1 1 1 1 1 1
PFC 1 1 1 1 1 1 1
PVN 1 1 1 1 1 1 1
ACC 1 1 1 1 1 1 1

That's quite low for many structures (n of 1 will be hard to make meta-analytics...), but we can work with some of these structures that have more 2 or 3 effect sizes.
Seems though like there are much more effect sizes for rats than mice:

In [6]:
(DOI
.groupby(['specie_str'])
.count()['study id']
)
Out[6]:
specie_str
mouse     32
rat      112
Name: study id, dtype: int64

Yup. That is something typical in research literature though. Some models are widely used, leaving some areas of the knowledge unpopuplated. But still, we can move on and see what we get.

Data Transformation:

As explained in the methods, there are different steps needed in a meta-analytic process.

Below, I implemented these steps in python:

In [15]:
# This function will add a few computations to the dataset.
# Computations are self explanatory, and equations are in methods
def compute_meta_data(dataf):
    return (dataf
            .assign(vi_SE2           = lambda d:  1/d['weight factor']) # Equation 15
            .assign(standardized_ESU = lambda d:  d['standardized es'] 
                    * (1-(3/(4*(d['n'])-9)))) # Equation 9
            .assign(wES              = lambda d:  d['standardized es'] 
                    * d['weight factor'])
            .assign(wES2             = lambda d:  d['weight factor']   
                    * d['standardized es'] * d['standardized es']) 
            .assign(w2               = lambda d:  d['weight factor']   
                    * d['weight factor']))
In [16]:
# This function will compute, for each specie&structure, the random effect parameters
def compute_random_effects(dataf):
    return (dataf
            .groupby(['specie_str','structure_renamed'])
            .apply(lambda d: pd.Series({  
            "sum_wi":          d['weight factor'].sum(),
            "sum_wi2":         d['w2'].sum(),
            "k":               d['w2'].count(), #num studies
            "wxES":            d['wES'].sum(),
            "wxES2":           d['wES2'].sum()     
            })))
In [17]:
# Compute the Q and v0 parameters (see methods)
def compute_random_variance(dataf):
    return (dataf
            .assign(Q  = lambda d:  abs(d['wxES2']-(d['wxES']**2/d['sum_wi']))) 
            # For struture with only 1 effect size, the denominator of v0 will 
            # automatically be equal to 0. hence v0 will not be computed.
            .assign(v0 = lambda d:  (d['Q']-(d['k']-1))/(d['sum_wi']-
                                                        (d['sum_wi2']/d['sum_wi'])))
           )
In [18]:
# Turn v0 to 0 if negative (missing reference!)
def zero_if_negative(dataf):
        num = dataf['v0']
        num[num < 0] = 0
        dataf['v0'] = num
        return dataf
In [19]:
# Compute corrected parameters (see methods)
def apply_corrections(dataf):
    return (dataf
            .assign(v0_plus_vi       = lambda d:  d['vi_SE2']+d['v0'])  
            .assign(wi_corr          = lambda d:  1/d['v0_plus_vi'])  
            .assign(wxES_corr        = lambda d:  d['wi_corr']*d['standardized_ESU'])  
            .assign(wxESsq_corr      = lambda d:  d['wi_corr']*(d['standardized_ESU']**2))
            .assign(SE_corr          = lambda d:  (1/(d['wi_corr'])**0.5))  
            .fillna(0) # NaN are replaced by 0
            .replace(np.inf, 0)
)
In [20]:
def calculate_constants(dataf):
    return (dataf
            .groupby(['specie_str', 'structure_renamed'])
            .apply(lambda d: pd.Series({  
                        "sum_wxES_corr":        d['wxES_corr'].sum(),
                        "sum_wxESsq_corr":      d['wxESsq_corr'].sum(), #num studies                  
                        "sum_wi_corr":          abs(d['wi_corr']).sum(), 
             })) 
             .fillna(0) # NaN are replaced by 0
             .replace(np.inf, 0)
           )

The function below computes the $I^2$ value, ie, percent of variation due to heterogeneity rather than chance. By convention, Q = 0 if Q < k-1, so that the precision of a random effects summary estimate will not exceed the precision of a fixed effect summary estimate, see Higgins & Thompson 2002

In [21]:
def I2(Q,dfg):
#     if pd.isnull((Q < dfg)):
#         I2 = 0
#         return I2
#     else:
        I2=((Q-dfg)/Q)*100
        return I2
In [22]:
def calculate_mean_se(dataf):
    return (dataf
            .assign(ES_mean       = lambda d: d['sum_wxES_corr']/d['sum_wi_corr']) 
            .assign(SE_mean       = lambda d: (1/d['sum_wi_corr'])**0.5)
            .assign(z             = lambda d: d['ES_mean']/d['SE_mean'])
            .assign(high_CI       = lambda d: d['ES_mean']+(1.96*d['SE_mean']))
            .assign(low_CI        = lambda d: d['ES_mean']-(1.96*d['SE_mean']))                                              
           # .assign(k_val         = lambda d: d['w2'].count())#num studies  
            .assign(I_val         = lambda d: I2(d['Q'],d['k']-1))
           )
In [23]:
def calculate_number_es(dataf):
    return (dataf
            .groupby(['specie_str', 'structure_renamed'])
            .apply(lambda d: pd.Series({  
                        "k_val": d['w2'].count(),
             })) 
           )
In [24]:
# Compute & add meta variables to dataset
cfos_df = DOI.pipe(compute_meta_data)
# Compute & add random effect parameters
rand_df = (cfos_df
.pipe(compute_random_effects)
.pipe(compute_random_variance)
.pipe(zero_if_negative)
)
# Merge datasets
cfos_ana = pd.merge(cfos_df, rand_df, on=['structure_renamed','specie_str'])
cfos_ana = apply_corrections(cfos_ana)
# Compute constants for each structure
constants = calculate_constants(cfos_ana)
cfos_ana = pd.merge(cfos_ana, constants, on=['structure_renamed','specie_str'])
# Compute mean & se for each structure
cfos_ana = (cfos_ana
.pipe(calculate_mean_se))
# Calculate number of ES per structure
nES = (cfos_ana
.pipe(calculate_number_es)           
)
cfos_ana = pd.merge(cfos_ana, nES, on=['structure_renamed','specie_str'])

# Sample dataframe
cfos_ana.head(3)
Out[24]:
study id specie specie_str structure / tissue structure_renamed standardized es standardized es se weight factor n vi_SE2 ... sum_wxES_corr sum_wxESsq_corr sum_wi_corr ES_mean SE_mean z high_CI low_CI I_val k_val
0 2 0 rat ACC ACC 0.927872 0.235702 18 21 0.055556 ... 18.885442 15.12257 32.517006 0.580787 0.175366 3.311859 0.924503 0.23707 40.641371 5
1 2 0 rat ACC ACC 0.678397 0.258199 15 18 0.066667 ... 18.885442 15.12257 32.517006 0.580787 0.175366 3.311859 0.924503 0.23707 40.641371 5
2 207 0 rat ACC ACC 0.744927 0.242536 17 20 0.058824 ... 18.885442 15.12257 32.517006 0.580787 0.175366 3.311859 0.924503 0.23707 40.641371 5

3 rows × 36 columns

In [26]:
# By convention, Q = 0 if Q < k-1, so that the precision of a random effects summary estimate will not exceed the precision of a fixed effect summary estimate
cfos_ana.loc[cfos_ana['Q'] < (cfos_ana['k_val']-1), 'I_val'] = 0 
In [31]:
# Prepare data for plotting
to_plot = (cfos_ana
#.groupby(['specie_str'])
.drop_duplicates(subset =["specie_str", "structure_renamed"], inplace = False)
.dropna()
#.groupby(['specie_str', 'structure_renamed'])
#.head(20)
)
to_plot = to_plot[['specie_str', 
                   'structure_renamed',
                   'ES_mean', 
                   'low_CI',
                   'high_CI', 
                   'z', 
                   'I_val', 
                   'k_val']]
# Visualize higher effect sizes per species
(to_plot
.groupby(["specie_str"])
.apply(lambda x: x.sort_values(["ES_mean"], ascending = False))
.reset_index(drop=True)
)
Out[31]:
specie_str structure_renamed ES_mean low_CI high_CI z I_val k_val
0 mouse Amygd_BLA 0.305320 -0.012634 0.623274 1.882120 0.000000 4
1 mouse IL 0.250577 -0.273255 0.774409 0.937572 0.000000 2
2 mouse PrL 0.227964 -0.570073 1.026001 0.559887 56.913815 2
3 mouse Amygd_CeL 0.111359 -0.412473 0.635191 0.416668 0.000000 2
4 mouse Amygd_CeM 0.045174 -0.478658 0.569006 0.169027 0.000000 2
5 mouse Amygd_MeA 0.025072 -0.498760 0.548904 0.093812 0.000000 2
6 mouse Amygd_CoA 0.004438 -0.569050 0.577927 0.015169 16.567528 2
7 mouse HPC -0.131467 -0.559174 0.296240 -0.602458 0.000000 3
8 mouse Amygd_LA -0.177468 -0.605175 0.250240 -0.813258 0.000000 3
9 rat ACC 0.580787 0.237070 0.924503 3.311859 40.641371 5
10 rat PrL 0.375268 0.072834 0.677703 2.432017 0.000000 6
11 rat Amygd_LA 0.352532 0.008677 0.696386 2.009459 75.981846 14
12 rat Amygd_BLA 0.350253 -0.002779 0.703284 1.944572 39.774832 6
13 rat NaCC 0.318760 -0.054253 0.691774 1.674926 25.287225 4
14 rat Amygd_BA 0.268681 -0.235761 0.773122 1.043955 77.883006 6
15 rat Amygd_CoA 0.262060 -0.159181 0.683300 1.219345 38.222540 5
16 rat Amygd_MeA 0.237278 -0.247329 0.721886 0.959675 52.759582 5
17 rat HPC 0.235805 -0.164278 0.635889 1.155206 0.000000 2
18 rat PFC 0.218830 -0.095022 0.532681 1.366591 0.000000 3
19 rat Amygd_CeA 0.211875 0.017806 0.405944 2.139829 0.000000 8
20 rat IL 0.188953 -0.051107 0.429012 1.542728 7.079613 7
21 rat Septum 0.174726 -0.195679 0.545131 0.924562 0.000000 2
22 rat BNST 0.174373 -0.117806 0.466553 1.169732 0.000000 3
23 rat Hypothalamus 0.150351 -0.124104 0.424806 1.073719 0.000000 6
24 rat PVN 0.083937 -0.313650 0.481523 0.413786 75.792369 7
25 rat BMA 0.058721 -0.882053 0.999495 0.122339 71.063206 3
26 rat PAG 0.041619 -0.290724 0.373962 0.245449 54.145197 8
27 rat Amygd_CeL 0.020364 -0.485705 0.526434 0.078871 0.000000 3
28 rat INS -0.022349 -0.528419 0.483720 -0.086559 0.000000 3
29 rat Amygd_CeM -0.077602 -0.469602 0.314398 -0.388008 0.000000 5
In [32]:
#import plotnine as p9
import plotnine as p9
from plotnine import ggplot, geom_point, aes, theme, element_text, save_as_pdf_pages
p9.options.figure_size = (6.4, 4.8)
In [39]:
# Plot the effect sizes for each structure, separately for each specie
(p9.ggplot() +
p9.geom_point(data = cfos_ana, 
              mapping = p9.aes('structure_renamed', 
                               'k_val', 
                               color='ES_mean', 
                               size='ES_mean'
                              )
             ) +
p9.facet_wrap("specie_str") + 
theme(axis_text_x=element_text(rotation=90, hjust=1))
)
/home/julien/anaconda3/lib/python3.7/site-packages/plotnine/scales/scale.py:93: MatplotlibDeprecationWarning: 
The iterable function was deprecated in Matplotlib 3.1 and will be removed in 3.3. Use np.iterable instead.
  if cbook.iterable(self.breaks) and cbook.iterable(self.labels):
/home/julien/anaconda3/lib/python3.7/site-packages/mizani/bounds.py:352: RuntimeWarning: invalid value encountered in less
  outside = (x < range[0]) | (x > range[1])
/home/julien/anaconda3/lib/python3.7/site-packages/mizani/bounds.py:352: RuntimeWarning: invalid value encountered in greater
  outside = (x < range[0]) | (x > range[1])
/home/julien/anaconda3/lib/python3.7/site-packages/plotnine/layer.py:449: UserWarning: geom_point : Removed 11 rows containing missing values.
  self.data = self.geom.handle_na(self.data)
/home/julien/anaconda3/lib/python3.7/site-packages/plotnine/utils.py:553: MatplotlibDeprecationWarning: 
The iterable function was deprecated in Matplotlib 3.1 and will be removed in 3.3. Use np.iterable instead.
  return cbook.iterable(var) and not is_string(var)
Out[39]:
<ggplot: (8754481162689)>

Thats's about it! In this notebook, I tried to summarize some equations and practices one might use when running meta-analysis. This work was my own doing, and I do not ensure that all mathemathical implementations are 100% correct, although I reference the papers from which the equations were taken from whenever I can.

I hope you enjoyed it, and do not hesitate in contacting me if you have questions, commentaries, or spot an issue or some room for improvement!

The functions described in this post can be found in the following repo: https://github.com/juls-dotcom/meta_analysis

Cheers!