MDE Application to Fly Neural Data

Drosophila melanogaster

A fly expressing the calcium indicator GCaMP6f as a measure of neuronal activity was recorded walking on a Styrofoam ball. Neuronal activity was segmented by independent component analysis (ICA) yielding 80 component brain areas. Two behavioral variables were simultaneously recorded: forward speed (FWD) and left/right turning speed (Left Right).

Find MDE of neural activity to forward movement.

Import modules

from pandas     import read_csv, DataFrame
from matplotlib import pyplot as plt
%matplotlib ipympl

These modules are required for theEvaluate()function to compare MDE results.

import numpy
import sklearn
import pydiffmap

Import MDE & Evaluate classes

from dimx import MDE, Evaluate

Read Drosophila data

df = read_csv( './data/Fly80XY_norm_1061.csv' )

Instantiate and Run MDE class objects for FWD & Left_Right targets

We use the first 300 time series rows to create the EDM library, and perform out-of-sample prediction on time series rows 301-600. These indices are Not zero offset.

ccmSlope is the minimum slope of a linear fit to the CCM rho(L) curve to validate a causal driver. L is the vector of CCM library sizes at which CCM is evaluated. Default values for L are percentiles [10,15,85,90] of the number of observations (rows).

Fly_FWD = MDE( df,                     # Pandas DataFrame of observables
               target = 'FWD',         # target behavior variable
               removeColumns = ['index','FWD','Left_Right'], # variables to ignore
               D    = 12,              # Max number of dimensions
               lib  = [1,300],         # EDM library start,stop indices
               pred = [301,600],       # EDM prediction start,stop indices
               ccmSlope = 0.01,        # CCM convergence criteria
               embedDimRhoMin = 0.65,  # Minimum rho for CCM embedding dimension
               crossMapRhoMin = 0.5,   # Minumum rho for cross map of target : variables
               cores = 10,             # Number of cores in CrossMapColumns()
               chunksize = 30,         # multiprocessing pool chunksize
               plot = False )
Fly_FWD.Run()
Fly_LR = MDE( df,
              target = 'Left_Right',
              removeColumns = ['index','FWD','Left_Right'], 
              D    = 12,
              lib  = [1,600],
              pred = [801,1000], 
              ccmSlope = 0.01,
              embedDimRhoMin = 0.2,
              crossMapRhoMin = 0.05,
              cores = 10,
              chunksize = 30,
              plot = False )
Fly_LR.Run()

Combine FWD & Left_Right results into one DataFrame and plot

FlyDF = DataFrame( 
    {'D' : [i for i in range(1, len(Fly_FWD.MDEOut)+1)],
     'FWD_variable':Fly_FWD.MDEOut['variables'], 'FWD_rho':Fly_FWD.MDEOut['rho'], 
     'LR_variable' :Fly_LR.MDEOut['variables'],  'LR_rho' :Fly_LR.MDEOut['rho'] } )
FlyDF
D FWD_variable FWD_rho LR_variable LR_rho
0 1 TS33 0.652844 TS26 0.197529
1 2 TS4 0.792290 TS67 0.568969
2 3 TS8 0.819050 TS69 0.599104
3 4 TS9 0.839394 TS65 0.685190
4 5 TS32 0.859089 TS71 0.686665
5 6 TS24 0.860191 TS60 0.706212
6 7 TS26 0.869044 TS53 0.769066
7 8 TS71 0.871098 TS79 0.770243
8 9 TS16 0.864156 TS76 0.745978
9 10 TS44 0.860249 TS57 0.726474
10 11 TS23 0.856260 TS5 0.625226
11 12 TS20 0.851592 TS39 0.659283
ax = FlyDF.plot( x = 'D', y=['FWD_rho','LR_rho'], label = ['FWD','Left_Right'],
                 lw = 3, figsize = (5,4), title = 'Fly MDE' )
table = ax.table( cellText = FlyDF.loc[:,['D','FWD_variable','LR_variable']].values,
                  colLabels = ['D','FWD','LR'], loc = 'lower center', fontsize = 10,
                  colWidths = [0.1,0.12,0.12], edges = 'open', colLoc = 'left', cellLoc = 'left' )
Figure
No description has been provided for this image

MDE indicates forward motion can be represented with a 5-dimensional embedding of observation vectors TS33, TS4, TS8, TS9, TS32. These 5 variables achieve a Pearson correlation of 0.86 between observed and predicted forward movement.


Evaluate MDE components & compare to PCA & Diffusion Map

Here we compare out-of-sample prediction of FWD behavior with the 5 MDE identified observables as well as 5 component PCA and Diffusion Map.

Fly_FWD_Eval = Evaluate( df, 
                         columns_range = [1,81], # 0-offset range of columns for PCA, DMap
                         mde_columns = ['TS33', 'TS4', 'TS8', 'TS9', 'TS32'],
                         predictVar = 'FWD',
                         library    = [1, 300],   # index start,stop of observations for library
                         prediction = [301, 600], # index start,stop of predictions 
                         components = 5,          # Number of PCA & DMap components
                         dmap_k     = 15,         # diffusion_map k nearest neighbors
                         figsize    = (7,6) )
Fly_FWD_Eval.Run()
Fly_FWD_Eval.Plot()
  MDE:  CAE 35.37  RMSE 0.183  R 0.85  R^2 0.722
    PCA:  CAE 53.25  RMSE 0.212  R 0.802  R^2 0.643
    dMap: CAE 47.24  RMSE 0.219  R 0.77  R^2 0.593

Figure
No description has been provided for this image

The MDE prediction has the lowest CAE (cumulative absolute error) to the out-of-sample observations. The diffusion map compoents are latent (not observable) and do not correspond in an obvious way to observed neural dynamics. The PCA prediction lumps the majority of the variance into a single component based on a linear decomposition. Both PCA and diffusion map predict activity during times when no FWD movement is observed, while MDE does not. Crucially, MDE predictions are not latent, but actual observables of the system.