CytOpT documentation

CytOpT is a new package that implements an algorithm using regularized optimal transport to directly estimate the proportions of the different cell populations in a biological sample characterized by flow cytometry measurements. The algorithm relies on the regularized Wasserstein metric to compare cytometry measurements from different samples, which accounts for possible misalignment of a given cell population across samples (due to technical variability in the measurement technology).

Approach

CytOpT is a supervised learning technique based on the Wasserstein metric. It estimates an optimal re-weighting of the class proportions in a mixture model built from a source distribution (with known segmentation into cell sub-populations) so that the re-weighted mixture fits a target distribution whose segmentation is unknown.
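The re-weighting idea can be illustrated with a minimal sketch. It assumes that each source cell of a given class receives an equal share of the mass assigned to that class; the function name reweight_source and the toy data are hypothetical and are not part of the CytOpT API.

```python
from collections import Counter


def reweight_source(labels, proportions):
    """Return one weight per source cell so that class k carries mass proportions[k].

    labels: class label of each source cell (the known segmentation).
    proportions: dict mapping each class label to a candidate proportion (sums to 1).
    """
    counts = Counter(labels)
    # Every cell of class k gets an equal share of the mass given to class k.
    return [proportions[lab] / counts[lab] for lab in labels]


# Hypothetical toy data: 6 source cells split into 2 classes.
source_labels = [1, 1, 1, 1, 2, 2]
candidate = {1: 0.25, 2: 0.75}  # a candidate re-weighting of the two classes

weights = reweight_source(source_labels, candidate)
print(weights)  # [0.0625, 0.0625, 0.0625, 0.0625, 0.375, 0.375]
```

In this picture, estimating the class proportions amounts to searching for the candidate proportions whose re-weighted source distribution is closest to the target distribution.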

Description

The algorithm, referred to as CytOpT, uses regularized optimal transport to directly estimate the proportions of the different cell populations in a biological sample characterized by flow cytometry measurements. It relies on the regularized Wasserstein metric to compare cytometry measurements from different samples, which accounts for possible misalignment of a given cell population across samples (due to technical variability in the measurement technology).

Overview

The methods implemented in this package are detailed in the following article: Paul Freulon, Jérémie Bigot, Boris P. Hejblum. CytOpT: Optimal Transport with Domain Adaptation for Interpreting Flow Cytometry data. https://arxiv.org/abs/2006.09003

The project homepage is https://github.com/sistm/CytOpt-python.

See README for the installation instructions: https://github.com/sistm/CytOpt-python/blob/main/README.md

CytOpT

Overview

CytOpT is a Python package that provides a new algorithm relying on regularized optimal transport to directly estimate the proportions of the different cell populations in a biological sample characterized by flow cytometry measurements. The algorithm is based on the regularized Wasserstein metric to compare cytometry measurements from different samples, which accounts for possible misalignment of a given cell population across samples (due to technical variability in the measurement technology).

The main function of the package is CytOpT().

The methods implemented in this package are detailed in the following article: https://arxiv.org/abs/2006.09003. The source code of the package is available on GitHub.

Getting started

Install the CytOpT package from PyPI as follows:

pip install -r requirements.txt

pip install CytOpT # pip3 install CytOpT

Packages:

import numpy as np
import pandas as pd
from CytOpT import CytOpt as cytopt

Preparing data:

# Source Data
Stanford1A_values = pd.read_csv('./tests/data/W2_1_values.csv',
                                usecols=np.arange(1, 8))
Stanford1A_clust = pd.read_csv('./tests/data/W2_1_clust.csv',
                               usecols=[1])

# Target Data
Stanford3A_values = pd.read_csv('./tests/data/W2_7_values.csv',
                                usecols=np.arange(1, 8))
Stanford3A_clust = pd.read_csv('./tests/data/W2_7_clust.csv',
                               usecols=[1])

xSource = np.asarray(Stanford1A_values)
xTarget = np.asarray(Stanford3A_values)
labSource = np.asarray(Stanford1A_clust['x'])
labTarget = np.asarray(Stanford3A_clust['x'])

thetaTrue = np.zeros(10)
for k in range(10):
    thetaTrue[k] = np.sum(labTarget == k + 1) / len(labTarget)

# Initialization of parameters
nItGrad = 5000
nIter = 5000
nItSto = 10
pas_grad = 10
eps = 0.0005
monitoring = True

# Run both methods (Minmax and Desasc)
res = cytopt.CytOpT(xSource, xTarget, labSource, thetaTrue=thetaTrue,
                    method="both", nItGrad=nItGrad, nIter=nIter, nItSto=nItSto,
                    stepGrad=pas_grad, eps=eps, monitoring=monitoring)

# CytOpT Desasc with default params
cytopt.CytOpT(xSource, xTarget, labSource, thetaTrue=thetaTrue, method='desasc')

# CytOpT Minmax with default params
cytopt.CytOpT(xSource, xTarget, labSource, thetaTrue=thetaTrue, method='minmax')

KLPlot:
  • Display a Bland-Altman plot to visually assess the agreement between the CytOpT estimates of the class proportions and the class proportions obtained through manual gating.

barPlot:
  • Display a Bland-Altman plot to visually assess the agreement between the CytOpT estimates of the class proportions and the class proportions obtained through manual gating.

resultPlot(res, n0=10, nStop=4000)
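The agreement assessment mentioned for the plots can be sketched numerically. A Bland-Altman plot shows, for each class, the difference between two estimates against their mean; the helper bland_altman_points and the proportions below are hypothetical and only illustrate the quantities involved, not the output of the package's plotting functions.

```python
def bland_altman_points(reference, estimate):
    """Return (mean, difference) pairs, one per class, for a Bland-Altman plot."""
    if len(reference) != len(estimate):
        raise ValueError("both proportion vectors must have the same length")
    return [((r + e) / 2, e - r) for r, e in zip(reference, estimate)]


# Hypothetical proportions for 3 cell populations (not real results).
manual_gating = [0.5, 0.25, 0.25]
estimated = [0.5, 0.375, 0.125]

points = bland_altman_points(manual_gating, estimated)
# The average difference is the overall bias between the two sets of proportions.
bias = sum(diff for _, diff in points) / len(points)

print(points)  # [(0.5, 0.0), (0.3125, 0.125), (0.1875, -0.125)]
print(bias)    # 0.0
```

Points close to the horizontal line at zero indicate classes for which the estimate agrees with manual gating.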
