Prepare data
Prepare anndata for training the model
- utils.get_training_data(remove_clusters=None, cells_per_cluster=100, cluster_column='clusters', min_shared_counts=10, n_var_genes=2000)
Reduces an AnnData object to the cells and genes most relevant for understanding the differentiation trajectories in the data.
- Parameters:
adata – AnnData Object
remove_clusters – Names of clusters to be removed
cells_per_cluster – How many cells to keep per cluster. For Louvain clustering with resolution = 1, keeping more than 300 cells per cluster does not provide much extra information.
cluster_column – Name of the column in adata.obs that contains cluster names
min_shared_counts – Minimum number of spliced+unspliced counts across all cells for a gene to be retained
n_var_genes – Number of top variable genes to retain
- Returns:
AnnData object reduced to the most informative cells and genes
- Return type:
AnnData
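
The effect of remove_clusters and cells_per_cluster can be illustrated with a small standalone sketch. This is not the library's implementation: the helper subsample_cells, its seed argument, and the toy labels are hypothetical. It assumes that clusters larger than the cap are randomly downsampled and that smaller clusters are kept whole.

```python
import random
from collections import defaultdict


def subsample_cells(labels, cells_per_cluster=100, remove_clusters=None, seed=0):
    """Return sorted indices of cells to keep (hypothetical helper, not the library's code).

    Assumption: clusters with more than `cells_per_cluster` cells are randomly
    downsampled to that size; smaller clusters are kept in full.
    """
    removed = set(remove_clusters or [])
    rng = random.Random(seed)

    # Group cell indices by cluster label, skipping clusters marked for removal.
    by_cluster = defaultdict(list)
    for idx, label in enumerate(labels):
        if label not in removed:
            by_cluster[label].append(idx)

    keep = []
    for indices in by_cluster.values():
        if len(indices) > cells_per_cluster:
            indices = rng.sample(indices, cells_per_cluster)
        keep.extend(indices)
    return sorted(keep)


# Toy cluster labels standing in for a column of adata.obs.
labels = ["A"] * 250 + ["B"] * 40 + ["C"] * 120
kept = subsample_cells(labels, cells_per_cluster=100, remove_clusters=["C"])
```

Here cluster A is cut down to the cap, cluster B is kept whole because it is already below the cap, and cluster C is dropped entirely. With a real AnnData object, indices like these could then be used to subset the cells, for example adata[kept].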