Brenda L. K. Coles, Mahmoud Labib, Mahla Poudineh, Brendan T. Innes, Justin Belair-Hickey, Surath Gomis, Zongjie Wang, Gary D. Bader, Edward H. Sargent, Shana O. Kelley and Derek van der Kooy
Loss of photoreceptors due to retinal degeneration is a major cause of untreatable visual impairment and blindness. Cell replacement therapy, using retinal stem cell (RSC)-derived photoreceptors, holds promise for reconstituting damaged cell populations in the retina. One major obstacle preventing translation to the clinic is the lack of validated markers or strategies to prospectively identify these rare cells in the retina and subsequently enrich them. Here, we introduce a microfluidic platform that combines nickel micromagnets, herringbone structures, and a design enabling varying flow velocities among three compartments to facilitate a highly efficient enrichment of RSCs. In addition, we developed an affinity enrichment strategy based on cell-surface markers that was utilized to isolate RSCs from the adult ciliary epithelium. We showed that targeting a panel of three cell surface markers simultaneously facilitates the enrichment of RSCs to 1:3 relative to unsorted cells. Combining the microfluidic platform with single-cell whole-transcriptome profiling, we successfully identified four differentially expressed cell surface markers that can be targeted simultaneously to yield an unprecedented 1:2 enrichment of RSCs relative to unsorted cells. We also identified transcription factors (TFs) that play functional roles in maintenance, quiescence, and proliferation of RSCs. This level of analysis for the first time identified a spectrum of molecular and functional properties of RSCs.
Please note: scClustViz currently has a bug that causes plots to temporarily fail to load. The work-around is to toggle an input to the plot: for interactive plots such as the first figure below (left), try zooming in or out; for others, switch an input (for example, swap between PCA and tSNE/UMAP in the cell embedding plot). We're sorry for the inconvenience, and are working to find and fix the bug.
Here you can compare the results of clustering at different resolutions to determine the appropriate clustering solution for your data. You can see the cluster solutions represented as boxplots on the left, where each boxplot represents the number of genes differentially expressed between each cluster and its nearest neighbour, or marker genes per cluster. The cluster selected in the pulldown menu is highlighted in red, and the silhouette plot for that cluster is shown on the right. The plot can be zoomed by clicking and dragging to select a region to view, and double-clicking to zoom to it. Double-click again to revert view to default.
A silhouette plot is a horizontal barplot where each bar is a cell, grouped by cluster. The width of each bar represents the difference between the mean distance to cells in the nearest neighbouring cluster and the mean distance to the other cells within the cell's own cluster, scaled by the larger of the two. Distance is Euclidean in reduced dimensional space. Positive silhouettes indicate that a cell sits closer to its own cluster than to the neighbouring one, i.e. good cluster cohesion.
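As a rough illustration of the quantity plotted, the sketch below computes the standard silhouette width for each cell from Euclidean distances. It is not scClustViz's own code: the function name `silhouette_widths` and the toy coordinates are hypothetical, and assigning 0 to cells in singleton clusters is a common convention assumed here.

```python
import math

def silhouette_widths(points, labels):
    """Silhouette width per cell, using Euclidean distance.

    points: list of coordinate tuples (e.g. cells in reduced-dimensional space)
    labels: cluster label of each cell, in the same order
    """
    widths = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Collect distances from cell i to every other cell, grouped by cluster.
        dists = {}
        for j, (q, lab) in enumerate(zip(points, labels)):
            if i != j:
                dists.setdefault(lab, []).append(math.dist(p, q))
        others = [lab for lab in dists if lab != own]
        if own not in dists or not others:
            widths.append(0.0)  # assumed convention: singleton cluster or single cluster
            continue
        a = sum(dists[own]) / len(dists[own])                     # mean within-cluster distance
        b = min(sum(dists[l]) / len(dists[l]) for l in others)    # nearest neighbouring cluster
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths

# Toy example: two well-separated clusters plus one cell labelled "A" that sits next to "B".
cells = [(0, 0), (0, 1), (10, 0), (10, 1), (9, 0)]
clusters = ["A", "A", "B", "B", "A"]
print(silhouette_widths(cells, clusters))
```

In this toy data the last cell gets a negative width, which is the pattern to look for in the plot when a cluster solution splits or merges cells poorly.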
Once you've selected an appropriate cluster solution (we suggest picking one where all nearest neighbouring clusters have differentially expressed genes between them), click View clusters at this resolution to proceed. If you want to save this cluster solution as the default for next time, click Save this resolution as default. All figures can be downloaded by clicking the buttons next to each figure.
Here you can explore your dataset as a whole: cluster assignments for all cells; metadata overlays for cell projections; and figures for comparing both numeric and categorical metadata.
The top two figures show cells projected into 2D space using one of the dimensionality reductions calculated in your data object. For example, tSNE and UMAP place cells in space such that proximity indicates transcriptional similarity. PCA is a common input for clustering and cell embedding, and it's important to ensure components don't strongly correlate with technical features. On the left you can see cluster assignments and the nearest neighbours used in the differential expression calculations. If cell type marker genes were provided in RunVizScript.R, it will also show predicted cell type annotations. On the right you can add a metadata overlay to the cell projection. You can select any cluster for further assessment by clicking on a cell from that cluster in the left figure.
Below you can view relationships in the metadata as a scatterplot or compare clusterwise distributions of metadata as bar- or box-plots. If you select a cluster of interest (by clicking on a cell in the top-left plot, or from the list two sections down) it will be highlighted for comparison in these figures.
Here you can explore the significantly differentially expressed genes per cluster. 'DE vs Rest' refers to positively differentially expressed genes when comparing a cluster to the rest of the cells as a whole. 'Marker genes' refers to genes positively differentially expressed versus all other clusters in a series of pairwise tests. 'DE vs neighbour' refers to genes positively differentially expressed versus the nearest neighbouring cluster, as measured by number of differentially expressed genes between clusters. In all cases, Wilcoxon rank-sum tests with false discovery rate correction are used.
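The text above does not say which correction procedure is applied to the Wilcoxon p-values. As a minimal sketch, assuming the widely used Benjamini-Hochberg procedure, adjusted values could be computed as follows; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: this is the correction in use; the source text does not name it.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values never
    # decrease as the raw p-values increase (monotonicity step).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from one cluster comparison.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Genes whose adjusted value falls below the chosen threshold would then be reported as significantly differentially expressed.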
The dotplot is generated using the differentially expressed genes from the test and number of genes selected below. A dotplot is a modified heatmap where each dot encodes both detection rate and average gene expression in detected cells for a gene in a cluster. Darker colour indicates higher mean normalized gene expression from the cells in which the gene was detected, and larger dot diameter indicates that the gene was detected in a greater proportion of cells from the cluster.
Gene expression statistics per cluster can be downloaded as tab-separated text files by selecting the cluster and clicking Download cluster gene stats. These statistics are: mean log-normalized gene expression per cluster (MGE), proportion of cells in the cluster in which the gene was detected (DR), and mean log-normalized gene expression from the cells in which the gene was detected (MDGE). Differential gene expression test results can be downloaded as tab-separated text files by selecting the test type (under 'Dotplot Genes') and cluster, and clicking Download DE results. Genes used in the dotplot can be viewed in the gene expression plots below as well.
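To make the three statistics concrete, here is a small sketch for a single gene in a single cluster. It assumes a gene counts as detected when its value is greater than zero and averages the log-normalized values directly; whether the tool averages on the log scale or after back-transforming is not stated here, so treat that as an assumption. The function name `cluster_gene_stats` is hypothetical.

```python
def cluster_gene_stats(values):
    """MGE, DR and MDGE for one gene across the cells of one cluster.

    values: log-normalized expression of the gene in each cell (non-empty list).
    Assumption: a gene is 'detected' in a cell when its value is > 0.
    """
    n = len(values)
    detected = [v for v in values if v > 0]
    return {
        "MGE": sum(values) / n,                                        # mean over all cells
        "DR": len(detected) / n,                                       # proportion of cells with detection
        "MDGE": sum(detected) / len(detected) if detected else 0.0,    # mean over detecting cells only
    }

# A gene detected in two of four cells.
stats = cluster_gene_stats([0.0, 0.0, 2.0, 4.0])
print(stats)
```

Under these assumptions MGE equals DR multiplied by MDGE, which is why the dotplot can show detection rate and expression in detected cells as two separate channels without losing information.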
Here you can investigate the expression of individual genes per cluster and across all clusters. The first plot shows mean expression of genes in a cluster as a function of their detection rate and transcript count when detected. The x-axis indicates the proportion of cells in the cluster in which each gene was detected (transcript count > 0), while the y-axis shows the mean normalized transcript count for each gene from the cells in the cluster in which that gene was detected. You can select the cluster to view from the menu below, and genes can be labelled in the figure based on the cell-type markers provided in RunVizScript.R, the differentially expressed genes from the selected cluster in the dotplot above, or by searching for them in the box below the figure.
Clicking on the first plot will populate the list of genes near the point clicked, which can be found above the next figure. By selecting a gene from this list, you can compare the expression of that gene across all clusters in the second figure. This list can also be populated using the gene search feature. Plotting options for the second figure include the option to overlay normalized transcript count from each cell in the cluster over their respective boxplots ('Include scatterplot'), and the inclusion of the percentile rank of that gene's expression per cluster as small triangles on the plot using the right y-axis ('Include gene rank').
Here you can overlay gene expression values for individual genes of interest on the cell projection. Search for your gene using the search box below, then select your gene(s) of interest from the dropdown 'Select genes' menu. You have the option to include the cluster labels from the first cell projection figure in these plots, and to colour the clusters themselves. There are two copies of this figure for ease of comparison between genes of interest.
Here you can explore the results of pairwise gene expression comparisons between clusters. Any clusters from the currently selected cluster solution can be compared, and you can switch cluster resolutions from the menu here for convenience. Gene effect sizes can be viewed in the context of statistical significance (volcano plots), or directly in modified Bland-Altman plots (axes swapped to match volcano plots). Genes can be labelled by statistical significance, maximum difference, or using the gene search feature above.
Summary statistics of gene expression for either cluster can be downloaded as tab-separated text files using the Download cluster gene stats button under each cluster selection menu. These statistics are: mean log-normalized transcript count per cluster (mean gene expression - MGE); proportion of cells in the cluster in which the gene was detected (detection rate - DR); and mean log-normalized transcript count from the cells in which the gene was detected (mean detected gene expression - MDGE).
Differential gene expression test results for the comparison between selected clusters can be downloaded as a tab-separated text file using the Download DE results button. Effect size measures for difference in mean gene expression (gene expression ratio - logGER) and difference in detection rate (dDR), as well as p- and FDR values for tested genes, are included in the results.
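The two effect sizes can be illustrated for one gene and two clusters. This sketch assumes that, because expression is log-normalized, the log gene expression ratio is the difference between the two clusters' mean log-normalized expression, and that detection means a value above zero. The function name `effect_sizes` and the example values are hypothetical, not the tool's own code.

```python
def effect_sizes(values_a, values_b):
    """logGER and dDR for one gene, comparing cluster A against cluster B.

    values_a, values_b: log-normalized expression of the gene in each cell
    of cluster A and cluster B (both non-empty).
    Assumption: a log ratio is taken as a difference of log-scale means.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    def detection_rate(xs):
        return sum(1 for x in xs if x > 0) / len(xs)

    return {
        "logGER": mean(values_a) - mean(values_b),                   # > 0: higher in A
        "dDR": detection_rate(values_a) - detection_rate(values_b),  # > 0: detected more often in A
    }

# Hypothetical gene: expressed in most cells of A, rarely in B.
print(effect_sizes([0.0, 2.0, 3.0, 3.0], [0.0, 0.0, 0.0, 2.0]))
```

Both measures are signed, so a gene enriched in the second cluster shows up with negative values.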
Similar to the gene expression distribution scatterplot above, clicking on any point in this plot will populate the 'Genes of interest' list above the boxplots comparing gene expression across clusters.
Here you can select sets of cells to directly compare in the figures above, or to generate a new data object for further analysis in R. This can be done manually, or by setting filters on the metadata. Click and drag to select cells manually, and use the buttons below to add the selected cells to, or remove them from, a set of cells. Filters can be set on metadata by selecting a metadata column from the pulldown menu, and selecting the factors or data ranges of the cells to include. You can include more than one metadata filter, and they will be combined using the logical AND (intersection of sets). The selected cells are shown in bold in the plot. When your cell set(s) are ready, what you do next depends on your goal. If you are subsetting, save the subset as a new RData file to disk using the save button. If you are building a comparison of cell sets for DE testing, name the comparison and click the 'Calculate differential gene expression' button. Once the calculation is done, the comparison will be added to the cluster list at the top of the page and the current cluster solution will be updated to show this comparison. The comparison can be saved by clicking 'Save this comparison to disk' next to either cluster solution menu.
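The way multiple metadata filters combine can be sketched as a set intersection. The data layout, the function name `select_cells`, and the metadata column names below are hypothetical and chosen only to illustrate the logical AND described above.

```python
def select_cells(metadata, filters):
    """Return the IDs of cells that pass every filter (logical AND).

    metadata: {cell_id: {column_name: value}}
    filters:  {column_name: predicate taking that column's value}
    """
    selected = set(metadata)  # with no filters, every cell is selected
    for column, keep in filters.items():
        matching = {cell for cell, row in metadata.items() if keep(row[column])}
        selected &= matching  # intersection: a cell must satisfy all filters
    return selected

# Hypothetical metadata: one categorical column and one numeric column.
cells = {
    "cell1": {"sample": "sorted", "total_counts": 5200},
    "cell2": {"sample": "sorted", "total_counts": 900},
    "cell3": {"sample": "unsorted", "total_counts": 6100},
}
chosen = select_cells(cells, {
    "sample": lambda v: v in {"sorted"},             # factor levels to include
    "total_counts": lambda v: 1000 <= v <= 10000,    # data range to include
})
print(sorted(chosen))  # ['cell1']
```

Because the filters intersect, adding a filter can only shrink the selection; to build a union of groups, select each group separately and add it to the same cell set.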