Scott A. Yuzwa, Michael J. Borrett, Brendan T. Innes, Anastassia Voronova, Troy Ketela, David R. Kaplan, Gary D. Bader, and Freda D. Miller.
Cell Reports (2017).
Adult neural stem cells (NSCs) derive from embryonic precursors, but little is known about how or when this occurs. We have addressed this issue using single-cell RNA sequencing at multiple developmental time points to analyze the embryonic murine cortex, one source of adult forebrain NSCs. We computationally identify all major cortical cell types, including the embryonic radial precursors (RPs) that generate adult NSCs. We define the initial emergence of RPs from neuroepithelial stem cells at E11.5. We show that, by E13.5, RPs express a transcriptional identity that is maintained and reinforced throughout their transition to a non-proliferative state between E15.5 and E17.5. These slowly proliferating late embryonic RPs share a core transcriptional phenotype with quiescent adult forebrain NSCs. Together, these findings support a model wherein cortical RPs maintain a core transcriptional identity from embryogenesis through to adulthood and wherein the transition to a quiescent adult NSC occurs during late neurogenesis.
These are embryonic day 17.5 cortically-derived cells.
Please note: scClustViz currently has a bug that can cause plots to temporarily fail to load. The workaround is to toggle an input to the affected plot: for interactive plots, such as the first figure below on the left, try zooming in or out; for other plots, switch an input (for example, swap between PCA and tSNE/UMAP in the cell embedding plot). We're sorry for the inconvenience and are working to find and fix the bug.
Here you can compare the results of clustering at different resolutions to determine the appropriate clustering solution for your data. You can see the cluster solutions represented as boxplots on the left, where each boxplot represents the number of genes differentially expressed between each cluster and its nearest neighbour, or marker genes per cluster. The cluster selected in the pulldown menu is highlighted in red, and the silhouette plot for that cluster is shown on the right. The plot can be zoomed by clicking and dragging to select a region to view, and double-clicking to zoom to it. Double-click again to revert view to default.
A silhouette plot is a horizontal barplot where each bar is a cell, grouped by cluster. The width of each bar represents the difference between the cell's mean distance to cells in the nearest neighbouring cluster and its mean distance to the other cells within its own cluster. Distance is Euclidean in reduced dimensional space. A positive silhouette therefore means the cell is, on average, closer to its own cluster than to the nearest neighbouring one, which indicates good cluster cohesion.
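To make the silhouette calculation concrete, here is a minimal Python sketch. It assumes the standard silhouette definition, in which the difference in mean distances is divided by the larger of the two means; the text above only mentions the difference, so that scaling is an assumption. The function names `euclidean` and `silhouette_widths` are illustrative and are not part of scClustViz.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points in reduced dimensional space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_widths(points, labels):
    """Return one silhouette width per cell (requires at least two clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # common convention for single-cell clusters
            continue
        # a: mean distance to the other cells in the same cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the cells of the nearest neighbouring cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        # Assumed scaling: divide by the larger mean so widths lie in [-1, 1]
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths
```

With two well-separated clusters every width is close to 1, while a cell assigned to the wrong cluster gets a negative width.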
Once you've selected an appropriate cluster solution (we suggest picking one where all nearest neighbouring clusters have differentially expressed genes between them), click View clusters at this resolution to proceed. If you want to save this cluster solution as the default for next time, click Save this resolution as default. All figures can be downloaded by clicking the buttons next to each figure.
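The suggested rule for choosing a cluster solution can be sketched in a few lines of Python. This is an illustration only: the function name, the input layout, and the resolution names in the test values are hypothetical and do not reflect how scClustViz stores its results.

```python
def acceptable_resolutions(de_counts):
    """Keep resolutions where every cluster has at least one DE gene
    versus its nearest neighbouring cluster.

    de_counts maps a resolution name to a list with one entry per cluster:
    the number of DE genes between that cluster and its nearest neighbour.
    """
    return [
        resolution
        for resolution, counts in de_counts.items()
        if counts and min(counts) > 0
    ]
```

A resolution is dropped as soon as one pair of neighbouring clusters has no differentially expressed genes between them, since that pair is likely over-split.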
Here you can explore your dataset as a whole: cluster assignments for all cells; metadata overlays for cell projections; and figures for comparing both numeric and categorical metadata.
The top two figures show cells projected into 2D space using one of the dimensionality reductions calculated in your data object. For example, tSNE and UMAP place cells in space such that proximity indicates transcriptional similarity. PCA is a common input for clustering and cell embedding, and it's important to ensure components don't strongly correlate with technical features. On the left you can see cluster assignments and the nearest neighbours used in the differential expression calculations. If cell type marker genes were provided in RunVizScript.R, the left figure will also show predicted cell type annotations. On the right you can add a metadata overlay to the cell projection. You can select any cluster for further assessment by clicking on a cell from that cluster in the left figure.
Below you can view relationships in the metadata as a scatterplot or compare clusterwise distributions of metadata as bar- or box-plots. If you select a cluster of interest (by clicking on a cell in the top-left plot, or from the list two sections down) it will be highlighted for comparison in these figures.
Here you can explore the significantly differentially expressed genes per cluster. 'DE vs Rest' refers to positively differentially expressed genes when comparing a cluster to the rest of the cells as a whole. 'Marker genes' refers to genes positively differentially expressed versus all other clusters in a series of pairwise tests. 'DE vs neighbour' refers to genes positively differentially expressed versus the nearest neighbouring cluster, as measured by number of differentially expressed genes between clusters. In all cases, Wilcoxon rank-sum tests with false discovery rate correction are used.
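As an illustration of the multiple-testing step, the sketch below adjusts a list of p-values (such as those from per-gene Wilcoxon rank-sum tests) using the Benjamini-Hochberg procedure. The text does not say which false discovery rate method is applied, so Benjamini-Hochberg is an assumption here, and the function name is illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called differentially expressed when its adjusted value falls below the chosen threshold.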
The dotplot is generated using the differentially expressed genes from the test and number of genes selected below. A dotplot is a modified heatmap where each dot encodes both detection rate and average gene expression in detected cells for a gene in a cluster. Darker colour indicates higher mean normalized gene expression from the cells in which the gene was detected, and larger dot diameter indicates that the gene was detected in greater proportion of cells from the cluster.
Gene expression statistics per cluster can be downloaded as tab-separated text files by selecting the cluster and clicking Download cluster gene stats. These statistics are: mean log-normalized gene expression per cluster (MGE), proportion of cells in the cluster in which the gene was detected (DR), and mean log-normalized gene expression from the cells in which the gene was detected (MDGE). Differential gene expression test results can be downloaded as tab-separated text files by selecting the test type (under 'Dotplot Genes') and cluster, and clicking Download DE results. Genes used in the dotplot can be viewed in the gene expression plots below as well.
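The three per-cluster statistics can be illustrated for a single gene with the Python sketch below. It assumes a plain arithmetic mean of the log-normalized values and treats any value above zero as detected; the app may compute its means differently, and the function name is illustrative.

```python
def cluster_gene_stats(values):
    """Summarise one gene's log-normalized expression across a cluster's cells."""
    if not values:
        raise ValueError("cluster has no cells")
    n = len(values)
    detected = [v for v in values if v > 0]  # cells in which the gene was detected
    return {
        "MGE": sum(values) / n,  # mean gene expression over all cells
        "DR": len(detected) / n,  # detection rate
        # mean expression over detecting cells only (0.0 if never detected)
        "MDGE": sum(detected) / len(detected) if detected else 0.0,
    }
```

For example, a gene with values 0, 0, 2 and 4 in a four-cell cluster has MGE 1.5, DR 0.5 and MDGE 3.0.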
Here you can investigate the expression of individual genes per cluster and across all clusters. The first plot shows mean expression of genes in a cluster as a function of their detection rate and transcript count when detected. The x-axis indicates the proportion of cells in the cluster in which each gene was detected (transcript count > 0), while the y-axis shows the mean normalized transcript count for each gene from the cells in the cluster in which that gene was detected. You can select the cluster to view from the menu below, and genes can be labelled in the figure based on the cell-type markers provided in RunVizScript.R, the differentially expressed genes from the selected cluster in the above dotplot, or by searching for them in the box below the figure.
Clicking on the first plot will populate the list of genes near the point clicked, which can be found above the next figure. By selecting a gene from this list, you can compare the expression of that gene across all clusters in the second figure. This list can also be populated using the gene search feature. Plotting options for the second figure include the option to overlay normalized transcript count from each cell in the cluster over their respective boxplots ('Include scatterplot'), and the inclusion of the percentile rank of that gene's expression per cluster as small triangles on the plot using the right y-axis ('Include gene rank').
Here you can overlay gene expression values for individual genes of interest on the cell projection. Search for your gene using the search box below, then select your gene(s) of interest from the dropdown 'Select genes' menu. You have the option to include the cluster labels from the first cell projection figure in these plots, and to colour the clusters themselves. There are two copies of this figure for ease of comparison between genes of interest.
Here you can explore the results of pairwise gene expression comparisons between clusters. Any clusters from the currently selected cluster solution can be compared, and you can switch cluster resolutions from the menu here for convenience. Gene effect sizes can be viewed in the context of statistical significance (volcano plots), or directly in modified Bland-Altman plots (axes swapped to match volcano plots). Genes can be labelled by statistical significance, maximum difference, or using the gene search feature above.
Summary statistics of gene expression for either cluster can be downloaded as tab-separated text files using the Download cluster gene stats button under each cluster selection menu. These statistics are: Mean log-normalized transcript count per cluster (mean gene expression - MGE); Proportion of cells in the cluster in which the gene was detected (detection rate - DR); and mean log-normalized transcript count from the cells in which the gene was detected (mean detected gene expression - MDGE).
Differential gene expression test results for the comparison between selected clusters can be downloaded as a tab-separated text file using the Download DE results button. Effect size measures for difference in mean gene expression (gene expression ratio - logGER) and difference in detection rate (dDR), as well as p- and FDR values for tested genes, are included in the results.
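The two effect size measures can be sketched as simple differences between per-cluster summaries. Because MGE is on a log scale, subtracting one cluster's MGE from the other's gives a log ratio. The direction of the comparison (first cluster minus second) and the dictionary layout are assumptions made for this illustration.

```python
def effect_sizes(stats_a, stats_b):
    """Compare one gene between two clusters.

    Each argument is a dict with keys "MGE" (mean log-normalized expression)
    and "DR" (detection rate) for the gene in that cluster.
    """
    return {
        # Difference of log-scale means, i.e. a log gene expression ratio
        "logGER": stats_a["MGE"] - stats_b["MGE"],
        # Difference in detection rate
        "dDR": stats_a["DR"] - stats_b["DR"],
    }
```

Under this convention, positive values mean the gene is more highly expressed, or more often detected, in the first cluster.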
Similar to the gene expression distribution scatterplot above, clicking on any point in this plot will populate the 'Genes of interest' list above the boxplots comparing gene expression across clusters.
Here you can select sets of cells to directly compare in the figures above, or to generate a new data object for further analysis in R. This can be done manually, or by setting filters on the metadata. Click and drag to select cells manually, and use the buttons below to add the selected cells to, or remove them from, a set of cells. Filters can be set on metadata by selecting a metadata column from the pulldown menu, and then selecting the factors or data ranges that cells must match to be included. You can include more than one metadata filter, and they will be combined using the logical AND (intersection of sets). The selected cells are shown in bold in the plot. When your cell set(s) are ready and you are subsetting, save the subset as a new RData file to disk using the save button. If you are building a comparison of cell sets for DE testing, name the comparison and click the 'Calculate differential gene expression' button. Once the calculation is done, the comparison will be added to the cluster list at the top of the page and the current cluster solution will be updated to show this comparison. The comparison can be saved by clicking 'Save this comparison to disk' next to either cluster solution menu.
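Combining metadata filters with a logical AND can be sketched as follows. The function name, the data layout, and the metadata column names used in the test values are hypothetical; the sketch only illustrates how intersecting filters narrows the selection.

```python
def select_cells(metadata, filters):
    """Return the IDs of cells that pass every filter (logical AND).

    metadata maps a cell ID to a dict of metadata column -> value.
    filters maps a metadata column to a predicate on that column's value.
    """
    return {
        cell_id
        for cell_id, row in metadata.items()
        if all(keep(row[column]) for column, keep in filters.items())
    }
```

A factor filter can be written as a membership check and a numeric range as a bounds check; a cell is kept only if it satisfies all of them, so adding a filter can never enlarge the selection.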