“Mixscape” tool helps researchers understand gene function and regulation
A team of researchers from New York University and the New York Genome Center has developed a new computational tool to help understand the function and regulation of human genes. The team’s results, published today in the journal Nature Genetics, demonstrate how to interpret experiments that combine CRISPR-based gene perturbation with multimodal single-cell sequencing technologies.
The article describes how the new approach, called mixscape, helped to identify a new molecular mechanism for the regulation of immune checkpoint proteins that govern the immune system’s ability to identify and destroy cancer cells.
“Our approach will help scientists to connect genes to the specific cellular behaviors and molecular pathways that they regulate,” explains Rahul Satija, the study’s senior author, who is an associate professor of biology at NYU’s Center for Genomics and Systems Biology and a core faculty member at the New York Genome Center.
The researchers set out to better understand how cancer cells alter the regulation of key genes, such as the immune checkpoint molecule PD-L1, in order to avoid detection and evade the body’s immune system. To do so, they performed a pooled genetic screen, where they ‘knocked out’ a set of genes in a cancer cell line model in order to observe the effect of each change or perturbation on PD-L1 levels. They utilized ECCITE-seq, a technology that allows researchers to capture single-cell profiles of different types of biomolecules—such as RNA and protein—after perturbing each gene with a CRISPR “guide RNA.” The ability to measure multiple types of molecular data, referred to as multimodal analysis, allowed the team to distinguish between transcriptional and post-transcriptional modes of regulation.
After completing its experiments, however, the team realized that significant computational challenges limited its ability to analyze and interpret the data. For example, the researchers found that when they tried to knock out the same gene in multiple different cells, they observed striking variability in the results. In particular, a substantial fraction of cells—up to 75% in some cases—appeared to escape any observable effects after attempted perturbation and represented a confounding source of noise in downstream analysis.
“Facing these challenges made us realize that we needed new computational methods to identify and remove confounding sources of variation in our dataset,” says Efthymia Papalexi, a biology graduate student at NYU and lead author of the study.
To achieve this, the team developed a statistical approach—mixscape—to model each perturbation as giving rise to a mixture of cells with different responses. In doing so, the mixscape method can identify and remove sources of noise from the data, allowing the user to focus on the most important biological signals that remain.
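To make the mixture idea concrete, here is a toy sketch in Python. It is not the mixscape implementation in Seurat. It assumes each cell has already been reduced to a single hypothetical "perturbation score," simulates a population in which 75% of targeted cells escape perturbation, and fits a two-component Gaussian mixture with a basic expectation-maximization loop to separate responders from escapers. All names, scores, and parameters are illustrative assumptions.

```python
import numpy as np

def fit_two_gaussians(scores, n_iter=100):
    """Fit a 1D two-component Gaussian mixture with plain EM.

    Returns (weights, means, stds, responsibilities).
    """
    scores = np.asarray(scores, dtype=float)
    # Crude initialization: component means at the 25th and 75th percentiles.
    means = np.percentile(scores, [25, 75])
    stds = np.full(2, scores.std() + 1e-6)
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each cell.
        z = (scores[:, None] - means) / stds
        dens = weights * np.exp(-0.5 * z ** 2) / (stds * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means, and standard deviations.
        nk = resp.sum(axis=0)
        weights = nk / len(scores)
        means = (resp * scores[:, None]).sum(axis=0) / nk
        var = (resp * (scores[:, None] - means) ** 2).sum(axis=0) / nk
        stds = np.sqrt(var) + 1e-6
    return weights, means, stds, resp

# Simulated data (an assumption for illustration, not data from the study):
# 750 cells that escaped perturbation and 250 cells that responded.
rng = np.random.default_rng(0)
escaped = rng.normal(loc=0.0, scale=1.0, size=750)
responded = rng.normal(loc=5.0, scale=1.0, size=250)
scores = np.concatenate([escaped, responded])

weights, means, stds, resp = fit_two_gaussians(scores)

# Treat the component with the larger mean as the "perturbed" class and
# keep only cells assigned to it with posterior probability above 0.5.
perturbed_component = int(np.argmax(means))
is_perturbed = resp[:, perturbed_component] > 0.5
filtered_scores = scores[is_perturbed]
```

In this simplified picture, downstream comparisons would use only the cells flagged in `is_perturbed`, so that cells showing no sign of perturbation do not dilute the signal.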
“When we applied mixscape in our screen, we boosted our power to connect gene perturbations with changes in the transcriptome and protein expression. This allowed us to discover that the kelch-like protein KEAP1 and the transcriptional activator NRF2 mediate a cell’s expression level of PD-L1,” says Satija.
While these studies were conducted in cancer cell lines, KEAP1 and NRF2 are frequently mutated in human lung cancer samples, suggesting that these genes may play important roles in the development and progression of human tumors.
Looking forward, the researchers are leveraging multimodal single-cell pooled CRISPR screens and mixscape to understand the molecular regulation of dozens of additional pathways and cellular behaviors.
Mixscape is freely available online through the Satija lab’s Seurat package, a software toolkit for biomedical researchers.
“We hope our method will be useful for the community and assist in the study of how genes and molecular pathways interact with each other,” says Papalexi.
The work was supported by the Chan Zuckerberg Initiative (EOSS-0000000082, HCA-A-1704-01895) and the National Institutes of Health (DP2HG009623-01, RM1HG011014-01, R21HG009748-03).
A presentation on mixscape was featured as part of the new educational series launched this year by the Center for Integrated Cellular Analysis, which was established at the New York Genome Center (NYGC) in 2020. Dr. Satija serves as co-principal investigator and director of the Center, which is funded through the Centers of Excellence in Genomic Science program of the National Human Genome Research Institute.