2206A Student Center
Investigating genome composition in multiple bee species
The honey bee Apis mellifera was the first eusocial animal to have its genome assembled. Analysis of the complete draft sequence of the honey bee genome revealed several interesting features compared with the other metazoan genomes: a low but heterogeneous GC content, an overabundance of CpG dinucleotides and a lack of repetitive elements. The average GC content of the honey bee genome is only 33%, but GC content is highly heterogeneous, ranging from 11% to 67%, with a bimodal distribution. Furthermore, unlike genes in most other metazoans, honey bee genes are overly abundant in regions of low GC content (<30%). Some studies have suggested that the high GC-content regions of the honey bee genome are associated with areas of high meiotic recombination rates; indeed the honey bee exhibits the highest known recombination rate among eukaryotes. Other studies have suggested that honey bee genome nucleotide composition is associated with DNA methylation, which occurs at a low frequency at CpG sites within exons. However, reasons for the highly heterogeneous base composition are not well understood, and whether any of the unusual genome features are related to the emergence of eusociality in bees is not known. Since the publication of the honey bee genome, genomes of several other bee species have become available. I am investigating the composition and organization of genomes of multiple bee species with different levels of social complexity to identify features that are unique to eusocial bees. Results of this exploratory analysis will allow me to develop a hypothesis about the relationship of genome composition to the evolution of eusociality.
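The two composition measures named in this abstract, GC content and CpG abundance, can be illustrated with a minimal Python sketch. It is not the speaker's pipeline. The function names, the window size, and the toy sequences are assumptions made for illustration, and the CpG measure is the commonly used observed/expected ratio (number of CG dinucleotides times sequence length, divided by the product of the C and G counts).

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    acgt = sum(counts[b] for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / acgt


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G), N = sequence length."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def windowed_gc(seq, window=10):
    """GC content in consecutive non-overlapping windows (hypothetical window size)."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    toy = "AAAATTTTGGGGCCCC"  # toy sequence, not real genome data
    print(gc_content(toy))
    print(windowed_gc(toy, window=4))
    print(cpg_obs_exp("ACGTCGAT"))
```

Plotting the windowed values as a histogram is one way to look for the kind of bimodal GC distribution the abstract describes; a ratio above 1 from `cpg_obs_exp` would indicate more CpG dinucleotides than expected from base frequencies alone.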
2206A Student Center
A Geospatial Health Context Table for Supporting Public Health Research
This project develops a Big Data table that allows researchers to query across and among multiple data sources integrated by location. The table uses location as the fundamental linkage between data sets; this is the power of geospatial analysis, and it forms the foundation for developing and interacting with the Health Context Table. The approach uses a dense point file populated with attribution derived or obtained directly from public data sources and associated geospatial analysis. The resulting database extends across the entire continental United States and comprises over 300 million points. At its core, the data table holds functional socio-demographic data that are pre-processed, cleaned, integrated, and represented in their spatial context. Environmental, infrastructure, cultural, and physical layers, as well as geo-analytically derived layers (e.g., remoteness, isolation), are being added to this core. These data span multiple spatial scales (Census Block Group, Zip Code Tabulation Area, County, etc.). The interface to this Big Data table will allow a user to visualize, data mine, analyze uncertainty, and perform data analytics on these data. The goal of the Geospatial Health Context Table is to address a gap in health research and application: the lack of an underlying spatial framework in which real-world issues and research can be addressed in the context of place. This work is supported by the NIH T32 Training grant (5T32LM012410-02).
240 Naka Hall
Contrast mining to discover combinations of genetic factors associated with autism subgroups
Autism is characterized by a complex set of behavioral, social, and cognitive deficits. Extensive variation in these phenotypes suggests the existence of autism subtypes that likely have distinct genetic etiologies. The lack of unifying genotypes common to autism patients supports this subtype structure and suggests that the onset of autism is due to combinations of genetic factors. The ability to precisely diagnose autism subtypes using genetic markers would lead to earlier and more specific treatments and improve outcomes, which underscores the need for research that increases our understanding of the genetic etiologies of autism subtypes. In this research, we identify combinations of genetic factors that are associated with groups of autism patients with unifying behavioral profiles, yielding candidate genes to be investigated for their role in the development of these potential autism subtypes. Using methods that combine bioinformatics strategies with data mining practices, we pursue three goals: the discovery of genetic combinations associated with a disease subgroup, the exploration of disease subgroups to find potential subtypes, and the analysis of relationships between genes and subgroups to identify relevant functional interactions.
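As a rough illustration of the general idea behind contrast mining, the Python sketch below enumerates small combinations of factors and keeps those that are much more frequent in a target subgroup than in the remaining patients. It is a generic support-difference scheme written under stated assumptions, not the method used in this research: the function names, the `max_size` and `min_diff` parameters, and the factor labels are all hypothetical.

```python
from itertools import combinations


def support(pattern, records):
    """Fraction of records (sets of factors) that contain every factor in pattern."""
    if not records:
        return 0.0
    return sum(1 for r in records if pattern <= r) / len(records)


def contrast_patterns(target, background, max_size=2, min_diff=0.5):
    """Return (combination, support difference) pairs, largest difference first.

    A combination is kept when its support in `target` exceeds its support
    in `background` by at least `min_diff` (a hypothetical threshold).
    """
    items = sorted(set().union(*target))
    results = []
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            pattern = frozenset(combo)
            diff = support(pattern, target) - support(pattern, background)
            if diff >= min_diff:
                results.append((combo, diff))
    return sorted(results, key=lambda pair: -pair[1])


if __name__ == "__main__":
    # Hypothetical factor labels; each set is one patient's observed factors.
    subgroup = [{"A", "B"}, {"A", "B"}, {"A", "C"}]
    others = [{"A"}, {"B"}, {"C"}]
    for combo, diff in contrast_patterns(subgroup, others):
        print(combo, round(diff, 3))
```

Exhaustive enumeration grows quickly with the number of factors, so a sketch like this only suits small inputs; practical contrast mining methods prune the search space.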
Winston Haynes, PhD Candidate
2206 A&B Student Center
Understanding disease through integrated molecular and clinical analyses
Abstract: Traditional biomedical experiments are designed to study a single cohort for a single disease using a single technology. By studying disease with such a narrow lens, researchers make discoveries that are not reproducible because they are not representative of the real heterogeneity of disease. By integrating data from over 40 studies and 7,000 patients, we establish a robust signature of disease which correlates with disease activity and persists across blood, tissue, and sorted cell populations. We compare relationships of 104 diseases based on molecular and clinical manifestations from 41,000 gene expression samples and 2 million patient records. Finally, we contextualize single-cell RNA-seq data with bulk gene expression profiles to understand the relationships of novel cell subsets to known cell populations and human disease. By integrating biomedical datasets, my work has enabled an unbiased and multi-scale understanding of disease.
Bio: Winston Haynes is a PhD candidate in biomedical informatics at Stanford University. His research focuses on developing methods to improve understanding of disease through unbiased analyses of heterogeneous, publicly available data. Building off his discovery that publications are biased towards well-annotated genes instead of those with the strongest disease associations, his work integrates molecular and clinical evidence to identify overlooked aspects of disease, including therapeutically actionable relationships between seemingly disparate diseases and novel molecular pathways associated with disease activity.
Epigenetic adaptation to environment in long-lived trees
Oak represents a valuable natural resource across Northern Hemisphere ecosystems, attracting a large research community studying its genetics, ecology, conservation, and management. Here we introduce a draft genome assembly of valley oak (Quercus lobata) using Illumina, PacBio and Dovetail sequencing of adult leaf tissue of a tree found in an accessible, well-studied, natural southern California population. We next utilize this genome to carry out landscape epigenetics studies. DNA methylation in plants affects transposon silencing, transcriptional regulation and thus phenotypic variation. One unanswered question is whether DNA methylation could be involved in local adaptation of plant populations to their environments. If methylation alters phenotypes to improve plant response to the environment, then methylation sites or the genes that affect them could be a target of natural selection. Using reduced-representation bisulphite sequencing (RRBS) data, we assessed whether climate is associated with variation in DNA methylation levels among 58 naturally occurring, species-wide samples of valley oak collected across climate gradients. Environmental association analyses revealed 43 specific loci that are significantly associated with at least one of four climate variables, the majority of which are associated with mean maximum temperature. The 43 climate-associated single-methylation variants (SMVs) tend to occur in or near genes, several of which have known involvement in plant response to environment. Multivariate analyses show that climate and spatial variables explain more of the overall variance in epigenetic marks than in genetic marks. Together, these results from natural oak populations provide initial evidence for a role of CG methylation in locally adaptive evolution or plasticity in plant response