Member, Molecular Oncology & Biomarkers
GRU Cancer Center
Assistant Professor of Biostatistics and Epidemiology
Georgia Regents University
1410 Laney Walker Blvd., CN-2152
Augusta, GA 30912
Recent advances in next-generation sequencing (NGS) technologies generate high-throughput data cost-effectively and enable us to understand biological processes in organisms. More importantly, NGS technologies are being applied to study cancer and other diseases by identifying sequence variants, gene expression, transcription factor binding sites, DNA methylation, and nucleosome positioning using Exome-capture-seq, RNA-seq, ChIP-seq, MBD-seq, BS-seq, and Nucleosome-seq. Yet the development of programs and methodologies to analyze these kinds of high-throughput data remains slow.
My research focuses on developing computational algorithms and pipelines to analyze NGS data for various biological applications. One of the biggest challenges is the sheer volume of data. For example, the Illumina GA IIx can generate 160 million reads in a single run. The HiSeq 2000 can generate 1 billion reads, and is expected to generate 3 billion reads with TruSeq v3 Reagent Kits. Assembling sequence reads into the original sequences, or mapping them to reference sequences such as genomes and transcriptomes, requires sophisticated computational algorithms and data structures. Currently popular approaches include suffix arrays, the BWT (Burrows-Wheeler Transform), hashing, and bit-parallelism.
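As a rough illustration of how a BWT-based index supports read mapping, the sketch below builds the transform from a naively sorted suffix array and counts exact occurrences of a read by backward search. The function names `bwt` and `count_occurrences` are hypothetical, and the construction is deliberately simple: production read mappers use far more compact and faster data structures than this toy version.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler Transform of text, built from a naive suffix array."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    s = text + sentinel
    # Naive suffix array: sort suffix start positions by the suffix itself.
    # This is quadratic or worse in memory and time; fine only for tiny inputs.
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT is the character preceding each sorted suffix (wrapping around).
    return "".join(s[i - 1] for i in suffix_array)


def count_occurrences(bwt_str: str, pattern: str) -> int:
    """Count exact occurrences of pattern using backward search on the BWT."""
    # first_row[c]: number of characters in the text strictly smaller than c,
    # i.e. the first row of the sorted suffixes that starts with c.
    first_row = {}
    total = 0
    for ch in sorted(set(bwt_str)):
        first_row[ch] = total
        total += bwt_str.count(ch)

    def occ(ch: str, i: int) -> int:
        # Occurrences of ch in bwt_str[:i]; a real index precomputes this.
        return bwt_str[:i].count(ch)

    lo, hi = 0, len(bwt_str)  # half-open range of matching suffix rows
    for ch in reversed(pattern):
        if ch not in first_row:
            return 0
        lo = first_row[ch] + occ(ch, lo)
        hi = first_row[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = bwt("banana")
    print(index)
    print(count_occurrences(index, "ana"))
```

The backward search touches the pattern once per character, so query time does not depend on how many suffixes the reference has; that property is what makes this family of indexes attractive for mapping very large numbers of short reads.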
NGS technologies allow us to study genome-wide phenomena and even to compare large numbers of cells, e.g., hundreds of normal and tumor cells. This calls for new computational and statistical methods to interpret the data. For instance, ChIP-seq (chromatin immunoprecipitation with sequencing) requires new statistical models to identify enriched regions, and BS-seq (bisulfite sequencing) requires new mapping strategies because a thymine in a sequence read may be either an original thymine or an unmethylated cytosine converted by bisulfite treatment. While microarray experiments and Sanger sequencing have examined particular genomic regions such as genes and promoters, NGS can measure the entire genome at single-base resolution. Therefore, we need to develop methods to identify biological signatures from each sequencing type, as well as to integrate heterogeneous datasets to discover biological phenomena.
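The thymine ambiguity in BS-seq can be made concrete with a small sketch of one common idea: convert every C to T in both the read and the reference, match in that reduced alphabet, and then compare the original bases to call methylation. The names `c_to_t` and `map_bisulfite_read` are hypothetical, the search is a naive exact match on the forward strand only, and this is not presented as the method used in the laboratory's own pipelines.

```python
from typing import Dict, Optional, Tuple


def c_to_t(seq: str) -> str:
    """In silico bisulfite conversion: every C becomes T."""
    return seq.upper().replace("C", "T")


def map_bisulfite_read(
    read: str, reference: str
) -> Optional[Tuple[int, Dict[int, bool]]]:
    """Map a bisulfite-treated read to the forward strand of a reference.

    Returns (position, calls), where calls maps each covered reference
    cytosine to True (methylated, read shows C) or False (unmethylated,
    read shows T). Returns None if no valid placement exists.
    """
    read = read.upper()
    reference = reference.upper()
    converted_ref = c_to_t(reference)
    converted_read = c_to_t(read)

    pos = converted_ref.find(converted_read)
    while pos != -1:
        calls: Dict[int, bool] = {}
        valid = True
        for offset, (r, g) in enumerate(zip(read, reference[pos:pos + len(read)])):
            if g == "C" and r in "CT":
                # C retained means protected by methylation; T means converted.
                calls[pos + offset] = (r == "C")
            elif r != g:
                # For example, a read C over a reference T matches only in the
                # converted alphabet and is not explained by bisulfite treatment.
                valid = False
                break
        if valid:
            return pos, calls
        pos = converted_ref.find(converted_read, pos + 1)
    return None


if __name__ == "__main__":
    # Reference positions 1 and 5 are cytosines; the read covers positions 1-6.
    print(map_bisulfite_read("TGTTCG", "ACGTTCGA"))
```

Matching in the converted alphabet keeps the mapping step unbiased with respect to methylation state, at the cost of a less specific three-letter search space, which is one reason dedicated mapping strategies are needed.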
For more information, please see Dr. Choi’s laboratory website: http://comics.georgiahealth.edu
- National Institute on Alcohol Abuse and Alcoholism: Developmental Pathways, Environmental Agents, and Epigenetics in Liver Disease, Co-I
- Department of Defense: Developmental Pathways, Environmental Agents, and Epigenetics in Liver Disease, Co-I
Madhav D. Sharma, Lei Huang, Jeong-Hyeon Choi, Eun-Joon Lee, James M. Wilson, Lemos Henrique, Fan Pan, Bruce R. Blazar, Drew M. Pardoll, Andrew L. Mellor, Huidong Shi, and David H. Munn. “An inherently bi-functional subset of Foxp3+ Treg/T-helper cells is controlled by the transcription factor Eos,” Immunity, 2013. Available at http://www.cell.com/immunity/abstract/S1074-7613%2813%2900192-1
Bilian Jin, Jason Ernst, Rochelle L. Tiedemann, Hongyan Xu, Suhas Sureshchandra, Manolis Kellis, Stephen Dalton, Chen Liu, Jeong-Hyeon Choi*, and Keith D. Robertson*. “Linking DNA Methyltransferases (DNMTs) to Epigenetic Marks and Nucleosome Structure Genome-Wide in Human Tumor Cells,” Cell Reports, 2(5):1411-1424, 2012. (Co-Corresponding author) Available at http://www.cell.com/cell-reports/abstract/S2211-1247%2812%2900372-5
Hongseok Tae, Dongsung Ryu, Suhas Sureshchandra, and Jeong-Hyeon Choi. “ESTclean: a cleaning tool for next-gen transcriptome shotgun sequencing,” BMC Bioinformatics, 13:247, 2012. Available at http://www.biomedcentral.com/1471-2105/13/247
Lirong Pei*, Jeong-Hyeon Choi*, Jimei Liu, Eun-Joon Lee, Brian McCarthy, James M. Wilson, Ethan Speir, Farrukh Awan, Hongseok Tae, Gerald Arthur, Jennifer L. Schnabel, Kristen H. Taylor, Xinguo Wang, Dong Xu, Han-Fei Ding, David H. Munn, Charles Caldwell and Huidong Shi. “Genome-wide DNA methylation analysis reveals novel epigenetic changes in chronic lymphocytic leukemia,” Epigenetics, 7(6):567-578, 2012. (Co-First author) Available at http://www.landesbioscience.com/journals/epigenetics/article/20237/
Eun-Joon Lee, Lirong Pei, Gyan Srivastava, Trupti Joshi, Garima Kushwaha, Jeong-Hyeon Choi, Keith D. Robertson, Xinguo Wang, John K. Colbourne, Lu Zhang, Gary P. Schroth, Dong Xu, Kun Zhang, and Huidong Shi. “Targeted bisulfite sequencing by solution hybrid selection and massively parallel sequencing,” Nucleic Acids Research, 39(19):e127, 2011. Available at http://nar.oxfordjournals.org/content/39/19/e127
Jeong-Hyeon Choi, Yajun Li, Juyuan Guo, Lirong Pei, Tibor A. Rauch, Robin S. Kramer, Simone L. Macmil, Graham B. Wiley, Lynda B. Bennett, Jennifer L. Schnabel, Kristen H. Taylor, Sun Kim, Dong Xu, Arun Sreekumar, Gerd P. Pfeifer, Bruce A. Roe, Charles W. Caldwell, Kapil N. Bhalla, and Huidong Shi. “Genome-Wide DNA Methylation Maps in Follicular Lymphoma Cells Determined by Methylation-Enriched Bisulfite Sequencing,” PLoS ONE, 5(9):e13020, 2010. Available at http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013020
- Assistant Professor, Department of Biostatistics, GRU Cancer Center, and College of Graduate Studies, Medical College of Georgia, Georgia Regents University
- Member, International Society for Computational Biology
- Member, Daphnia Genomics Consortium (DGC)
- Member, Nasonia Genome Working Group