Andrew F. Neuwald, PhD

Academic Title:

Adjunct Professor

Primary Appointment:

Biochemistry and Molecular Biology


Health Sciences Facility III, 670 West Baltimore St, Baltimore 21201


(410) 706-1481

Education and Training

  • University of Wisconsin-Eau Claire, B.S., Medical Technology, 1978    
  • University of Wisconsin-Eau Claire, M.S., Biology, 1983                       
  • University of Iowa, Ph.D., Microbiology, 1987               
  • Washington University, M.S., Computer Science, 1992
  • Post Doctoral Fellowships: Molecular Microbiology, Washington University, 1989; Statistics & Computational Biology, the National Center for Biotechnology Information, 1997


Rigorous statistical methods are required to distinguish subtle biological information from random noise within protein sequence data. Dr. Neuwald and collaborators Jun S. Liu (Harvard University) and C.S. Lawrence (Brown University) were the first to address this problem using Bayesian Markov chain Monte Carlo sampling, an approach specifically designed for such a statistical and algorithmic challenge. They were also the first to develop genetic algorithm-based Bayesian multiple sequence alignment strategies. Dr. Neuwald has applied these programs to various biological problems. One important early discovery was that Barth syndrome, an inherited cardiomyopathic disorder, is due to an acyltransferase deficiency. This led to clinical confirmation and to potential treatments for this disease. Likewise, using these approaches, he structurally and functionally defined the AAA+ class of chaperone-like ATPases, which led to many follow-up studies and more than 1,250 citations. His other studies led to the discovery of unanticipated protein internal repeats, including subtle β-propeller-like repeats in UV-damaged DNA-binding protein and HEAT repeats in certain chromosome condensation components, the latter leading to the discovery (in collaboration with Tatsuya Hirano at Cold Spring Harbor Laboratory) of a new vertebrate condensin component.

Dr. Neuwald was also the first to formulate (in collaboration with Jun S. Liu) and implement Bayesian strategies to infer likely determinants of underlying biochemical functions based on correlations in protein sequences. He has applied this approach to P-loop GTPases, protein kinases, N-acetyltransferases, AAA+ ATPases and DNA clamps. In particular, analyses of protein kinases (in collaboration with Natarajan Kannan at the Univ. of Georgia) have led to multiple follow-up experimental studies. More recently, Dr. Neuwald has developed and implemented multidimensional Bayesian sampling programs for the creation, optimization and enhancement of protein domain hierarchies; the National Center for Biotechnology Information (NCBI) has integrated these into its Conserved Domain Database pipeline. Most recently, he has developed (in collaboration with Stephen Altschul at the NCBI) and applied statistical methods for identifying biologically important protein structural features. He is now focusing on making the information gleaned through these statistical approaches widely available to biomedical researchers over the World Wide Web.

Research/Clinical Keywords

Protein sequence and structural analysis, Bayesian statistics, Computational Biology

Highlighted Publications

Neuwald, A.F. 2014. A Bayesian sampler for optimization of protein domain hierarchies. Journal of Computational Biology 21(3): 269-286.

Neuwald, A.F. 2016. Gleaning structural and functional information from correlations in protein multiple sequence alignments. Current Opinion in Structural Biology. 38:1-8.

Neuwald, A.F. and S. F. Altschul. 2016. Bayesian Top-down Protein Sequence Alignment with Inferred Position-Specific Gap Penalties. PLoS Comp. Biol. 12(5): e1004936.

Neuwald, A.F. and S. F. Altschul. 2016. Inference of Functionally-Relevant N-Acetyltransferase Residues Based on Statistical Correlations. PLoS Comp. Biol. In revision.

Additional Publication Citations

Neuwald A.F. 1997. Barth syndrome may be due to an acyltransferase deficiency. Current Biology 7: R465-R466.

Neuwald A.F., Aravind L., Spouge J. L. and Koonin E. V. 1999. AAA+: a class of chaperone-like ATPases associated with the assembly, operation and disassembly of protein complexes. Genome Research 9:27-43.

Neuwald A.F. and Poleksic A. 2000. PSI-BLAST searches using hidden Markov models of structural repeats: Prediction of an unusual sliding DNA clamp and of β-propellers in UV-damaged DNA binding protein. Nucleic Acids Research 28(18): 3570-3580.

Neuwald A.F. and Hirano T. 2000. HEAT repeats associated with condensins, cohesins, and other complexes involved in chromosome-related functions. Genome Research 10(10): 1445-1452.

Neuwald A.F., Kannan N., Poleksic A., Hata N., and Liu J.S. 2003. Ran’s C-terminal, basic patch and nucleotide exchange mechanisms in light of a canonical structure for Rab, Rho, Ras and Ran GTPases. Genome Research 13(4): 673-692.

Neuwald A.F. 2003. Evolutionary clues to DNA polymerase III β clamp structural mechanisms. Nucleic Acids Research 31(15): 4503-4516.

Ono T., Losada A., Hirano M., Myers M.P., Neuwald A.F. and Hirano T. 2003. Differential contributions of condensin I and condensin II to mitotic chromosome architecture in vertebrate cells. Cell 115: 109-121.

Kannan N. and Neuwald A.F. 2004. Evolutionary constraints associated with functional specificity of the CMGC protein kinases MAPK, CDK, GSK, SRPK, DYRK, and CK2α. Protein Science 13(8): 2059-2077.

Saitoh N., Spahr C.S., Patterson S.D., Bubulya P., Neuwald A.F. and Spector D.L. 2004. Proteomic analysis of interchromatin granule clusters. Mol. Biol. Cell. 15(8): 3876-3890.

Neuwald A.F. and Liu J.S. 2004. Gapped alignment of protein sequence motifs through Monte Carlo optimization of a hidden Markov model. BMC Bioinformatics 5: 157 (16 pages).

Neuwald A.F. 2005. Evolutionary clues to eukaryotic DNA clamp-loading mechanisms: analysis of the functional constraints imposed on replication factor C AAA+ ATPases. Nucleic Acids Research 33: 3614-3628.

Kannan N. and Neuwald A.F. 2005. Did protein kinase regulatory mechanisms evolve through elaboration of a simple structural component? Journal of Molecular Biology 351: 956-972.

Neuwald A.F. 2006. Bayesian shadows of molecular mechanisms cast in the light of evolution. Trends in Biochemical Sciences 31(7): 374-382. Reviews the statistical and

Neuwald A.F. 2006. Hypothesis: bacterial clamp loader AAA+ ATPase activation through DNA-dependent repositioning of the catalytic base and of a trans-acting catalytic threonine. Nucleic Acids Research 34(18): 5280-5290.

Kannan N., Haste N., Taylor S. S. and Neuwald A.F. 2007. The hallmark of AGC kinase functional divergence is its C-terminal tail, a cis-acting regulatory module. Proc. Natl. Acad. Sci., USA 104(4): 1272-1277.

Neuwald A.F. 2007. The CHAIN program: forging evolutionary links to underlying mechanisms. Trends in Biochemical Sciences 32: 487-493.

Neuwald A.F. 2007. Gα-Gβγ dissociation may be due to retraction of a buried lysine and disruption of an aromatic cluster by a GTP-sensing Arg-Trp pair. Protein Science 16(11): 2570-2577.

Kannan, N., A.F. Neuwald and S. S. Taylor. 2007. Analogous regulatory sites within the αC-β4 loop regions of ZAP70 tyrosine kinase and AGC kinases. Biochimica et Biophysica Acta (BBA)-Proteins & Proteomics 1784(1): 27-32.

Kannan, N., J. Wu, G. S. Anand, S. Yooseph, A. F. Neuwald, J. C. Venter and S. S. Taylor. 2008. Evolution of allostery in the cyclic nucleotide binding module. Genome Biology 12(8): R264.

Neuwald, A.F. 2009. The glycine brace: a component of Rab, Rho, and Ran GTPases associated with hinge regions of guanine- and phosphate-binding loops. BMC Structural Biology 9: 11.

Neuwald, A.F. 2009. The charge-dipole pocket: a defining feature of signaling pathway GTPase on-off switches. Journal of Molecular Biology 390: 142-153. Reports an important structural component distinguishing signaling pathway GTPase on-off switches from other P-loop GTPases.

Neuwald, A.F. 2009. Rapid detection, classification and accurate alignment of up to a million or more related protein sequences. Bioinformatics 25: 1869-1875.

Ammerman, N.C., J. J. Gillespie, A. F. Neuwald, B. W. Sobral, A. F. Azad. 2009. A typhus group-specific serine protease defies the nature of reductive evolution in Rickettsia. Journal of Bacteriology 191:7609-7613.

Iskow, R., McCabe, M., Mills, R., Torene, S., Pittard, W. S., Neuwald, A.F., Van Meir, E., Vertino, P. and S. E. Devine. 2010. Natural mutagenesis of human genomes by endogenous retrotransposons. Cell 141(7):1253-1261.

Neuwald, A.F. 2010. Bayesian classification of residues associated with protein functional divergence: Arf and Arf-like GTPases. Biology Direct 5: 66 (17 pages).

Neuwald, A.F. 2011. Surveying the manifold divergence of an entire protein class for statistical clues to underlying biochemical mechanisms. Statistical Applications in Genetics and Molecular Biology 10(1): Article 36. (30 pages)

Neuwald, A.F., Lanczycki, C.J. and A. Marchler-Bauer. 2012. Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures. BMC Bioinformatics 13:144.

Neuwald, A.F. 2014. Evaluating, comparing and interpreting protein domain hierarchies. Journal of Computational Biology 21(4): 287-302.

Neuwald, A.F. 2014. Protein Domain Hierarchy Gibbs Sampling Strategies. Statistical Applications in Genetics and Molecular Biology 13(4):497-517.

Oruganty K., Talevich E.E., Neuwald A.F., Kannan N. 2016. Identification and classification of small molecule kinases: insights into substrate recognition and specificity. BMC Evol Biol. 16(1):7.

Awards and Affiliations

Mitchell prize for the best Bayesian application paper, 2000

University of Wisconsin-Eau Claire President's Award, 2006

1997 - 2001, Assistant Investigator/Professor, Cold Spring Harbor Laboratory, NY

2001 - 2006, Associate Professor, Cold Spring Harbor Laboratory, NY

Grants and Contracts

9/13-8/14         Role:   PI    (65 %) 

                        “NCBI Software Support for Molecular Biology and Genomics Information

                        Resource”  NIH contract (HHSN27600001)

                        $204,428 awarded (total direct and indirect costs)


6/12-9/13         Role:   PI    (65 %) 

                        “NCBI Software Support for Molecular Biology and Genomics Information

                        Resource”  NIH contract (HHSN27600002)

                        $291,279 awarded (total direct and indirect costs)


9/06-8/11         Role:   PI    (90 %) 

                        “Predicting Common Protein Mechanisms by the Light of Evolution”

                        NIH R01 Grant (GM078541) (last year a no cost extension)

                        $1,174,834 awarded (total direct and indirect costs)


9/01-8/07         Role:    PI   (17-100 %) 

                        “Advanced Sequence-Based Prediction of Protein Function”

                        NIH R01 Grant (LM06747)

                        $1,125,000 awarded (total direct and indirect costs)


9/98-8/01         Role:  PI    (50-100 %)           

                        “Advanced Sequence-Based Prediction of Protein Function”

                        NIH R01 Grant (LM06747)

                        $904,300 awarded (total direct and indirect costs)


9/98-8/01         Role:  Co-PI  (% effort not applicable)

                        NSF Major Instrumentation Grant (9871174)

                        $113,615 awarded for purchase of a multiprocessor computer


2002                Role:   PI    (% effort not applicable)

                        “Advanced Sequence-Based Prediction of Protein Function”

                        NIH Supplemental (instrumentation) Grant (LM06747-04S1)

                        $151,953 awarded