What are AAVs? and what makes them good candidates for gene therapy?

By: Christopher Reardon, Stephen Malina, and Eryney Marrogi


In the first part of our three-part series introducing AAV as a gene therapy vector, we covered basic AAV vector biology. In this post, we’re going to take a step back to answer the question “Why AAV?” and look at some opportunities in the AAV engineering space.

Why AAV?

Viral vectors are one of the three main classes of gene therapy delivery vehicles. The other two are lipid nanoparticles (LNPs), now famous for their role in mRNA vaccines, and plasmid electroporation. Relative to these other options, viral vectors offer higher delivery efficiency, better targeting, and a long clinical history that began with their use in vaccines and, more recently, extends to gene therapy.

Three main viral vectors are used in gene therapy [1]: lentiviruses, adenoviruses, and AAVs. We can assess them in terms of safety, production robustness and scalability, ability to target specific tissues (tropism), and packaging capacity. Each makes different trade-offs among these characteristics.

Lentiviruses are single-stranded RNA viruses originally derived from natural retroviruses. When used for gene therapy, lentiviruses have the advantage of a large packaging capacity (~9 kb) but present challenges related to genome integration [2]. Adenoviruses are double-stranded DNA viruses best known for causing the common cold. The newest generation of adenoviruses can package large transgenes of up to ~36 kb and combine high transduction efficiency with scalable viral production. However, they tend to trigger strong immune responses in humans, which limits their use in gene therapy, particularly in immunocompromised patients [1].

Compared to adenovirus and lentivirus, AAV offers a low immunogenic profile, relative safety, long-lasting expression in non-dividing cells, and a long scientific and clinical history. Two AAV-based gene therapies, Luxturna® and Zolgensma®, have already been approved by the FDA, and many more [4] are in various stages of clinical trials. However, AAV can only package ~4.7 kb of linear single-stranded DNA, meaning that packaging full genes encoding large proteins (such as dystrophin) currently requires gene size optimization and/or spreading a gene across multiple vectors.
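To illustrate the packaging constraint, here is a minimal Python sketch that checks whether a gene cassette fits within the ~4.7 kb limit quoted above. The element lengths are hypothetical placeholders chosen only for illustration, and `vectors_needed` is a helper we made up for this example, not a tool from the post.

```python
import math

AAV_CAPACITY_BP = 4700  # ~4.7 kb packaging limit quoted in the text


def vectors_needed(cassette_bp, capacity_bp=AAV_CAPACITY_BP):
    """Naive lower bound on how many vectors a cassette of this size would span."""
    return math.ceil(cassette_bp / capacity_bp)


# Hypothetical element lengths in base pairs (illustrative placeholders only).
cassette = {"promoter": 600, "large_transgene": 11000, "polyA_signal": 200}

total = sum(cassette.values())
fits = total <= AAV_CAPACITY_BP

print(f"Cassette length: {total} bp")
print(f"Fits in a single AAV: {fits}")
print(f"Vectors needed (lower bound): {vectors_needed(total)}")
```

The count is only a lower bound, since strategies that split a gene across vectors typically need extra sequence so the pieces can be rejoined inside the cell.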

Opportunities in AAV engineering

While it’s impressive how well natural variants of AAV work for gene therapy, they lack certain features that would make them even more useful. As mentioned above, they can only package ~4.7 kb, so fitting larger transgenes into a vector requires clever, but error-prone, strategies. In addition, although different natural variants do weakly target certain tissue types, this targeting has not been optimized for precision. Finally, although the AAV capsid does not trigger a strong immune response, natural immunity to it can develop, which makes repeated dosing difficult by increasing the risk of side effects and reducing dosing efficiency.

To overcome these challenges, researchers have focused on improving the AAV capsid protein and designing promoters that modulate expression of the delivered gene. The first of these, capsid design, involves modifying the capsid gene to change its function. Existing work in this category has focused on:

  • Increasing packaging capacity beyond ~4.7 kb,
  • Targeting (and de-targeting) specific tissues such as the brain [3] or liver, and
  • Improving the ability of the capsid to evade the immune system, even after repeated administration.

However, any approach to solving these problems must grapple with narrowing the space of possible capsid variants. Our goal in design is to find a set of promising variants that’s small enough to test in vitro or in vivo (scale) while maximizing the fraction of capsids in this set that are viable (efficiency). Achieving this poses a substantial challenge, given that even a single VP1 monomer has 20^735 possible variants. Traditional approaches typically optimize for either scale or efficiency, but not both.
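To get a feel for how large 20^735 is, the short Python sketch below converts it to a power of ten. It assumes, as the figure above does, 20 possible amino acids at each of 735 positions. The variable names are ours.

```python
import math

NUM_AMINO_ACIDS = 20  # standard amino acid alphabet
VP1_LENGTH = 735      # number of positions implied by the 20^735 figure

# log10(20^735) = 735 * log10(20)
log10_space = VP1_LENGTH * math.log10(NUM_AMINO_ACIDS)

# Python integers are arbitrary precision, so we can also count digits exactly.
exact_digits = len(str(NUM_AMINO_ACIDS ** VP1_LENGTH))

print(f"20^735 is roughly 10^{log10_space:.1f}")
print(f"Written out in full, it has {exact_digits} digits")
```

In other words, the full sequence space is on the order of 10^956, far beyond anything that could be enumerated or physically synthesized.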


For instance, directed evolution and unbiased random mutagenesis approaches test as many as 10^10 variants, which, while large, amounts to a drop in the bucket of a space that is many orders of magnitude larger. Additionally, in the case of the “massive” random mutagenesis libraries, the majority of tested variants end up nonviable, making for an extremely inefficient search. On the other hand, rational design approaches make targeted mutations to small sub-regions of the capsid based on hard-won biological knowledge, increasing efficiency at the cost of reducing scale. Rational design approaches excel at discovering small local perturbations to the capsid that increase fitness, but they lack the scale and throughput to identify mutations that take advantage of unexpected, potentially nonlinear effects of mutations to disjoint capsid sub-regions.

At Dyno, we hope to find the right balance of efficiency and scale by combining machine-guided design with high-throughput screening. Although our high-throughput screens typically test between 10^4 and 10^5 variants per library, as we’ve shown in previous work, our machine learning models increase the efficiency of these libraries by vastly increasing the proportion of viable variants in the library [5, 6]. By directly modeling the relationship between sequence and function, we can further guide our search towards capsids that not only produce but are also optimized for desired properties.
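To make the “drop in the bucket” comparison concrete, the Python sketch below counts how many distinct variants sit exactly k amino acid substitutions away from a 735-residue sequence, and compares each count with a 10^10-variant library. It considers substitutions only (no insertions or deletions), which is a simplifying assumption, and `variants_at_distance` is a name we chose for illustration.

```python
from math import comb

SEQUENCE_LENGTH = 735   # positions in a VP1 monomer, as in the 20^735 figure
LIBRARY_SIZE = 10 ** 10  # upper end of library sizes mentioned in the text


def variants_at_distance(length, k, alphabet=20):
    """Count sequences that differ from a reference at exactly k positions."""
    # Choose which k positions change, then pick one of the 19 other residues at each.
    return comb(length, k) * (alphabet - 1) ** k


for k in range(1, 5):
    count = variants_at_distance(SEQUENCE_LENGTH, k)
    covered = min(1.0, LIBRARY_SIZE / count)
    print(f"k={k}: {count:.3e} variants, library could cover at most {covered:.2%}")
```

Under these assumptions, a 10^10 library could hold every single and double mutant but already falls short of the triple mutants, and the gap widens rapidly from there.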


Taking a step back, it’s important to put AAV engineering in context with respect to the larger project of improving our ability to control cellular outcomes and behavior. Allowing ourselves to speculate a little, we can think of AAV as an impressive but imperfect starting point for a future generic in vivo gene delivery system. Looked at through this lens, any deficiency of AAV that prevents it from safely delivering genes (repeatedly) in all patients with arbitrary tissue and cell type specificity in a low dosing regime presents an opportunity for AAV engineering.

As we said in our prior post, this wouldn’t be a company blog post if we didn’t mention that we’re currently hiring for a range of technical and non-technical roles. If you’re excited about working with us, please apply and, for the authors’ sake, mention that our blog post played a role in getting you excited about Dyno!

Special thanks to: Adam Poulin-Kerstein, Alex Brown, Cherry Gao, Eric Kelsic, Heikki Turunen, Jeff Gerold, and Sam Sinai for helpful comments. 

[1]: Bulcha, J.T., Wang, Y., Ma, H. et al. Viral vector platforms within the gene therapy landscape. Sig Transduct Target Ther 6, 53 (2021). https://doi.org/10.1038/s41392-021-00487-6

[2]: Milone, M.C., O’Doherty, U. Clinical use of lentiviral vectors. Leukemia 32, 1529–1541 (2018). https://doi.org/10.1038/s41375-018-0106-0 

[3]: Ravindra Kumar, S., Miles, T.F., Chen, X. et al. Multiplexed Cre-dependent selection yields systemic AAVs for targeting distinct brain cell types. Nat Methods 17, 541–550 (2020). https://doi.org/10.1038/s41592-020-0799-7

[4]: Kuzmin, D.A. et al. The clinical landscape for AAV gene therapies. Nat Rev Drug Discov (2021).

[5]: Sinai, S. et al. Generative AAV capsid diversification by latent interpolation. bioRxiv (2021).

[6]: Bryant, D.H., Bashir, A., Sinai, S. et al. Deep diversification of an AAV capsid protein by machine learning. Nat Biotechnol 39, 691–696 (2021). https://doi.org/10.1038/s41587-020-00793-4

What are AAVs? and what makes them good candidates for gene therapy?

By: Stephen Malina, Eryney Marrogi, and Christopher Reardon.

In two previous posts, we introduced gene therapy, a method for curing genetic diseases by providing healthy copies of defective genes, and Adeno-associated virus (AAV) capsids, the gene therapy delivery system Dyno focuses on. In those posts, we also discussed how natural variants of AAV did not evolve for the specialized functions to which we now seek to apply them, which is why Dyno is applying machine learning and high-throughput techniques to better engineer AAV.

This post (the first in a three-part series on AAV) provides an overview of AAV as a gene therapy vector, focusing primarily on the genetic and protein structure of an AAV capsid. Before diving in, we want to note upfront that this series is not intended to be a comprehensive overview of AAV research. Instead, it’s better thought of as an overview of selected AAV knowledge we at Dyno have learned and found helpful while working in this space. While all readers are welcome, we’ve tried to write for readers who have a minimal biology background and no specific knowledge of virus biology or AAV, but who are eager to learn. In addition to summarizing AAV biology and engineering basics, we’ve given our perspective on some promising directions in AAV engineering and a sneak peek at Dyno’s AAV engineering workflow.

With that, let’s dive in.

AAV Vector Biology


Viruses evolved to infect cells and leverage their cellular machinery to make copies of themselves, and AAV is no exception. To accomplish this, AAV first infects a cell, and its genetic material is shuttled into the nucleus for transcription. Second, cellular machinery expresses AAV’s viral genome as proteins, which assemble into a viable viral shell with a copy of the viral genome packaged inside it. What distinguishes AAV from many other viruses is 1) its small genome size and 2) its inability to replicate in the absence of a helper virus, which is why the third step, “replicate exponentially and destroy the cell,” is missing.

As a single-stranded DNA virus, AAV has a genome comprising 4.7 kb of linear DNA. Amazingly, this short DNA codes for at least 9 distinctly expressed proteins, which span a total unrolled length of over 12 kb. To achieve this, the genome takes advantage of staggered and alternative start sites and splicing to create multiple proteins. In practice, this means that the same DNA nucleotides often simultaneously encode different segments of multiple proteins in the AAV genome. AAV’s small size is a double-edged sword from our perspective as AAV engineers. On one hand, its simplicity makes understanding its function more tractable. On the other hand, its heavy use of DNA-overlap tricks makes engineering more difficult because a single nucleotide mutation can impact multiple proteins.
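To make the overlap point concrete, here is a minimal Python sketch. The 18-nucleotide sequence is a made-up toy, not AAV sequence, and `translate` is a hypothetical helper built on the standard genetic code. It reads the same DNA in two reading frames and checks how a single nucleotide substitution affects each translation.

```python
# Standard genetic code, built from the conventional TCAG-ordered codon table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate dna starting at offset `frame`, ignoring a trailing partial codon."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


toy = "ATGGCTAAGCTGGATCCA"        # hypothetical toy sequence, not from AAV
mutant = toy[:4] + "A" + toy[5:]  # single substitution (C -> A) at index 4

for frame in (0, 1):
    before, after = translate(toy, frame), translate(mutant, frame)
    print(f"frame {frame}: {before} -> {after} (changed: {before != after})")
```

Because the mutated nucleotide belongs to a codon in both frames, one substitution changes both translated peptides, which is the kind of coupling that complicates engineering overlapping genes.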

As mentioned, AAV lacks the full set of genes required to make copies of itself in a cell (it is replication incompetent). In addition to relying on the genetics and cellular machinery of its host organism, it needs genes from a helper virus, classically adenovirus (hence the name adeno-associated virus), in order to replicate. This replication incompetence makes AAV useful for gene therapy. Gene therapy vectors are intended to deliver therapeutic genes without making copies of themselves. Whereas other vectors require careful engineering to handicap their ability to replicate in vivo, with AAV we get this property automatically.

Amongst the genes that are involved in AAV’s functions, the two most critical are rep and cap. The rep gene encodes four proteins and plays a key role in enabling AAV replication. For the purpose of gene therapy, rep is only used during the AAV production stage. The cap gene, short for capsid, encodes three viral proteins (VP1, VP2, and VP3), which combine to form the shell of the AAV capsid (see next section for additional details). In addition to its coding regions, the AAV ssDNA sequence is flanked at each end by an inverted terminal repeat (ITR) made up of repeated sequences that self-complement, allowing the structure to fold and provide stability to each end of the genome as a defense against degradation. The ITRs also play a key role in integration into and rescue from the host cell genome and in loading of the genome into the AAV capsid particle, and they even act as promoters for second strand synthesis and protein expression (source).
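The self-complementarity that lets an ITR fold back on itself can be illustrated with a short Python sketch. The sequence below is a made-up toy hairpin rather than a real ITR (real ITRs fold into a more elaborate structure than this single stem-loop), and `arms_can_pair` is a hypothetical helper that only checks perfect Watson-Crick pairing between the two ends.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def arms_can_pair(dna, stem_len):
    """True if the first and last stem_len bases can base-pair as a hairpin stem."""
    return dna[:stem_len] == reverse_complement(dna[-stem_len:])


stem = "GGCCTCAG"                                    # hypothetical toy stem
toy_hairpin = stem + "TTT" + reverse_complement(stem)  # stem, loop, matching arm

print(toy_hairpin)
print("Ends can pair:", arms_can_pair(toy_hairpin, len(stem)))
```

A sequence whose two arms are reverse complements of each other can fold into a stem closed by a loop, which is the basic idea behind the folded, stabilized genome ends described above.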

When converting a natural AAV into a gene therapy vector, its genome is dissected, manipulated, and reassembled to make room to include therapeutic genes (transgenes). Research into the basic biology of AAV has enabled transformation of the natural AAV genome into a safe and effective gene therapy vector.




Model of AAV with all monomers assembled (left) and a single composite of the three AAV capsid monomers (VP1/2/3) (right), with coloration by secondary structure.

AAV’s prominence in gene therapy can also be understood by examining its structure (visually depicted above). Out of its two genes, the cap gene has a larger influence on the structure-function relationship. To start, cap expresses structural proteins VP1, VP2 and VP3, which interact to form the viral capsid from sixty copies of the VP monomers. Beyond contributing to the iconic icosahedral form, cap plays a critical role in determining tropism, the virus’ ability to infect a particular cell or organ type. Due to the capsid protein’s importance in capsid attachment to cellular receptors involved in tissue tropism, manipulation of the cap gene appears to be the best route to the selective tropism needed for gene therapy applications. 



AAV engineering focuses heavily on the interaction of AAV with host cells in an attempt to manipulate viral entry, i.e. transduction. While each AAV serotype (natural variant) might interact with unique receptor proteins, all serotypes follow the same general mechanism (depicted diagrammatically below) for entering and transducing a cell. For example, one of the most studied serotypes, AAV2, identifies viable target cells through the cell-surface receptor heparan sulfate proteoglycan (HSPG). Once the capsid finds a valid binding site, AAV2 enters the cell via receptor-mediated endocytosis through clathrin-coated pits, and eventually escapes its endosome within the cytoplasm to attempt entry into the nucleus. Finally, the viral vector enters the nucleus through the nuclear pore complex, and transcription takes place inside the nucleus [1].

After establishing itself within the nucleus, AAV relies on host-cell mechanisms for both genome replication and, in the case of engineered viruses, transgene expression (conversion into proteins). When designing capsids for therapeutic use, engineering the capsid can help the AAV find a target host cell, and engineering other components of the AAV can further improve the therapeutic. For example, AAV constructs can be selectively expressed within specific cell types through careful promoter selection, allowing increased control of engineered viruses in a therapeutic context. The right promoter can dictate when and where an AAV construct starts expressing, ultimately helping to make the therapy safe and effective. Promoter selection and capsid engineering are our main tools for making AAV a powerful gene therapy vector.

Source: Engineering adeno-associated virus vectors for gene therapy [2]


During its 50-year history, the field of AAV biology has built up an impressive edifice of knowledge about AAV’s genome, structure, function, and engineering. We’ve only scratched the surface of each of these topics in this post, but we have hopefully provided enough of an overview to help you understand the potential of engineered AAV as a gene therapy vector and some of the challenges the field needs to overcome in order to realize this potential.

Finally, this wouldn’t be a company blog post if we didn’t mention that we’re currently hiring for a range of technical and non-technical roles. If you’re excited about working with us, please apply and, for the authors’ sake, mention that our blog post played a role in getting you excited about Dyno!

Special thanks to: Adam Poulin-Kerstein, Alex Brown, Cherry Gao, Eric Kelsic, Heikki Turunen, Jeff Gerold, and Sam Sinai for helpful comments. 


[1]: Martini, S.V., Rocco, P.R.M., Morales, M.M. Adeno-associated virus for cystic fibrosis gene therapy. Braz J Med Biol Res 44, 1097–1104 (2011).

[2]: Li, C., Samulski, R.J. Engineering adeno-associated virus vectors for gene therapy. Nat Rev Genet 21, 255–272 (2020). https://doi.org/10.1038/s41576-019-0205-4

