Recently, Homburger et al. used a spatial scan statistic to identify three-dimensional protein regions that are enriched in pathogenic variants and depleted in benign variants. Following their example, we classified human G6PD variants found in the reference population as benign (figure 1A). In addition, many of these variants were predicted to be benign by two different variant prediction algorithms, PolyPhen2 and SIFT.
Briefly, the spatial scan statistic is calculated by defining a sphere (we used a radius of 15 Å) centered at each residue in the protein’s crystal structure and then comparing the number of pathogenic and benign variants inside the sphere to the number of pathogenic and benign variants outside the sphere (figure 1B). To capture patterns of variation across oligomeric interfaces, we used the tetrameric structure of G6PD (PDB: 1QKI). We calculated the spatial scan statistic three times, using either class I (<10% G6PD activity, severe phenotype), class II (<10% activity, mild phenotype), or class III (10–60% activity, mild phenotype) as pathogenic variants. (Spatial scan statistic values and p-values are included in the supplementary data.) The statistic was mapped onto the 3D structure of G6PD using a colored scale for ease of visualization (figure 1C–E).
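The sphere-based comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used here or by Homburger et al.: the text does not give the exact formula, so the sketch assumes a Bernoulli log-likelihood ratio of the kind commonly used for spatial scan statistics. The function names (`bernoulli_llr`, `spatial_scan`), the one-coordinate-per-residue input, and the toy coordinates and labels are all hypothetical.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c_in, n_in, c_total, n_total):
    """Assumed scoring: Bernoulli log-likelihood ratio for enrichment of
    pathogenic variants inside a sphere relative to outside it.
    c_* are pathogenic counts, n_* are pathogenic + benign counts."""
    c_out, n_out = c_total - c_in, n_total - n_in
    if n_in == 0 or n_out == 0:
        return 0.0
    p_in, p_out, p_all = c_in / n_in, c_out / n_out, c_total / n_total
    if p_in <= p_out:  # score enrichment only; depletion is scored as 0
        return 0.0

    def loglik(c, n, p):
        return _xlogy(c, p) + _xlogy(n - c, 1 - p)

    return (loglik(c_in, n_in, p_in) + loglik(c_out, n_out, p_out)
            - loglik(c_total, n_total, p_all))


def spatial_scan(coords, labels, radius=15.0):
    """coords: residue -> (x, y, z) in angstroms, one point per residue.
    labels: residue -> 'pathogenic' or 'benign' for residues with variants.
    Returns residue -> score for a sphere centered at that residue."""
    n_total = len(labels)
    c_total = sum(1 for v in labels.values() if v == "pathogenic")
    scores = {}
    for center, xyz in coords.items():
        inside = [r for r in labels if math.dist(xyz, coords[r]) <= radius]
        c_in = sum(1 for r in inside if labels[r] == "pathogenic")
        scores[center] = bernoulli_llr(c_in, len(inside), c_total, n_total)
    return scores


# Toy data (made up): two clusters of residues 40 angstroms apart.
toy_coords = {1: (0, 0, 0), 2: (5, 0, 0), 3: (10, 0, 0),
              4: (40, 0, 0), 5: (45, 0, 0), 6: (50, 0, 0)}
toy_labels = {1: "pathogenic", 2: "pathogenic", 3: "pathogenic",
              4: "benign", 5: "benign", 6: "benign"}
scores = spatial_scan(toy_coords, toy_labels)
```

In this toy case, spheres centered in the first cluster contain only pathogenic variants and receive a positive score, whereas spheres centered in the second cluster contain only benign variants and score zero. Running the scan once per variant class, as described above, would amount to relabeling which variants count as pathogenic before each call.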
We observed the strongest enrichment of class I variants around the dimer interface and the structural NADP+ binding site (figure 1C), an allosteric site important for the stability of G6PD. Conversely, class II and class III variants were depleted in the dimer interface (figure 1D–E, middle). This confirms that dimerization of G6PD is essential for activity: variants that disrupt dimerization lead to the most severe loss of activity and to the clinical phenotype of chronic anemia (class I).
Class II variants were strongly enriched in an alpha-helix that supports and partially forms the tetramer interface (figure 1D, right), whereas class III variants were not enriched in this region (figure 1E, right). This means that variants at or near the tetramer interface, which likely disrupt tetramerization, result in a more severe G6PD deficiency phenotype (class II as compared with class III). We conclude that disruption of tetramerization leads to very low (<10%) G6PD activity, and therefore tetramerization is important for G6PD activity. Based on the previous finding that tetramerization protects G6PD against inhibition by NADPH, it is possible that variants that disrupt tetramerization allow NADPH-induced inhibition of G6PD activity, leading to <10% G6PD activity and a class II G6PD deficiency phenotype. Additionally, although G6PD dimerization is sufficient for catalytic activity in vitro, it is possible that tetramerization improves catalytic efficiency and/or stability of the enzyme under physiological conditions. (Note that protein stability is particularly important in anucleate erythrocytes, which have a lifespan of 100–120 days and minimal or no de novo synthesis.)
Class III variants were enriched in the catalytic domain, peripheral to and not directly overlapping with the catalytic pocket. This suggests that class III variants reduce catalytic activity by directly affecting the conformation of the catalytic pocket and likely have few other structural or allosteric consequences, leading to a mild clinical phenotype.
This analysis also provides a structural explanation for why class I and II variants, which are defined by the same activity range (<10% compared to normal), lead to different clinical outcomes: class I causing chronic anemia and class II causing episodic, trigger-induced anemia. As previously shown, class I variants generally have low protein stability whereas class II variants generally have low catalytic activity. This analysis reveals the structural elements that likely contribute to these different biochemical properties: the structural NADP+ binding site and dimer interface, which are enriched in class I variants, likely contribute to protein stability, whereas the tetramer interface, which is enriched in class II variants, likely contributes to catalytic activity.