Clinical NLGN1 Variant Submissions Do Not Support a Simple Splice-Site Hotspot Model
Neuroligin-1 (NLGN1) is often discussed through two narrow lenses: extracellular neurexin-binding biology and autism-associated clinical variation. That framing suggests an attractive hypothesis: clinically submitted NLGN1 protein variants might concentrate in splice-regulated or extracellular adhesion regions that tune neuroligin-neurexin function. We tested that hypothesis directly using public variant and protein-annotation resources. ClinVar contained 169 NLGN1 records, including 43 protein-change submissions that could be mapped unambiguously to the UniProt Q8N2Q7 coordinate system by reference-amino-acid matching. As a background catalogue, we mapped 603 Ensembl/dbSNP missense variants to the same protein coordinate system using Ensembl VEP. We then compared clinical and background variant localization across the extracellular domain, cholinesterase-like core, splice-site A segment, glycosylation/disulfide-proximal windows, juxtamembrane segment, transmembrane helix, and cytoplasmic tail. The public data did not support a simple clinical hotspot model. Mapped ClinVar variants were not enriched in the extracellular domain, cholinesterase-like core, splice-site A segment, or post-translational/structural windows relative to dbSNP missense variants. Although ClinVar variants were closer to splice-site A by median distance, only one mapped variant lay inside the A segment. UniProt-curated functional natural variants also spanned both extracellular and cytoplasmic positions. These results argue that current public clinical submissions are too sparse and VUS-heavy to justify a splice-site-centered clinical interpretation of NLGN1. Mechanistic claims about NLGN1 variants should remain variant-specific until patient-level genotype-phenotype data or functional assays provide stronger evidence.
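The abstract does not spell out how "reference-amino-acid matching" was implemented. One plausible reading, sketched below in Python, is that a protein change is accepted only when the residue it names as the reference is actually present at that position in the target sequence. The function name `map_by_reference_residue`, the regular expression, and the toy sequence are all assumptions made for illustration. The toy sequence is not the Q8N2Q7 sequence, and the sketch handles only simple three-letter substitutions.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches simple substitutions such as "p.Arg5Cys" (assumed input style).
HGVS_P = re.compile(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def map_by_reference_residue(hgvs_p, protein_seq):
    """Return the 1-based position if the stated reference residue matches
    the sequence at that position; otherwise return None."""
    match = HGVS_P.search(hgvs_p)
    if match is None:
        return None
    ref3, pos, alt3 = match.group(1), int(match.group(2)), match.group(3)
    ref1 = THREE_TO_ONE.get(ref3)
    # Reject anything that is not a standard residue (e.g. "Ter").
    if ref1 is None or alt3 not in THREE_TO_ONE:
        return None
    # Reject positions that fall outside the sequence.
    if not 1 <= pos <= len(protein_seq):
        return None
    return pos if protein_seq[pos - 1] == ref1 else None


# Hypothetical toy sequence, NOT the real UniProt Q8N2Q7 sequence.
toy_seq = "MALPRCW"
accepted = map_by_reference_residue("p.Arg5Cys", toy_seq)  # residue 5 is R
rejected = map_by_reference_residue("p.Arg4Cys", toy_seq)  # residue 4 is P
print(accepted, rejected)
```

A check of this kind confirms only that a variant is consistent with the chosen sequence. It cannot show that the submission was made against the same isoform, because different isoforms may carry the same residue at a given position.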
Reviews
This manuscript asks a well-posed, falsifiable question: do clinically submitted NLGN1 protein variants cluster in splice-regulated/extracellular regions (e.g., the A splice segment or adhesion/cholinesterase-like core) more than a background set? The approach (mapping ClinVar protein-change submissions onto UniProt Q8N2Q7 coordinates, constructing a comparison catalogue of missense variants from Ensembl/dbSNP via VEP, and comparing counts/distances across annotated protein regions) is conceptually appropriate, and the negative result is plausibly informative given the current state of public submissions. The conclusion is appropriately restrained: current public clinical submissions are sparse and VUS-heavy, so a simple hotspot/splice-centric interpretive model is not justified.

The main weakness is methodological and statistical specificity. From the excerpt, it is unclear (i) how isoforms/transcripts were handled beyond “reference-amino-acid matching,” (ii) how multiple ClinVar submissions for the same variant were de-duplicated, (iii) what the precise enrichment test/model was (or whether any formal test was performed), and (iv) whether the chosen background (dbSNP/Ensembl missense) is an appropriate null given strong ascertainment differences between population variation and clinically submitted variants (coverage, filtering, allele frequency, constraint, and reporting practices). Region definitions such as “glycosylation/disulfide-proximal windows” also introduce researcher degrees of freedom unless pre-registered or clearly justified, and the small mapped ClinVar set (n=43) limits power.

Overall, the central conclusion (no strong evidence for a simple splice-site hotspot in currently available submissions) is reasonable, but claims about “not enriched” should be phrased as “no detectable enrichment under the chosen mapping/definitions” unless accompanied by explicit statistics and sensitivity analyses.
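To make the request for explicit statistics concrete, one candidate test is a two-sided Fisher's exact test on a 2x2 table of region membership (inside versus outside) by variant set (ClinVar versus background). The Python sketch below is illustrative and is not a reconstruction of the authors' analysis. The ClinVar counts (1 of 43 mapped variants inside the A segment) and the background total of 603 come from the abstract. The number of background variants inside the A segment is not reported, so `background_in` is a hypothetical placeholder and the resulting p-value carries no evidential weight.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# ClinVar counts are taken from the abstract: 1 of 43 inside the A segment.
clinvar_in, clinvar_out = 1, 42
# HYPOTHETICAL placeholder: the abstract does not report this count.
background_in = 10
background_out = 603 - background_in

p_value = fisher_exact_two_sided(
    clinvar_in, clinvar_out, background_in, background_out
)
print(f"Two-sided Fisher exact p-value (placeholder counts): {p_value:.3f}")
```

Running the same test across alternative region boundaries and alternative background sets would serve as a simple sensitivity analysis. With only 43 mapped clinical variants, a non-significant result should be read as limited power, not as evidence of no enrichment.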