VarSome CNV Classification

(c) Copyright Saphetor SA. All rights reserved.

version: 13.1.2, dated: 15 Mar 2025 06:08:40 UTC

Introduction

The "Technical standards for the interpretation and reporting of constitutional copy-number variants" were published in 2019 by Erin Rooney Riggs et al. (the ACMG CNV Guidelines). The VarSome automated CNV classifier implements an interpretation of these guidelines and generates a recommended pathogenicity classification based on the data available from multiple genomic databases.

Important: in the process of testing & calibrating this classifier we have adjusted the scores used by certain rules and tweaked the methodology in some cases. These changes significantly improve the overall accuracy of classification against our clinical benchmarks.

Our guiding principle throughout, following the advice of our clinical advisors, has been to implement a rigorously evidence-based approach to identifying potentially pathogenic copy number variants. We have leveraged a wide range of public-domain databases.

All the rules provide clear natural language explanations of why they were triggered and which evidence was used, or, conversely, a full explanation of why the criteria were not met.

We strive to continuously improve our implementation, adjusting algorithms, incorporating new data sources, and adding refinements as new publications and methodology changes are suggested. We greatly appreciate feedback from the huge VarSome user community, and always aim to promptly act on any suggestions.

Approach & Overview

Our implementation considers the following types of evidence according to the ACMG Technical Standards. Each is given a short name for convenience, which is displayed in VarSome:

  • Content: Evaluation of impact based on the number of protein coding elements affected.
  • Overlap: Overlap with established/predicted haploinsufficient (HI), triplosensitive (TS) or established benign genes/genomic regions.
  • Gene: Evaluation of impact based on the number of genes affected.
  • Literature: Detailed evaluation of genomic content using published literature, public databases, and/or internal lab data.
  • Inheritance: Evaluation of inheritance pattern/family history for patient being studied.

The evidence from all these sources is combined to reach an overall recommended classification.
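As a rough illustration of how per-rule scores could be combined, the sketch below sums them and maps the total onto the five classification bands given in the ACMG/ClinGen technical standard (0.99 or above is pathogenic, 0.90 to 0.98 is likely pathogenic, with the mirror image on the benign side). The function name, the dictionary of rule scores and the rounding step are assumptions made for illustration only; VarSome's actual combination logic is not described in this document and may differ.

```python
def classify_total(rule_scores):
    """Sum per-rule scores and map the total to a classification band.

    `rule_scores` is a hypothetical mapping of rule name -> score.
    The bands are those of the published ACMG/ClinGen standard, not
    necessarily the ones VarSome applies.
    """
    # Round to two decimals so floating-point noise (e.g. 0.6 + 0.3)
    # cannot push a total across a threshold.
    total = round(sum(rule_scores.values()), 2)
    if total >= 0.99:
        label = "Pathogenic"
    elif total >= 0.90:
        label = "Likely pathogenic"
    elif total > -0.90:
        label = "Uncertain significance"
    elif total > -0.99:
        label = "Likely benign"
    else:
        label = "Benign"
    return total, label


# Hypothetical scores for the five rule families named above.
example_scores = {
    "Content": 0.0,
    "Overlap": 0.95,
    "Gene": 0.0,
    "Literature": 0.0,
    "Inheritance": 0.0,
}
example_total, example_label = classify_total(example_scores)
```

With the hypothetical scores shown, the total is 0.95, which falls in the likely pathogenic band.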

Sample Information

VarSome's CNV classifier is able to leverage data from the sample itself in order to provide additional findings to the clinician and help prioritize which variants to review.

Supplementary Information

The ACMG CNV Standard is shown for reference only.

Matching same-type CNVs across clinical sources
Loss CNV: the reported CNV is either benign and fully overlaps the given CNV, or pathogenic and fully contained inside the given CNV.
Gain CNV: the reported CNV affects the same coding genes as the given CNV, covers at least 0.85 times the given CNV, and is smaller than 1.25 times the given CNV.
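The matching criteria above can be sketched as two small predicates. This is a minimal sketch under stated assumptions, not VarSome's implementation: the `CNV` class and all function names are hypothetical, coordinates are treated as inclusive, "covers at least 0.85 times the given CNV" is read as the shared length being at least 85% of the given CNV's length, and "smaller than 1.25 times" is read as a comparison of total lengths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """Hypothetical CNV record with inclusive coordinates on one chromosome."""
    start: int
    end: int
    coding_genes: frozenset = frozenset()
    significance: str = ""  # "benign" or "pathogenic" for reported CNVs

    @property
    def length(self):
        return self.end - self.start + 1


def overlap_length(a, b):
    """Number of bases shared by two CNVs (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end


def matches_loss(reported, given):
    """Benign reports must span the given CNV; pathogenic ones must sit inside it."""
    if reported.significance == "benign":
        return contains(reported, given)
    if reported.significance == "pathogenic":
        return contains(given, reported)
    return False


def matches_gain(reported, given):
    """Same coding genes, at least 85% coverage (assumed: of the given CNV's
    length), and reported length below 1.25 times the given length."""
    return (
        reported.coding_genes == given.coding_genes
        and overlap_length(reported, given) >= 0.85 * given.length
        and reported.length < 1.25 * given.length
    )
```

The asymmetry in the loss case follows the text: a benign report is only informative if it spans the whole queried deletion, whereas a pathogenic report is informative if the queried deletion removes all of it.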

CNV Frequency calculation using gnomAD structural variants
Loss CNV: the computed frequency is the sum of the frequencies of all reported CNVs that fully overlap the given CNV.
Gain CNV: the computed frequency is the sum of the frequencies of all reported CNVs that affect the same coding genes, cover at least 0.85 times the given CNV, and are smaller than 1.25 times the given CNV.
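The frequency calculation can be illustrated in the same way. In this sketch each record is a plain dictionary with hypothetical keys (`start`, `end`, `genes`, `freq`), and the example frequencies are invented purely to show the arithmetic. It assumes inclusive coordinates, and that "fully overlapping" means the reported CNV spans the entire given CNV.

```python
def gnomad_loss_frequency(given, reported):
    """Sum the frequencies of reported CNVs that span the whole given CNV."""
    return sum(
        r["freq"]
        for r in reported
        if r["start"] <= given["start"] and given["end"] <= r["end"]
    )


def gnomad_gain_frequency(given, reported):
    """Sum the frequencies of reported CNVs with the same coding genes,
    at least 85% coverage of the given CNV and under 1.25x its length."""
    given_len = given["end"] - given["start"] + 1
    total = 0.0
    for r in reported:
        shared = max(
            0, min(r["end"], given["end"]) - max(r["start"], given["start"]) + 1
        )
        reported_len = r["end"] - r["start"] + 1
        if (
            r["genes"] == given["genes"]
            and shared >= 0.85 * given_len
            and reported_len < 1.25 * given_len
        ):
            total += r["freq"]
    return total


# Invented example data (not real gnomAD records).
GENES = frozenset({"GENE_A"})
query = {"start": 1000, "end": 2000, "genes": GENES}
records = [
    {"start": 900, "end": 2100, "genes": GENES, "freq": 0.004},
    {"start": 1000, "end": 2600, "genes": GENES, "freq": 0.003},
    {"start": 1100, "end": 2000, "genes": GENES, "freq": 0.002},
]
loss_freq = gnomad_loss_frequency(query, records)  # first two records count
gain_freq = gnomad_gain_frequency(query, records)  # first and third count
```

The second record is excluded from the gain sum because it is longer than 1.25 times the query, and the third is excluded from the loss sum because it does not span the whole query.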

"Content": Evaluation of impact based on the number of protein coding elements affected.

This rule performs an initial assessment of the genomic content affected by the CNV, by checking if it contains any protein-coding elements.

ACMG CNV Standard
Loss or Gain CNV Score
CNV affects at least one protein-coding element 0
CNV doesn't affect any protein-coding element -0.6

Saphetor Custom Implementation
Loss CNV Score
CNV affects at least one protein-coding element 0
CNV doesn't affect any protein-coding element -0.95
Gain CNV Score
CNV affects at least one protein-coding element 0
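The custom "Content" table can be expressed as a short function. This is only a sketch: the function name and arguments are hypothetical, and because the table lists no score for a gain CNV that affects no protein-coding element, the sketch returns `None` in that case rather than guessing a value.

```python
def content_score(cnv_type, coding_elements_affected):
    """Score the "Content" rule from the custom table.

    cnv_type: "loss" or "gain" (hypothetical labels).
    coding_elements_affected: number of protein-coding elements affected.
    """
    if coding_elements_affected > 0:
        return 0.0
    if cnv_type == "loss":
        return -0.95
    # The table gives no row for a gain without protein-coding elements.
    return None
```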

"Overlap": Overlap with established/predicted haploinsufficient (HI), triplosensitive (TS) or established benign genes/genomic regions.

This rule checks how the structural variant affects any haploinsufficient/triplosensitive genes or regions.

ACMG CNV Standard
Loss CNV Score
CNV fully overlaps with a haploinsufficient gene/region 1
CNV intersects with the coding sequence of a haploinsufficient gene 1
CNV affects the last exon as well as other exons of a haploinsufficient gene 1
CNV affects the last exon of a haploinsufficient gene 0.9
CNV affects the 5' UTR of a haploinsufficient gene 0
CNV affects a last exon of a haploinsufficient gene 0.3
CNV contains at least one gene where loss-of-function is a known mechanism of disease 0.9
CNV is fully contained in a known benign CNV -1
Gain CNV Score
CNV fully overlaps with a triplosensitive gene/region 1
CNV intersects with a triplosensitive gene/region 0
CNV affects same set of coding genes as a known benign CNV -1
CNV fully overlaps with a haploinsufficient gene/region 0
CNV fully overlaps at least one gene where loss-of-function is a known mechanism of disease 0.9

Saphetor Custom Implementation
Loss CNV Score
CNV contains at least one known pathogenic LOF variant, affects either a haploinsufficient or a LOF coding gene region, and one of the following is true:
CNV contains 1 or more haploinsufficient genes 1
CNV contains 1 or more haploinsufficient regions 1
CNV contains 12 or more genes where loss-of-function is a known mechanism of disease 1
CNV affects 13 or more genes where loss-of-function is a known mechanism of disease 1
CNV contains 65 or more LOF known pathogenic variants 1
CNV affects 1 or more haploinsufficient genes 1
CNV affects 1 or more haploinsufficient regions 1
CNV contains 5 or more LOF known pathogenic variants 0.95
CNV has both breakpoints in a gene where loss-of-function is a known mechanism of disease 0.95
CNV contains at least one known pathogenic LOF variant, but doesn't affect either a haploinsufficient or a LOF coding gene region, and one of the following is true:
It affects 2 or more haploinsufficient regions 0.95
It affects at least 10 LOF variants 0.95
CNV doesn't contain any known pathogenic LOF variant and one of the following is true:
It affects 2 or more haploinsufficient genes 0.95
It has a breakpoint in a LOF gene 0.95
No condition is met 0
Gain CNV Score
CNV contains at least one known pathogenic LOF variant, affects a haploinsufficient, LOF or triplosensitive coding gene region, and one of the following is true:
CNV contains 1 or more triplosensitive genes 1
CNV contains 1 or more triplosensitive regions 1
CNV contains 1 or more haploinsufficient genes 1
CNV contains 1 or more haploinsufficient regions 1
CNV contains 3 or more genes where loss-of-function is a known mechanism of disease 1
CNV affects 4 or more genes where loss-of-function is a known mechanism of disease 1
CNV affects 1 or more triplosensitive regions 0.95
CNV affects 1 or more triplosensitive genes 0.95
CNV affects 2 or more haploinsufficient genes 1
CNV affects 2 or more haploinsufficient regions 1
CNV has both breakpoints in a known haploinsufficient gene coding region 1
CNV affects at least 50 LOF variants 0.95
CNV contains at least one known pathogenic LOF variant, but doesn't affect a haploinsufficient, LOF or triplosensitive coding gene region and it affects 1 or more triplosensitive regions 0.95
CNV doesn't contain any known pathogenic LOF variant and it affects 4 or more triplosensitive regions 0.95
No condition is met 0

"Gene": Evaluation of impact based on the number of genes affected.

This rule scores the CNV based on the number of protein-coding genes that are partially or wholly affected.

ACMG CNV Standard
Loss CNV Score
CNV affects more than 34 protein-coding genes 0.9
CNV affects more than 24 protein-coding genes 0.45
CNV affects 24 or fewer protein-coding genes 0
Gain CNV Score
CNV affects more than 39 protein-coding genes 0.9
CNV affects more than 34 protein-coding genes 0.45
CNV affects 34 or fewer protein-coding genes 0

Saphetor Custom Implementation
Loss CNV Score
CNV doesn't affect any protein-coding gene -1
CNV affects 12 or more protein-coding genes 1
CNV affects 10 or more protein-coding genes 0.95
CNV affects 8 or more protein-coding genes 0.6
CNV affects fewer than 8 protein-coding genes 0
Gain CNV Score
CNV doesn't affect any protein-coding gene -1
CNV affects 33 or more protein-coding genes 1
CNV affects 11 or more protein-coding genes 0.95
CNV affects 6 or more protein-coding genes 0.6
CNV affects fewer than 6 protein-coding genes 0
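The custom "Gene" thresholds lend themselves to a simple table-driven lookup. The sketch below assumes the rows are evaluated from the highest threshold down, so the first matching row wins; the names `GENE_THRESHOLDS` and `gene_score` are hypothetical.

```python
# (minimum gene count, score) pairs from the custom table, highest first.
GENE_THRESHOLDS = {
    "loss": [(12, 1.0), (10, 0.95), (8, 0.6)],
    "gain": [(33, 1.0), (11, 0.95), (6, 0.6)],
}


def gene_score(cnv_type, n_genes):
    """Score the "Gene" rule from the number of protein-coding genes affected."""
    if n_genes == 0:
        return -1.0
    for minimum, score in GENE_THRESHOLDS[cnv_type]:
        if n_genes >= minimum:
            return score
    return 0.0
```

For example, a loss affecting 10 genes scores 0.95, whereas a gain affecting 10 genes scores only 0.6, reflecting the higher thresholds applied to gains.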

"Literature": Detailed evaluation of genomic content using published literature, public databases, and/or internal lab data.

This rule evaluates the public literature, databases and internal lab data.

ACMG CNV Standard
Loss or Gain CNV Score
There is at least one pathogenic CNV and no benign CNV found in clinical sources 1
The reported pathogenic CNVs are at least 3 times the benign ones 0.95
There is at least one benign CNV and no pathogenic CNV found in clinical sources -1
The reported benign CNVs are at least 3 times the pathogenic ones -0.95
There is an overlap with a common population variant reported by DGV -1
The reported frequency by gnomAD structural variants is greater than 0.01 -1

Saphetor Custom Implementation
Loss or Gain CNV Score
There is more than one pathogenic CNV, no benign CNVs and no common CNVs found in clinical sources 1
There is one pathogenic CNV, no benign CNVs and no common CNVs found in clinical sources 0.6
There is at least one pathogenic CNV, no benign CNVs, but at least one common CNV found in clinical sources 0.6
The reported pathogenic CNVs are at least 2.9 times the benign ones and there are no common CNVs 0.95
The reported pathogenic CNVs are at least 2.9 times the benign ones and there is at least one common CNV 0.6
There is more than one benign CNV and no pathogenic CNVs found in clinical sources -1
There is one benign CNV, at least one common CNV and no pathogenic CNVs found in clinical sources -1
There is one benign CNV, no common CNVs and no pathogenic CNVs found in clinical sources -0.6
There is at least one common CNV, no benign CNVs and no pathogenic CNVs found in clinical sources -1
The reported benign CNVs are at least 2.9 times the pathogenic ones and there is at least one common CNV reported -1
The reported benign CNVs are at least 2.9 times the pathogenic ones and there are no common CNVs reported -0.95
There is not a significant difference between clinical observations, but there is at least one common CNV reported -0.6
There is not a significant difference between clinical observations, and there is no common CNV reported 0

A CNV is considered common if it is reported by DGV with more than one sample.
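The custom "Literature" table can be read as a decision procedure over three counts. The sketch below is one possible reading, not the actual implementation: it assumes the rows are mutually exclusive, that the 2.9 ratio rows apply only when both pathogenic and benign reports exist, and that `common` is the number of matching DGV entries with more than one reported sample. The function name and arguments are hypothetical.

```python
def literature_score(pathogenic, benign, common):
    """Score the "Literature" rule from counts of matching reported CNVs."""
    has_common = common > 0

    # Only pathogenic reports.
    if pathogenic > 0 and benign == 0:
        if has_common:
            return 0.6
        return 1.0 if pathogenic > 1 else 0.6

    # Only benign reports.
    if benign > 0 and pathogenic == 0:
        if benign > 1 or has_common:
            return -1.0
        return -0.6

    # No clinical reports at all: only population data can count.
    if pathogenic == 0 and benign == 0:
        return -1.0 if has_common else 0.0

    # Conflicting reports: compare the two counts using the 2.9 ratio.
    if pathogenic >= 2.9 * benign:
        return 0.6 if has_common else 0.95
    if benign >= 2.9 * pathogenic:
        return -1.0 if has_common else -0.95

    # No significant difference between clinical observations.
    return -0.6 if has_common else 0.0
```

For instance, six pathogenic and two benign reports with no common CNV give 0.95, since 6 is at least 2.9 times 2; adding a single common CNV lowers the score to 0.6.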

"Inheritance": Evaluation of inheritance pattern/family history for patient being studied.

This rule evaluates the inheritance pattern/family history of the sample being studied.

ACMG CNV Standard
Inheritance information unavailable or uninformative 0
CNV segregates with a consistent phenotype observed in sample's family 0.3
There is a lack of segregation for the CNV and the sample's family -0.3
CNV is not found in another individual in the proband's family affected with the same phenotypes -0.45
Reported phenotypes are not consistent with the affected genes within this region -0.15
Reported phenotypes are consistent with the affected genes within this region and the variant is confirmed de novo 0.15
Reported phenotypes are consistent with the affected genes within this region and the variant is NOT confirmed de novo 0.3
Reported phenotypes are consistent with similar clinical cases 0.3

Saphetor Custom Implementation
Inheritance information unavailable or uninformative 0
CNV segregates with a consistent phenotype observed in sample's family 0.3
There is a lack of segregation for the CNV and the sample's family -0.3
CNV is not found in another individual in the proband's family affected with the same phenotypes -0.45
Reported phenotypes are not consistent with the affected genes within this region -0.15
Reported phenotypes are consistent with the affected genes within this region and the variant is confirmed de novo 0.15
Reported phenotypes are consistent with the affected genes within this region and the variant is NOT confirmed de novo 0.3
Reported phenotypes are consistent with similar clinical cases 0.3

Databases

The VarSome automated classification processes rely on vast quantities of accurate curated data from the following databases (in no particular order).

Important: depending on licensing agreements, and in some cases the fees charged by source organisations, not all databases are visible to all users, and this may directly impact the completeness or quality of automated classifications.

Databases used when classifying CNVs

  1. UniProt Variants, provided by UNIPROT, version 07-Feb-2025 (72.5k records)
  2. UniProt Regions, provided by UNIPROT, version 07-Feb-2025 (283k records)
  3. DGV, provided by TCAG, version 30-Jun-2021 (792k records)
  4. RefSeq, provided by NCBI, version 228
  5. Mitomap, provided by CHOP, version 08-Dec-2023 (39.0k records)
  6. HPO, version 07-Feb-2025 (19.0k records)
  7. ClinVar CNVs, provided by NCBI, version 07-Feb-2025 (61.6k records)
  8. ClinVar, provided by NCBI, version 07-Feb-2025 (3.22M records)
  9. ClinGen Regions, provided by NIH, version 07-Feb-2025 (516 records)
  10. ClinGen CNVs, provided by NIH, version 07-Feb-2025 (156 records)
  11. ClinGen, provided by NIH, version 07-Feb-2025 (1.56k records)
  12. CGD, provided by NHGRI, version 03-Jul-2024 (4.74k records)
  13. dbNSFP genes, provided by dbNSFP, version 4.9 (21.5k records)
  14. dbVar, provided by NCBI, version 03-Jul-2024 (3.06M records)
  15. DECIPHER, provided by Sanger, version 07-Feb-2025 (31.0k records)
  16. Ensembl, provided by EMBL, version 113
  17. ExacCNV, provided by Broad, using version 01-Jul-2021 (49.3k records) for hg19, and using version 20180227 (48.6k records) for hg38
  18. GHR Genes, provided by NLM, version 05-Dec-2024 (1.50k records)
  19. gnomAD structural variants, provided by Broad, version 30-Jun-2021 (334k records)