* Variant origin.
`SO_0001781`: de novo variant. http://purl.obolibrary.org/obo/SO_0001781
`SO_0001778`: germline variant. http://purl.obolibrary.org/obo/SO_0001778
`SO_0001775`: maternal variant. http://purl.obolibrary.org/obo/SO_0001775
`SO_0001776`: paternal variant. http://purl.obolibrary.org/obo/SO_0001776
`SO_0001779`: pedigree specific variant. http://purl.obolibrary.org/obo/SO_0001779
`SO_0001780`: population specific variant. http://purl.obolibrary.org/obo/SO_0001780
`SO_0001777`: somatic variant. http://purl.obolibrary.org/obo/SO_0001777
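The enum values listed here follow the OBO identifier pattern, and each documented URL is the PURL base followed by the identifier. A minimal Python sketch of that mapping follows; the helper name `obo_purl` is hypothetical and not part of the schema.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def obo_purl(term_id: str) -> str:
    """Return the OBO PURL for an enum value such as 'SO_0001781'.

    Hypothetical helper: it only checks the PREFIX_DIGITS shape of the
    identifier and does not verify that the term exists in the ontology.
    """
    prefix, sep, local_id = term_id.partition("_")
    if not sep or not prefix.isalpha() or not local_id.isdigit():
        raise ValueError(f"not an OBO-style identifier: {term_id!r}")
    return OBO_PURL_BASE + term_id


print(obo_purl("SO_0001781"))
```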
Used in:
Used in: ,
* Reference allele.
* Alternate allele.
* Type of variation: single nucleotide, indel or structural variation.
Used in:
Used in:
SE | (Start -> End)   | s | t[p[ | piece extending to the right of p is joined after t
SS | (Start -> Start) | s | t]p] | reverse comp piece extending left of p is joined after t
ES | (End -> Start)   | s | ]p]t | piece extending to the left of p is joined before t
EE | (End -> End)     | s | [p[t | reverse comp piece extending right of p is joined before t
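The four orientations in the table can be turned into VCF-style breakend ALT strings by filling in `t` (the replacement sequence) and `p` (the mate position). The sketch below assumes the orientation codes are passed as plain strings; the function name `breakend_alt` and the sample coordinates are hypothetical.

```python
# Bracket templates copied from the orientation table above.
BREAKEND_TEMPLATES = {
    "SE": "{t}[{p}[",  # piece extending to the right of p is joined after t
    "SS": "{t}]{p}]",  # reverse comp piece extending left of p is joined after t
    "ES": "]{p}]{t}",  # piece extending to the left of p is joined before t
    "EE": "[{p}[{t}",  # reverse comp piece extending right of p is joined before t
}


def breakend_alt(orientation: str, t: str, p: str) -> str:
    """Format a breakend ALT string for one of SE, SS, ES or EE."""
    try:
        template = BREAKEND_TEMPLATES[orientation]
    except KeyError:
        raise ValueError(f"unknown breakend orientation: {orientation!r}") from None
    return template.format(t=t, p=p)


print(breakend_alt("SE", "G", "17:198982"))
```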
Used in:
Used in:
Used in:
* Mendelian variant classification using ACMG terminology, as defined in Richards, S. et al. (2015). Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine, 17(5), 405–423. https://doi.org/10.1038/gim.2015.30. The classification of pharmacogenomic variants, variants associated with disease and somatic variants is based on the ACMG recommendations and the ClinVar classification (https://www.ncbi.nlm.nih.gov/clinvar/docs/clinsig/).
`benign_variant`: Benign variants interpreted for Mendelian disorders
`likely_benign_variant`: Likely benign variants interpreted for Mendelian disorders with a certainty of at least 90%
`pathogenic_variant`: Pathogenic variants interpreted for Mendelian disorders
`likely_pathogenic_variant`: Likely pathogenic variants interpreted for Mendelian disorders with a certainty of at least 90%
`uncertain_significance`: Uncertain significance variants interpreted for Mendelian disorders. Variants with conflicting evidence should be classified as uncertain_significance
Used in:
* Confidence based on the Confidence Information Ontology.
`CIO_0000029`: high confidence level. http://purl.obolibrary.org/obo/CIO_0000029
`CIO_0000031`: low confidence level. http://purl.obolibrary.org/obo/CIO_0000031
`CIO_0000030`: medium confidence level. http://purl.obolibrary.org/obo/CIO_0000030
`CIO_0000039`: rejected. http://purl.obolibrary.org/obo/CIO_0000039
Used in:
CIO_0000031
CIO_0000030
CIO_0000029
CIO_0000039
Used in:
* The consistency of the evidence for a given phenotype. This aggregates all evidence for a given phenotype and all evidence with no associated phenotype (e.g.: in silico impact prediction, population frequency). It is based on the Confidence Information Ontology terms.
`CIO_0000033`: congruent, all evidence is consistent. http://purl.obolibrary.org/obo/CIO_0000033
`CIO_0000034`: conflict, there is conflicting evidence. This should correspond to a `VariantClassification` of `uncertain_significance` for Mendelian disorders. http://purl.obolibrary.org/obo/CIO_0000034
`CIO_0000035`: strongly conflicting. http://purl.obolibrary.org/obo/CIO_0000035
`CIO_0000036`: weakly conflicting. http://purl.obolibrary.org/obo/CIO_0000036
Used in:
CIO_0000033
CIO_0000034
CIO_0000035
CIO_0000036
Used in:
Used in:
Used in:
* Pharmacogenomics drug response variant classification.
`responsive`: A variant that confers response to a treatment
`resistant`: A variant that confers resistance to a treatment
`toxicity`: A variant that is associated with drug-induced toxicity
`indication`: A variant that is required in order for a particular drug to be prescribed
`contraindication`: A variant that, if present, means a particular drug should not be prescribed
`dosing`: A variant that results in an alteration in dosing of a particular drug in order to achieve INR, reduce toxicity or increase efficacy
`increased_monitoring`: Increased vigilance or increased dosage monitoring may be required for a patient with this variant to look for signs of adverse drug reactions
`efficacy`: A variant that affects the efficacy of the treatment
Used in:
DEPRECATED: kept only for backward compatibility
* This is the list of ethnic groups in ONS16.
`D`: Mixed: White and Black Caribbean
`E`: Mixed: White and Black African
`F`: Mixed: White and Asian
`G`: Mixed: Any other mixed background
`A`: White: British
`B`: White: Irish
`C`: White: Any other White background
`L`: Asian or Asian British: Any other Asian background
`M`: Black or Black British: Caribbean
`N`: Black or Black British: African
`H`: Asian or Asian British: Indian
`J`: Asian or Asian British: Pakistani
`K`: Asian or Asian British: Bangladeshi
`P`: Black or Black British: Any other Black background
`S`: Other Ethnic Groups: Any other ethnic group
`R`: Other Ethnic Groups: Chinese
`Z`: Not stated
Used in:
* An entry for an evidence
Used in:
* Source of the evidence
* The list of submissions
* The somatic information
* URL of source if any
* ID of record in the source
* The reference genome assembly
* List of allele origins
* Heritable traits associated with this evidence
* The transcript to which the evidence refers
* The variant classification
* Impact of evidence. Should be coherent with the classification of impact if provided.
* The curation confidence.
* The consistency status. This is applicable to complex evidences (e.g.: ClinVar)
* Ethnicity
* The penetrance of the phenotype for this genotype. Value in the range [0, 1]
* Variable expressivity of a given phenotype for the same genotype
* Evidence description
* A list of additional properties in the form name-value.
* Bibliography
* Evidence of pathogenicity and benign impact as defined in Richards, S. et al. (2015). Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine, 17(5), 405–423. https://doi.org/10.1038/gim.2015.30

Evidence of pathogenicity:
`very_strong`:
- PVS1 null variant (nonsense, frameshift, canonical ±1 or 2 splice sites, initiation codon, single or multiexon deletion) in a gene where LOF is a known mechanism of disease
`strong`:
- PS1 Same amino acid change as a previously established pathogenic variant regardless of nucleotide change
- PS2 De novo (both maternity and paternity confirmed) in a patient with the disease and no family history
- PS3 Well-established in vitro or in vivo functional studies supportive of a damaging effect on the gene or gene product
- PS4 The prevalence of the variant in affected individuals is significantly increased compared with the prevalence in controls
`moderate`:
- PM1 Located in a mutational hot spot and/or critical and well-established functional domain (e.g., active site of an enzyme) without benign variation
- PM2 Absent from controls (or at extremely low frequency if recessive) in Exome Sequencing Project, 1000 Genomes Project, or Exome Aggregation Consortium
- PM3 For recessive disorders, detected in trans with a pathogenic variant
- PM4 Protein length changes as a result of in-frame deletions/insertions in a nonrepeat region or stop-loss variants
- PM5 Novel missense change at an amino acid residue where a different missense change determined to be pathogenic has been seen before
- PM6 Assumed de novo, but without confirmation of paternity and maternity
`supporting`:
- PP1 Cosegregation with disease in multiple affected family members in a gene definitively known to cause the disease
- PP2 Missense variant in a gene that has a low rate of benign missense variation and in which missense variants are a common mechanism of disease
- PP3 Multiple lines of computational evidence support a deleterious effect on the gene or gene product (conservation, evolutionary, splicing impact, etc.)
- PP4 Patient’s phenotype or family history is highly specific for a disease with a single genetic etiology
- PP5 Reputable source recently reports variant as pathogenic, but the evidence is not available to the laboratory to perform an independent evaluation

Evidence of benign impact:
`stand_alone`:
- BA1 Allele frequency is >5% in Exome Sequencing Project, 1000 Genomes Project, or Exome Aggregation Consortium
`strong`:
- BS1 Allele frequency is greater than expected for disorder
- BS2 Observed in a healthy adult individual for a recessive (homozygous), dominant (heterozygous), or X-linked (hemizygous) disorder, with full penetrance expected at an early age
- BS3 Well-established in vitro or in vivo functional studies show no damaging effect on protein function or splicing
- BS4 Lack of segregation in affected members of a family
`supporting`:
- BP1 Missense variant in a gene for which primarily truncating variants are known to cause disease
- BP2 Observed in trans with a pathogenic variant for a fully penetrant dominant gene/disorder or observed in cis with a pathogenic variant in any inheritance pattern
- BP3 In-frame deletions/insertions in a repetitive region without a known function
- BP4 Multiple lines of computational evidence suggest no impact on gene or gene product (conservation, evolutionary, splicing impact, etc.)
- BP5 Variant found in a case with an alternate molecular basis for disease
- BP6 Reputable source recently reports variant as benign, but the evidence is not available to the laboratory to perform an independent evaluation
- BP7 A synonymous (silent) variant for which splicing prediction algorithms predict no impact to the splice consensus sequence nor the creation of a new splice site AND the nucleotide is not highly conserved
Used in:
* The source of an evidence.
Used in:
* Name of source
* Version of source
* The source date.
* The submission information
Used in:
* The submitter
* The submission date
* The submission id
Used in:
Used in:
Used in: ,
Used in:
* The feature types
Used in:
Used in:
Possible filter compositions, delimited by ';'. The first filter is the default one.
Possible format compositions, delimited by ':'. The first format is the default one.
Possible genotypes seen on the slice. The first GT is the default one.
Used in:
Used in:
Used in:
Used in: ,
Used in:
Used in:
Used in:
Used in:
Used in: ,
* The genomic feature
Used in:
* Feature Type
* Feature used. This should be a feature ID from Ensembl (e.g., ENST00000544455)
* Other IDs. Fields like the HGNC symbol, if available, should be added here
Used in:
Used in:
Used in:
Used in:
Used in:
* The entity representing a phenotype and its inheritance pattern.
Used in:
* The trait (e.g.: HPO term, MIM term, DO term etc.)
* The mode of inheritance
Used in:
Used in:
Used in:
Used in:
* An enumeration of the different modes of inheritance:
`monoallelic_not_imprinted`: MONOALLELIC, autosomal or pseudoautosomal, not imprinted
`monoallelic_maternally_imprinted`: MONOALLELIC, autosomal or pseudoautosomal, maternally imprinted (paternal allele expressed)
`monoallelic_paternally_imprinted`: MONOALLELIC, autosomal or pseudoautosomal, paternally imprinted (maternal allele expressed)
`monoallelic`: MONOALLELIC, autosomal or pseudoautosomal, imprinted status unknown
`biallelic`: BIALLELIC, autosomal or pseudoautosomal
`monoallelic_and_biallelic`: BOTH monoallelic and biallelic, autosomal or pseudoautosomal
`monoallelic_and_more_severe_biallelic`: BOTH monoallelic and biallelic, autosomal or pseudoautosomal (but BIALLELIC mutations cause a more SEVERE disease form)
`xlinked_biallelic`: X-LINKED: hemizygous mutation in males, biallelic mutations in females
`xlinked_monoallelic`: X-LINKED: hemizygous mutation in males, monoallelic mutations in females may cause disease (may be less severe, later onset than males)
`mitochondrial`: MITOCHONDRIAL
`unknown`: Unknown
`NA`: Not applicable
Used in:
Used in: ,
Used in:
* Original variant ID before normalization including all secondary alternates.
* Index of this alternate allele in the original multi-allelic variant call from which it was decomposed.
* Penetrance assumed in the analysis
Used in:
Used in:
Used in:
Used in:
Used in:
Used in:
Used in:
* A property in the form of a name-value pair. Names are restricted to ontology IDs and should be checked against existing ontologies in resources such as the Ontology Lookup Service.
Used in:
* The ontology term id or accession in OBO format ${ONTOLOGY_ID}:${TERM_ID} (http://www.obofoundry.org/id-policy.html)
* The ontology term name
* Optional value for the ontology term. The type of the value is not checked (e.g.: we could set the pvalue term to "significant" or to "0.0001")
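The three fields above (term id in `${ONTOLOGY_ID}:${TERM_ID}` form, term name, optional unchecked value) can be modelled as a small record. The following is only an illustrative sketch: the class `OntologyProperty`, the helper `split_obo_id` and the sample term are hypothetical and are not taken from the schema.

```python
from typing import NamedTuple, Optional, Tuple


class OntologyProperty(NamedTuple):
    """Illustrative stand-in for the name-value property described above."""
    id: str                      # "${ONTOLOGY_ID}:${TERM_ID}"
    name: str                    # ontology term name
    value: Optional[str] = None  # free-form, its type is not checked


def split_obo_id(term_id: str) -> Tuple[str, str]:
    """Split an OBO-format id into (ontology id, term id)."""
    ontology_id, sep, local_id = term_id.partition(":")
    if not sep or not ontology_id or not local_id:
        raise ValueError(f"expected ONTOLOGY_ID:TERM_ID, got {term_id!r}")
    return ontology_id, local_id


# The value is stored as text, so "0.0001" and "significant" are both accepted.
prop = OntologyProperty(id="EX:0000001", name="example term", value="0.0001")
print(split_obo_id(prop.id))
```

Checking that the id actually exists in a registered ontology would need a lookup against an external service, which this sketch does not attempt.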
Used in:
Used in:
Ensembl or RefSeq protein ID
Used in:
Used in:
Used in: , , , ,
Used in: ,
Used in: ,
Used in:
* The somatic information.
Used in:
* The primary site
* The primary site subtype
* The primary histology
* The histology subtype
* The tumour origin
* The sample source, e.g. blood-bone marrow, cell-line, pancreatic
Used in:
Used in:
Used in:
* Number of copies for CNV variants.
* Inserted sequence for long INS
* Structural variation type: COPY_NUMBER_GAIN, COPY_NUMBER_LOSS, TANDEM_DUPLICATION, ...
* Type of structural variation:
- COPY_NUMBER_GAIN for CNVs
- COPY_NUMBER_LOSS for CNVs
- TANDEM_DUPLICATION for DUP
Used in:
unused = 0; // SO:0001742
SO:0001742
SO:0001743
SO:1000173
Used in:
* Alternate alleles that appear along with a variant alternate.
* Association of variants with a given trait.
`established_risk_allele`: Established risk allele for variants associated with disease
`likely_risk_allele`: Likely risk allele for variants associated with disease
`uncertain_risk_allele`: Uncertain risk allele for variants associated with disease
`protective`: Protective allele
Used in:
Used in:
Used in:
* Variant classification according to its relation to cancer aetiology.
`driver`: Driver variants
`passenger`: Passenger variants
`modifier`: Modifier variants
Used in:
* Chromosome where the genomic variation occurred.
* Normalized position where the genomic variation starts.
- SNVs have the same start and end position
- Insertions start in the last present position: if the first nucleotide is inserted in position 6, the start is position 5
- Deletions start in the first previously present position: if the first deleted nucleotide is in position 6, the start is position 6
* Normalized position where the genomic variation ends.
- SNVs have the same start and end positions
- Insertions end in the first present position: if the last nucleotide is inserted in position 9, the end is position 10
- Deletions end in the last previously present position: if the last deleted nucleotide is in position 9, the end is position 9
* Reference allele.
* Alternate allele.
* Reference strand for this variant
* Information regarding Structural Variants
* The variant ID.
* Other names used for this genomic variation.
* Length of the genomic variation, which depends on the variation type.
- SNVs have a length of 1 nucleotide
- Indels have the length of the largest allele
* Type of variation: single nucleotide, indel or structural variation.
* Information specific to each study the variant was read from, such as samples or statistics.
* Annotations of the genomic variation.
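The start, end and length rules given for this message can be checked with a few lines of arithmetic. The sketch below reproduces the worked positions from the field descriptions (an insertion occupying positions 6 to 9, a deletion of positions 6 to 9); all function names are hypothetical, and positions are treated as plain integers.

```python
from typing import Tuple


def snv_span(position: int) -> Tuple[int, int]:
    """SNVs have the same start and end position."""
    return position, position


def insertion_span(first_inserted: int, inserted_length: int) -> Tuple[int, int]:
    """Start is the last position present before the inserted bases;
    end is the first position present after them."""
    return first_inserted - 1, first_inserted + inserted_length


def deletion_span(first_deleted: int, deleted_length: int) -> Tuple[int, int]:
    """Start and end are the first and last deleted positions."""
    return first_deleted, first_deleted + deleted_length - 1


def indel_length(reference: str, alternate: str) -> int:
    """Indels have the length of the largest allele."""
    return max(len(reference), len(alternate))


print(insertion_span(6, 4))  # four bases inserted at positions 6..9
print(deletion_span(6, 4))   # four bases deleted at positions 6..9
```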
Used in:
Used in:
Used in:
* The variant classification according to different properties.
Used in:
* The variant's clinical significance.
* The variant's pharmacogenomics classification.
* The variant's trait association.
* The variant's tumorigenesis classification.
* The variant functional effect
* Variant effect with Sequence Ontology terms.
`SO_0002052`: dominant_negative_variant (http://purl.obolibrary.org/obo/SO_0002052)
`SO_0002053`: gain_of_function_variant (http://purl.obolibrary.org/obo/SO_0002053)
`SO_0001773`: lethal_variant (http://purl.obolibrary.org/obo/SO_0001773)
`SO_0002054`: loss_of_function_variant (http://purl.obolibrary.org/obo/SO_0002054)
`SO_0001786`: loss_of_heterozygosity (http://purl.obolibrary.org/obo/SO_0001786)
`SO_0002055`: null_variant (http://purl.obolibrary.org/obo/SO_0002055)
Used in:
Used in:
* Variant score ID.
* Main cohort used for calculating the score.
* Optional secondary cohort used for calculating the score.
* Score value
* Score p value
Used in:
* Unique cohort identifier within the study.
* Count of samples with non-missing genotypes in this variant from the cohort. This value is used as denominator for genotypeFreq.
* Count of files with samples from the cohort that reported this variant. This value is used as denominator for filterFreq.
* Total number of alleles in called genotypes. It does not include missing alleles. This value is used as denominator for refAlleleFreq and altAlleleFreq.
* Number of reference alleles found in this variant.
* Number of main alternate alleles found in this variant. It does not include secondary alternates.
* Reference allele frequency calculated from refAlleleCount and alleleCount, in the range [0,1]
* Alternate allele frequency calculated from altAlleleCount and alleleCount, in the range [0,1]
* Number of missing alleles
* Number of genotypes with all alleles missing (e.g. ./.). It does not count partially missing genotypes like "./0" or "./1".
* Number of occurrences for each genotype. This does not include genotype with all alleles missing (e.g. ./.), but it includes partially missing genotypes like "./0" or "./1". Total sum of counts should be equal to the count of samples.
* Genotype frequency for each genotype found, calculated from the genotypeCount and samplesCount, in the range [0,1]. The sum of frequencies should be 1.
* The number of occurrences for each FILTER value in files from samples in this cohort reporting this variant. As each file can contain more than one filter value (usually separated by ';'), the total sum of counts could be greater than the count of files.
* Frequency of each filter calculated from the filterCount and filesCount, in the range [0,1]
* The number of files from samples in this cohort reporting this variant with valid QUAL values. This value is used as denominator to obtain the qualityAvg.
* The average Quality value for files with valid QUAL values from samples in this cohort reporting this variant. Some files may not define the QUAL value, so the number of files averaged may be smaller than filesCount.
* Minor allele frequency. Frequency of the less common allele between the reference and the main alternate alleles. This value does not take into account secondary alternates.
* Minor genotype frequency. Frequency of the less common genotype seen in this variant. This value takes into account all values from the genotypeFreq map.
* Allele with minor frequency
* Genotype with minor frequency
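The counts and frequencies described for a cohort are tied together by simple ratios: allele frequencies use alleleCount as the denominator, genotype frequencies use samplesCount. The sketch below derives them from a genotype count map, assuming genotype strings use `0` for the reference, `1` for the main alternate, `.` for a missing allele, and `/` or `|` as separators. The function name and the `maf`, `mafAllele`, `mgf` and `mgfGenotype` keys are illustrative labels, not confirmed schema names; ties between equally rare alleles or genotypes are resolved arbitrarily here.

```python
import re
from typing import Dict


def cohort_frequencies(genotype_count: Dict[str, int]) -> dict:
    """Derive allele and genotype frequencies from per-genotype counts."""
    samples_count = sum(genotype_count.values())
    ref_count = alt_count = allele_count = missing_alleles = 0
    for genotype, n in genotype_count.items():
        for allele in re.split(r"[/|]", genotype):
            if allele == ".":
                missing_alleles += n   # missing alleles are excluded from alleleCount
                continue
            allele_count += n
            if allele == "0":
                ref_count += n
            elif allele == "1":
                alt_count += n         # secondary alternates (2, 3, ...) are not counted here
    if samples_count == 0 or allele_count == 0:
        raise ValueError("no called alleles in this cohort")
    ref_freq = ref_count / allele_count
    alt_freq = alt_count / allele_count
    genotype_freq = {gt: n / samples_count for gt, n in genotype_count.items()}
    mgf_genotype = min(genotype_freq, key=genotype_freq.get)
    return {
        "samplesCount": samples_count,
        "alleleCount": allele_count,
        "missingAlleles": missing_alleles,
        "refAlleleFreq": ref_freq,
        "altAlleleFreq": alt_freq,
        "genotypeFreq": genotype_freq,
        "maf": min(ref_freq, alt_freq),
        "mafAllele": "ref" if ref_freq < alt_freq else "alt",
        "mgf": genotype_freq[mgf_genotype],
        "mgfGenotype": mgf_genotype,
    }


stats = cohort_frequencies({"0/0": 5, "0/1": 2, "1/1": 1})
print(stats["refAlleleFreq"], stats["altAlleleFreq"], stats["mgfGenotype"])
```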
* Type of variation, which depends mostly on its length.
- SNVs involve a single nucleotide, without changes in length
- MNVs involve multiple nucleotides, without changes in length
- Indels are insertions or deletions of less than SV_THRESHOLD (50) nucleotides
- Structural variations are large changes of more than SV_THRESHOLD nucleotides
- Copy-number variations alter the number of copies of a region
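The length-based rules above can be sketched as a small classifier over the reference and alternate alleles. This is a simplified illustration, not the library's own inference: the returned labels are plain strings chosen for the example, the length is taken as that of the largest allele, a change of exactly SV_THRESHOLD nucleotides (which the description leaves unspecified) is assumed here to count as structural, and symbolic alleles and copy-number variants are not handled.

```python
SV_THRESHOLD = 50  # value stated in the description above


def classify_by_length(reference: str, alternate: str) -> str:
    """Classify a sequence-level change as SNV, MNV, INDEL or SV."""
    if not reference and not alternate:
        raise ValueError("both alleles are empty")
    if len(reference) == len(alternate):
        # Same length on both sides: a substitution of one or more bases.
        return "SNV" if len(reference) == 1 else "MNV"
    length = max(len(reference), len(alternate))
    # Assumption: a length of exactly SV_THRESHOLD is treated as structural.
    return "INDEL" if length < SV_THRESHOLD else "SV"


print(classify_by_length("A", "T"))
print(classify_by_length("", "ACG"))
```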
Used in: , ,
NO_VARIATION is the most common value in gVCFs. Because it is the first value, protobuf uses it as the default, which saves some space.
Defined in HTSJDK
SO:0001483
SO:0002007
SO:1000032
SO:0001537
SO:0001019
SO:0001742
SO:0001743
Defined in HTSJDK
Defined in HTSJDK
SO:0000667
SO:0000159
SO:0000199
SO:1000036
SO:1000035
SO:1000173
Deprecated
Deprecated. Renamed to COPY_NUMBER
Deprecated
Deprecated
Used in:
1-based. May contain negative values, but this is unlikely.
May contain negative values, but this is unlikely.
Used in:
GT is mandatory. Saving it separately can create a map of genotypes in Fields
List of records (lines)
Used in: