Working with haplotypes

Calling variants

The first step in working with sequence variation is to identify (call) variations between two sequences. SequenceVariation can directly call variants using the Haplotype(::PairwiseAlignment) constructor of the Haplotype type.

julia> using SequenceVariation, BioAlignments, BioSequences
julia> bovine = dna"GACCGGCTGCATTCGAGGCTGCCAGCAAGCAG";
julia> ovine = dna"GACCGGCTGCATTCGAGGCTGTCAGCAAACAG";
julia> human = dna"GACAGGCTGCATCAGAAGAGGCCATCAAGCAG";
julia> bos_ovis_alignment = PairwiseAlignment(AlignedSequence(ovine, Alignment("32M", 1, 1)), bovine);
julia> bos_human_alignment = PairwiseAlignment(AlignedSequence(human, Alignment("32M", 1, 1)), bovine);
julia> bos_ovis_haplotype = Haplotype(bos_ovis_alignment)Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 2 edits: C22T G29A
julia> bos_human_haplotype = Haplotype(bos_human_alignment)Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 7 edits: C4A T13C C14A G17A C19A T20G G25T

Sequence reconstruction

If the alternate sequence of a haplotype is no longer available (as is often the case when calling variants from alignment files), then the sequence can be retrieved using the reconstruct function.

julia> human2 = reconstruct(bos_human_haplotype)32nt DNA Sequence:
GACAGGCTGCATCAGAAGAGGCCATCAAGCAG
julia> human2 == bovinefalse
julia> human2 == humantrue

Reference switching

All variations within a haplotype can be mapped to a new reference sequence given an alignment between the new and old references using the translate function. This could be useful if variants were called against a reference sequence for the entire species, but need to be analyzed as variants of a subtype later.

julia> ovis_human_alignment =
           PairwiseAlignment(AlignedSequence(human, Alignment("32M", 1, 1)), ovine)BioAlignments.PairwiseAlignment{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}}:
  seq:  1 GACAGGCTGCATCAGAAGAGGCCATCAAGCAG 32
          ||| ||||||||  || |  | || ||| |||
  ref:  1 GACCGGCTGCATTCGAGGCTGTCAGCAAACAG 32
julia> SequenceVariation.translate(bos_ovis_haplotype, ovis_human_alignment)Haplotype{BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, BioSymbols.DNA} with 2 edits: C22T G29A