hap_collapse() collapses haplotypes from a DNAbin object, keeping unique
haplotypes only.
Value
A DNAbin object.
Details
hap_collapse() collapses a DNAbin object, keeping unique haplotypes only.
If clean = TRUE, the function calls clean_dna to remove any non-ACTG bases
from the alignment before collapsing haplotypes. If clean = FALSE, the
function treats the data as is and does not remove any bases. If
collapseSubstrings = TRUE, the function treats shorter but otherwise
identical sequences as the same haplotype and collapses them, returning the
longest sequence. If collapseSubstrings = FALSE, the function treats shorter
but otherwise identical sequences as different haplotypes and keeps them.
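The effect of collapseSubstrings can be illustrated with a minimal base R sketch. It is not the package's implementation: plain character strings stand in for DNAbin sequences, collapse_sketch() is a hypothetical helper introduced only for this illustration, and "shorter but identical" is assumed to mean that the shorter sequence occurs verbatim inside a longer one.

```r
# Illustrative only: collapse_sketch() is a hypothetical helper, not part of
# the package, and character strings stand in for DNAbin sequences.
collapse_sketch <- function(seqs, collapseSubstrings = TRUE) {
  haps <- unique(seqs)  # drop exact duplicates
  if (!collapseSubstrings) {
    return(haps)
  }
  # Sort longest first so each sequence is only compared against longer ones.
  haps <- haps[order(nchar(haps), decreasing = TRUE)]
  keep <- character(0)
  for (s in haps) {
    # Assumption: a sequence is redundant if it occurs verbatim inside a
    # sequence that has already been kept.
    contained <- any(vapply(
      keep,
      function(k) grepl(s, k, fixed = TRUE),
      logical(1)
    ))
    if (!contained) {
      keep <- c(keep, s)
    }
  }
  keep
}

seqs <- c("ACGTACGT", "ACGTAC", "ACGTACGT", "TTGA")

collapse_sketch(seqs, collapseSubstrings = TRUE)
#> [1] "ACGTACGT" "TTGA"

collapse_sketch(seqs, collapseSubstrings = FALSE)
#> [1] "ACGTACGT" "ACGTAC"   "TTGA"
```

In this toy input, "ACGTAC" is absorbed into "ACGTACGT" when collapseSubstrings = TRUE and kept as its own haplotype otherwise, which mirrors why the two calls in the examples return different numbers of sequences.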
Examples
# collapse into unique haplotypes, merging shorter identical sequences into the longest one
hap_collapse(geophagus, clean = TRUE, collapseSubstrings = TRUE)
#> Warning: ⚠ You have missing data "('N','-' '?')" or ambiguity inside your sequence, i.e.
#> not padding the ends, and this may have unintended consequences later, as they
#> have now been removed!
#> ℹ The names of the samples are bellow.
#> GU701784.1, GU701785.1
#> 137 DNA sequences in binary format stored in a list.
#>
#> Mean sequence length: 643.007
#> Shortest sequence: 505
#> Longest sequence: 690
#>
#> Labels:
#> MZ504301.1
#> MZ504318.1
#> MZ504341.1
#> MZ504337.1
#> MZ504342.1
#> MZ504304.1
#> ...
#>
#> Base composition:
#>     a     c     g     t
#> 0.238 0.281 0.177 0.304
#> (Total: 88.09 kb)
# collapse into unique haplotypes, keeping shorter sequences as separate haplotypes
hap_collapse(geophagus, clean = TRUE, collapseSubstrings = FALSE)
#> Warning: ⚠ You have missing data "('N','-' '?')" or ambiguity inside your sequence, i.e.
#> not padding the ends, and this may have unintended consequences later, as they
#> have now been removed!
#> ℹ The names of the samples are bellow.
#> GU701784.1, GU701785.1
#> 246 DNA sequences in binary format stored in a list.
#>
#> Mean sequence length: 639.764
#> Shortest sequence: 505
#> Longest sequence: 690
#>
#> Labels:
#> MZ504301.1
#> MZ504328.1
#> MZ504318.1
#> MZ504341.1
#> MZ504337.1
#> MZ504299.1
#> ...
#>
#> Base composition:
#>     a     c     g     t
#> 0.237 0.282 0.177 0.303
#> (Total: 157.38 kb)