Indexing

Base.getindex is defined for the ReadDatastore types, so an individual read can be retrieved by its index:

julia> ds = open(PairedReads{DNAAlphabet{2}}, "ecoli-test-paired.prseq", "my-ecoli-pe")
Paired Read Datastore 'my-ecoli-pe': 20 reads (10 pairs)

julia> ds[5]
300nt DNA Sequence:
ACATGCACTTCAACGGCATTACTGGTGACCTCTTCGTCC…TCTATCAACGCAAAAGGGTTACACAGATAATCGTCAGCT

Indexing a read datastore creates a new sequence each time. To load a read from a datastore into an existing sequence instead, overwriting its contents, use the load_sequence! method.

julia> seq = LongSequence{DNAAlphabet{2}}()
0nt DNA Sequence:
< EMPTY SEQUENCE >

julia> load_sequence!(ds, 6, seq)
300nt DNA Sequence:
ATTACTGCGATTACTGCTGCGAATTTTTTCATGTTTATT…GTCCACTGGTTTACACAAGGTCGTAAGGGAAAAGAGGCG

julia> seq
300nt DNA Sequence:
ATTACTGCGATTACTGCTGCGAATTTTTTCATGTTTATT…GTCCACTGGTTTACACAAGGTCGTAAGGGAAAAGAGGCG

Iteration

The ReadDatastore types also support the Base.iterate interface:

julia> collect(ds)
20-element Array{LongSequence{DNAAlphabet{2}},1}:
 GGGCTTTAAAATCCACTTTTTCCATATCGATAGTCACGT…ATTTCTTCGATTCTTCTTTGTCACCGCAGCCAGCAAGAG
 GTGGGTTTTTATCGGCTGGCACATGTGTTGGGACAATTT…GGCTTTCAATACGCTGTTTTCCCTCGTTGTTTCATCTGT
 TGAACTCCACATCCTGCGGATCGTAAACCGTCACCTCTT…TCTTCCAGGCAGGCCGCCAGGGTATCACCTTCCAGACCA
 GATGAATCTGGCGGTTATTAACGGTAACAATAACCAGCA…AGACGGCAAACCGGCTGCAGGCGGTAGGTTGTTGCAGGT
 ACATGCACTTCAACGGCATTACTGGTGACCTCTTCGTCC…TCTATCAACGCAAAAGGGTTACACAGATAATCGTCAGCT
 ATTACTGCGATTACTGCTGCGAATTTTTTCATGTTTATT…GTCCACTGGTTTACACAAGGTCGTAAGGGAAAAGAGGCG
 CGGTTGAGTTCAAAGGCAAAGATTTGCTTGCGCTGTCGC…TTTTCCGGCGGCGAGAAAAAGCGCAACGATTTTTTGCAA
 TTCGTCCCTGATATAGCACATGAACGTAATCAGGCTTGA…AATCTTCCGGCATCTTCAGGAGAGCGATTTTCTCTTCCA
 ACGACACATTACCGGAAATTCAGGCCGACCCGGACAGGC…TTGAACAACACGGTGGTACAATTCAGGTCGCAAGCCAGG
 TCCACCACCAGAATATCGATATTATCGTGCGTCATCCTT…TCACGCCCGCGCCGCTTTCGCTGGCCGTCACGCTAATCA
 CGTAACTTTATTCATATCTCTTCCCCCTCCCTGTACTTC…CTGTTACCGCATGGCGGCAGTGCGCTGGTCGATATGACC
 ATCGGGTAGGGGACGGAACGAATACGACAGTCAATATTC…AAGACTTTATCGTGCGGTCCGAACCGACTTTGTGGCGGC
 GCCCTGGAGCTGGTGAAAGAAGGTCGAGCGCAAGCCTGT…CAATCCTCGCGTGGCGTTGCTCAATATTGGTGAAGAAGA
 GAAAGGAACATCCTGACAACACCTTCCATCGTCTTTAAT…ATAAAGGCAAATTGCACCACCATGATGCTGTCCCAATCA
 GTCTGGTGGTGCCTCTTTACTTAAGGAATTTCATCCTGT…AACGATGCCAGGCACCTGCGAAACTTTCCTGCACCAGCC
 GACCGTTTTTCCCCAATCCGAGAACGCATAAATCCAGAC…TTTCTTCCCGGTAATGATACGTCACTATTGGAGTGGCCC
 AGAGGCCACAGCGCGCCCATAATGGCGACTGAAAGCCAG…TTCACCGCGGTGACCGGAATCAGGGCAAATTCGACATGT
 AAAAGGATCGCCGACCTTAACCATTCTGAATGTGATTGG…CTGGTGCCTGTCATATTTCGAACTCTGGGGGGACAGCAT
 TGAGCAAATATGCCCGACCCAGCCTCATGACAGCGATAT…ACCGAAAAAAAAGTAATCGTCGGCATGTCCGGCGGTGTC
 AGGCTTTAAATTTGATCTCTTTGTTGCACAGAATATCCG…GCCAGGAAGAAACGGAGGAACCGACACCGCCGGCCATGC
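Because iteration is supported, a plain for loop also works and avoids collecting every read into an array first. The following is a minimal sketch: total_bases is a hypothetical helper name, not part of ReadDatastores, and it accepts any iterable whose elements support length. Plain strings stand in for reads here so the snippet runs without a datastore file.

```julia
# Hypothetical helper (not part of ReadDatastores): sum the lengths of all
# reads produced by any iterable, visiting one read at a time.
function total_bases(reads)
    total = 0
    for read in reads          # relies only on the Base.iterate interface
        total += length(read)
    end
    return total
end

# Stand-in data: strings instead of sequences loaded from a datastore.
println(total_bases(["ACGT", "ACG", "A"]))  # prints 8
```

With the datastore opened earlier, the same call would be total_bases(ds), assuming each element it yields supports length, as the sequences shown above do.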

Buffers

When reading a ReadDatastore sequentially, whether with Base.getindex or load_sequence!, you can trade some memory for a buffer. The buffer reduces the number of times the disk is read, which speeds up sequential access and iteration. Use the buffer method to wrap a datastore in such a buffer.

julia> bds = buffer(ds)
Buffered Paired Read Datastore 'my-ecoli-pe': 20 reads (10 pairs)

julia> for i in eachindex(bds)
           load_sequence!(bds, i, seq)
           println(length(seq))
       end
297
300
299
300
300
300
299
300
300
300
300
300
299
300
300
300
300
300
300
300
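To illustrate why a buffer reduces disk reads during sequential access, here is a toy sketch. It is not the ReadDatastores implementation: ToyDisk, ToyBuffered, nrecords, and readrecords! are all hypothetical names, a byte vector stands in for the file on disk, and records are assumed to be fixed-width.

```julia
# Toy illustration only; every name below is hypothetical.
mutable struct ToyDisk
    data::Vector{UInt8}   # stands in for the file on disk
    recsize::Int          # bytes per fixed-width record
    reads::Int            # how many times the "disk" was accessed
end

nrecords(d::ToyDisk) = length(d.data) ÷ d.recsize

# Fetch records lo:hi in a single access and count that access.
function readrecords!(d::ToyDisk, lo::Int, hi::Int)
    d.reads += 1
    return d.data[((lo - 1) * d.recsize + 1):(hi * d.recsize)]
end

mutable struct ToyBuffered
    disk::ToyDisk
    capacity::Int         # number of records kept in memory at once
    start::Int            # index of the first buffered record (0 = empty)
    chunk::Vector{UInt8}  # bytes of the buffered records
end

ToyBuffered(d::ToyDisk, capacity::Int) = ToyBuffered(d, capacity, 0, UInt8[])

function Base.getindex(b::ToyBuffered, i::Int)
    1 <= i <= nrecords(b.disk) || throw(BoundsError(b, i))
    nbuffered = length(b.chunk) ÷ b.disk.recsize
    # On a miss, refill the buffer starting at record i.
    if b.start == 0 || i < b.start || i >= b.start + nbuffered
        hi = min(i + b.capacity - 1, nrecords(b.disk))
        b.chunk = readrecords!(b.disk, i, hi)
        b.start = i
    end
    offset = (i - b.start) * b.disk.recsize
    return String(b.chunk[(offset + 1):(offset + b.disk.recsize)])
end

disk = ToyDisk(Vector{UInt8}("ACGTTTGACCAGGGTACATG"), 4, 0)  # 5 records
buffered = ToyBuffered(disk, 2)
records = [buffered[i] for i in 1:nrecords(disk)]
println(disk.reads)  # prints 3: five sequential lookups, three disk accesses
```

In this sketch a larger capacity means fewer accesses at the cost of more memory, which is the trade-off described above. Random access gains little, since most lookups would miss the buffer and trigger a refill.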