Dalliance
Thomas Down
http://www.biodalliance.org/
Genome browsers
- Can we have the best of both worlds?
Technologies align
- Solid Javascript implementations
- Rich browser-based graphics (SVG/Canvas)
- Browser vendors focussing on performance (games!)
- DAS: Distributed Annotation System
- CORS: Cross-Origin Resource Sharing
- HTML5
Let's try it out
What is DAS?
- The Distributed Annotation System
- Simple HTTP queries to retrieve sequence annotation
GET http://server/das/xyz/features?segment=22:30000000,40000000
<FEATURE id="...">
<START>30000000</START>
<END>30000001</END>
<TYPE id="density">read depth</TYPE>
</FEATURE>
- Plus stylesheets, registry services, etc.
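The request and response shown on this slide can be sketched in a few lines of TypeScript. This is a minimal illustration, not Dalliance's own code: the names `dasFeaturesUrl`, `parseDasFeatures`, and `DasFeature` are hypothetical, and the regex-based extraction only handles the four fields shown in the example (a real client would use a proper XML parser and the full DAS schema).

```typescript
// Hypothetical shape covering only the fields shown on the slide.
interface DasFeature {
  id: string;
  start: number;
  end: number;
  typeId: string;
  typeLabel: string;
}

// Build a DAS "features" URL for one segment, e.g. 22:30000000,40000000.
// Assumes `chr` is a plain sequence name that needs no URL escaping.
function dasFeaturesUrl(base: string, chr: string, start: number, end: number): string {
  return base + "/features?segment=" + chr + ":" + start + "," + end;
}

// Return capture group `i`, or "" if it did not participate in the match.
function group(m: RegExpExecArray, i: number): string {
  const g = m[i];
  return g === undefined ? "" : g;
}

// Extract FEATURE elements from a DAS response (simplified, regex-based).
function parseDasFeatures(xml: string): DasFeature[] {
  const features: DasFeature[] = [];
  const featureRe = /<FEATURE\s+id="([^"]*)"[^>]*>([\s\S]*?)<\/FEATURE>/g;
  let m: RegExpExecArray | null;
  while ((m = featureRe.exec(xml)) !== null) {
    const body = group(m, 2);
    const start = /<START>(\d+)<\/START>/.exec(body);
    const end = /<END>(\d+)<\/END>/.exec(body);
    const type = /<TYPE\s+id="([^"]*)"[^>]*>([^<]*)<\/TYPE>/.exec(body);
    if (start === null || end === null || type === null) {
      continue; // skip features that lack the fields shown on the slide
    }
    features.push({
      id: group(m, 1),
      start: parseInt(group(start, 1), 10),
      end: parseInt(group(end, 1), 10),
      typeId: group(type, 1),
      typeLabel: group(type, 2),
    });
  }
  return features;
}
```

Calling `dasFeaturesUrl("http://server/das/xyz", "22", 30000000, 40000000)` reproduces the query shown above.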
What is CORS?
More data...
- Data volumes increasing
- More datasets
- Larger datasets
- Experiments like:
  - ChIP-seq
  - Exome resequencing
  - ...
- Small labs produce genome-wide data
- Few resources to help get data out
How does this fit DAS?
- DAS XML has served us well
- Relatively rich data model (IDs, links, notes)
- ...but XML is verbose
<FEATURE id="why_oh_why?">
<START>30000000</START>
<END>30000001</END>
<TYPE id="density">read depth</TYPE>
...and more...
</FEATURE>
- Requires a server deployment
- ...and someone to deploy that server
- Relational DB backends don't always scale
The binary solution
- Dense binary formats
  - bigBed: general features
  - bigWig: dense quantitative data
  - BAM: short read alignments
- All have indices
- Therefore, random access is possible
- HTTP supports random access using the Range header
.htaccess file:
Header set Access-Control-Allow-Origin "*"
Header set Access-Control-Allow-Headers "Range"
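On the client side, random access comes down to sending a `Range` header that names an inclusive byte interval. The sketch below shows how such a header value could be built from an offset and a length; `rangeHeader` is a hypothetical helper name, not part of any library.

```typescript
// Build the value of an HTTP Range header for `length` bytes starting at
// byte `offset`. HTTP byte ranges are inclusive at both ends, so the last
// byte requested is offset + length - 1.
function rangeHeader(offset: number, length: number): string {
  if (offset < 0 || Math.floor(offset) !== offset) {
    throw new Error("offset must be a non-negative integer");
  }
  if (length <= 0 || Math.floor(length) !== length) {
    throw new Error("length must be a positive integer");
  }
  return "bytes=" + offset + "-" + (offset + length - 1);
}
```

In a browser, the resulting string would be passed to `XMLHttpRequest.setRequestHeader("Range", ...)` or to the `headers` option of `fetch`. Because `Range` is not always treated as a simple header for cross-origin requests, the server has to list it in `Access-Control-Allow-Headers`, as the configuration above does.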
Binary Dalliance
- Recent browsers have bearably good APIs for binary data
- We can use these to directly support bigWig, bigBed, and BAM files
- ...demo...
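As an illustration of those binary APIs, the sketch below uses `ArrayBuffer` and `DataView` to inspect the first eight bytes of a bigWig or bigBed file. It is not Dalliance's implementation: `readBbiHeader` and `BbiHeader` are hypothetical names, and the signature values and field layout (32-bit magic number, 16-bit version, 16-bit zoom level count) are stated from memory of the published format description, so treat them as assumptions to check against the format specification.

```typescript
// Assumed file signatures for the two formats.
const BIG_WIG_MAGIC = 0x888ffc26;
const BIG_BED_MAGIC = 0x8789f2eb;

interface BbiHeader {
  format: "bigWig" | "bigBed";
  littleEndian: boolean;
  version: number;
  zoomLevels: number;
}

// Try to interpret the header with one specific byte order.
function tryReadBbiHeader(view: DataView, littleEndian: boolean): BbiHeader | null {
  const magic = view.getUint32(0, littleEndian);
  const format = magic === BIG_WIG_MAGIC ? "bigWig" : magic === BIG_BED_MAGIC ? "bigBed" : null;
  if (format === null) {
    return null;
  }
  return {
    format: format,
    littleEndian: littleEndian,
    version: view.getUint16(4, littleEndian),
    zoomLevels: view.getUint16(6, littleEndian),
  };
}

// Files may be written in either byte order, so test both.
// Returns null if the buffer is too short or the signature is unknown.
function readBbiHeader(buf: ArrayBuffer): BbiHeader | null {
  if (buf.byteLength < 8) {
    return null;
  }
  const view = new DataView(buf);
  return tryReadBbiHeader(view, true) || tryReadBbiHeader(view, false);
}
```

The buffer passed in would typically be the body of a small range request for the start of the file; the byte order detected here then applies to every later read from the same file.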
Dalliance in the future
- More data!
- More formats? (please ask!)
- More interactivity
  - Jump around quickly between interesting regions
  - Link to other tools
  - Connect with other biologists
- Easier setup: complete browser from static files
- Alignments (assembly mapping; comparative genomics)
- Performance (SVG -> Canvas?)
Acknowledgements
- Tim Hubbard
- DAS Developers
- Dalliance testers
- My other life: Wellcome Trust RCDF
http://www.biodalliance.org/