
Population Genomics is a Data Management Problem (TileDB Webinar)

1.1K views
Nov 18, 2021
1:49:17

For a summary see the blog post at https://tiledb.com/blog/population-genomics-is-a-data-management-problem-2021-11-17.

We were honored to feature Dr. Stephen Kingsmore, President and CEO of Rady Children's Institute for Genomic Medicine, as a guest speaker in our webinar. Special thanks to Helix (helix.com) for their multi-year partnership and immense contributions to our TileDB-VCF offering.

*About the presentation*
Population genomics is an important and challenging problem plagued by non-scalable domain-specific formats, which make it difficult to efficiently store, access, share and analyze massive amounts of variant-call data at the scale required for gaining meaningful insights. TileDB addresses this challenge with a universal database that stores variant-call data as multi-dimensional arrays that can be updated, governed and analyzed at unprecedented scale and low cost.

In this comprehensive presentation of TileDB’s population genomics solution, TileDB-VCF, you will learn how to:
• Model genomic variants as a 3D sparse array
• Efficiently update variant datasets, solving the N+1 problem
• Ingest huge collections of VCF samples in parallel on TileDB Cloud
• Export to VCF for full compatibility with existing tools
• Share access to TB of variant datasets avoiding file downloads
• Implement scalable genome-wide analyses using serverless compute
• Enable reproducible science and collaboration through code and data sharing

*Contents of this video*
0:00:00 – Introduction
0:03:39 – The problem in population genomics
0:19:31 – A solution template
0:22:36 – The solution with TileDB
0:32:59 – Dr. Stephen Kingsmore, Rady Children's Institute for Genomic Medicine
0:44:31 – TileDB-VCF basics
1:01:12 – Building scalable genome-wide analyses on TileDB Cloud
1:09:13 – Integrative clinical genomic workflows
1:17:30 – Work in progress
1:21:13 – Q&A

*Reproduce the examples on TileDB Cloud*
• 1000 Genomes Project High-Coverage Variant Calls data set: https://cloud.tiledb.com/arrays/details/TileDB-Inc/vcf-1kg-nygc-data/overview
• TileDB-VCF basics: https://cloud.tiledb.com/notebooks/details/TileDB-Inc/tutorial_tiledbvcf_basics/preview
• Building scalable genome-wide analyses on TileDB Cloud: https://cloud.tiledb.com/notebooks/details/TileDB-Inc/tutorial_tiledbvcf_gwas/preview
• Integrative clinical genomic workflows: https://cloud.tiledb.com/notebooks/details/TileDB-Inc/Genomics-Workflow-Example/preview

*About*
TileDB makes data management and compute fast, easy and universal. Manage any data as multi-dimensional arrays and access with any tool at global scale.

*Connect with us*
Website: https://tiledb.com/
Twitter: https://twitter.com/tiledb
LinkedIn: https://www.linkedin.com/company/tiledb-inc/
Book a personalized product demo: https://tiledb.com/demo
Sign up at https://cloud.tiledb.com/auth/signup and contact [email protected] for free credits.

