Training Models on Atlas-Scale Single Cell Datasets: Joint TileDB-CZI Workshop at scverse 2024
Blog post from TileDB
TileDB, a database designed for scientific discovery, and the Chan Zuckerberg Initiative (CZI) will host a workshop at the scverse Conference in Munich on September 12, 2024, on training models with large single-cell RNA sequencing datasets. Led by Ryan Williams from TileDB and Maximilian Lombardo from CZI, the workshop gives attendees hands-on experience with CZI's CELLxGENE Discover Census, a dataset containing 70 million cells. Participants will learn how TileDB's open-source data format and storage engine manage large datasets efficiently, and how SOMA provides a language-agnostic data model for single-cell data. The workshop also covers specialized PyTorch loaders optimized for memory-efficient training. TileDB is noted for its ability to handle complex multimodal data and so support scientific insight. Devika Garg, the Director of Product Marketing at TileDB, brings an extensive background in life sciences and science communication to promoting these data solutions.