Home
Isabl is a platform for the integration, management, and processing of individual-centric multimodal data. Welcome to the Isabl Documentation!
Isabl is a plug-and-play data science framework designed to support the processing of multimodal patient-centric data. Have questions? Ask here.
Isabl has been developed by Elli Papaemmanuil's Lab.
Features
👾 Backend, Data Model and RESTful API
Metadata version control
Fully featured and brisk RESTful API with extensive swagger documentation
Comprehensive permissions controls and user groups
Patient centric relational model with support for:
Individuals, samples, experiments and cohorts
Assembly aware bioinformatics applications and analyses
Choice models such as diseases, centers and more
Custom fields for all schemas!
🤖 Command Line Interface and Software Development Kit
Digital Assets Management (Permissions, Storage, Tracking)
Automated execution and tracking of bioinformatics applications
Project and patient level results auto-merge
Operational automations on data import and analyses status change
Dynamic retrieval of data and results using versatile queries
Fully featured SDK for post-processing analyses
🚀 Web Application
User Interface to browse and manage the operations metadata
Analyses tracking and results visualization
Flexibility to edit and customize models
Batch creation of metadata by excel file submission
Single Page Application that provides a crisp user experience
Possibility to integrate third-party services like JIRA
✅ Plug-n-play and reliable codebase
Docker-compose is the only dependency for the web application and the backend
The Command Line Interface is a portable pip installable package
Continuously integrated, with over 98% test coverage across the codebase
Isabl is upgradable, so there is no need to fork the codebase
Who is using Isabl
The Department of Pediatrics at Memorial Sloan Kettering.
The Microbiome Program at Memorial Sloan Kettering.
The Single-Cell Analytics Service (SAIL) at Memorial Sloan Kettering.
Christina Curtis' Lab at Stanford Medicine.
Many other groups, including teams at Weill Cornell, California State University, and the University of Oviedo (Spain), are currently testing it as a potential fit!
Infrastructure
Isabl is a modular infrastructure with four main components: (1) an individual-centric and extensible relational database (Isabl-db); (2) a comprehensive RESTful API (Isabl-api) used to support integration with data processing environments and enterprise systems (e.g. clinical databases, visualization platforms); (3) a Command Line Client (CLI; Isabl-cli) used to manage digital assets and deploy bioinformatics applications; (4) a front end single page web application (Isabl-web) with system wide queries enabled.
RESTful API capabilities are documented with Swagger (https://swagger.io) and Redoc (https://github.com/Rebilly/ReDoc) following OpenAPI specifications (https://www.openapis.org). Importantly, Isabl's metadata infrastructure is decoupled and agnostic of compute and data storage environments (e.g. local, cluster, cloud). This functionality separates dependencies and fosters interoperability across compute environments.
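Because the metadata lives behind a RESTful API, any environment that can issue an HTTP request can query it. The snippet below is a minimal sketch of composing such a query URL with the Python standard library. The host, the `api/v1/` prefix, the `experiments` endpoint, and the filter names are all assumptions made for illustration; the Swagger and Redoc pages of a given deployment are the authoritative list of endpoints and filters.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical base URL; replace with the address of a real deployment.
BASE_URL = "https://isabl.example.org/api/v1/"


def build_query_url(endpoint: str, **filters: object) -> str:
    """Compose a GET URL from an endpoint name and keyword filters."""
    # Sort the filters so the same query always yields the same URL.
    query = urlencode(sorted(filters.items()))
    url = urljoin(BASE_URL, endpoint)
    return f"{url}?{query}" if query else url


# Hypothetical endpoint and filter names, for illustration only.
print(build_query_url("experiments", technique="WGS", projects=102))
```

The same URL could be requested from a laptop, a cluster node, or a cloud function, which is the practical meaning of the metadata layer being agnostic of where compute and storage live.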
Data Model
Isabl's relational model maps workflows for data provenance, processing, and governance. Metadata is captured across the following thematic categories: (1) project, individual and sample level attributes; (2) raw data properties including experimental technique, technology, and related parameters (e.g. read length); (3) analytical workflows to include a complete audit trail of versioned algorithms, related execution parameters, reference files, analyses status tracking, and results deposition; (4) data governance information for management of system and data access across stakeholders.
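The first three categories above can be pictured as a small chain of linked records. The following is a minimal sketch, not Isabl's actual schema: every class name, field name, and value is a hypothetical stand-in chosen to show how an analysis can be traced back to the individual it concerns.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Individual:
    # Hypothetical individual-level attributes.
    identifier: str
    species: str


@dataclass
class Sample:
    identifier: str
    individual: Individual
    category: str  # hypothetical sample-level attribute


@dataclass
class Experiment:
    identifier: str
    sample: Sample
    technique: str  # experimental technique used to produce the raw data
    raw_data_params: Dict[str, object] = field(default_factory=dict)
    projects: List[str] = field(default_factory=list)


@dataclass
class Analysis:
    application: str  # name of the versioned algorithm
    version: str
    targets: List[Experiment]
    parameters: Dict[str, object] = field(default_factory=dict)
    status: str = "CREATED"  # hypothetical initial status


individual = Individual("IND-1", "HUMAN")
sample = Sample("SMP-1", individual, "TUMOR")
experiment = Experiment("EXP-1", sample, "WGS", {"read_length": 150}, ["PROJECT-A"])
analysis = Analysis("aligner", "1.0.0", [experiment], {"threads": 4})

# Walking the links recovers the individual behind any analysis.
print(analysis.targets[0].sample.individual.identifier)
```

Keeping the application name, version, and parameters on the analysis record is what makes an audit trail possible: the record alone says what was run, on which data, and with which settings.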
Why Isabl
Isabl ensures that all bioinformatics operations follow the DATA reproducibility checklist (Documentation, Automation, Traceability, and Autonomy), while guaranteeing that assets are managed according to the FAIR principles (Findable, Accessible, Interoperable, Reusable).
Here are some reasons why you may want to use Isabl:
You don't have a team of 10+ engineers but do have hundreds of samples
You'd rather not have your data managed by postdocs and PhD students
Crosslink samples from different cohorts
Answer new questions using existing data
Full log and audit trail of your informatics operations
Automatically merge results as new samples are added to big cohorts
You want to have programmatic access to the entire data capital
Seamlessly run reproducible pipelines across your projects
Similar projects
The Genome Modeling System: the Genome Institute at Washington University's platform.
SeqWare: analyze massive genomics datasets.
QuickNGS: efficient high-throughput data analysis of Next-Generation Sequencing data.
HTS-flow: a framework for the management and analysis of NGS data.
What Isabl is not
Isabl is not a Platform as a Service (PaaS) provider such as DNAnexus, Seven Bridges, or FireCloud; instead, it is an information system that could potentially feed metadata and data into these services.
Isabl differs from Server Workbenches such as Galaxy or Pegasus: instead of being configuration friendly, Isabl is designed to conduct systematic analyses automatically and in a standardized way, with as little human input as possible.
Isabl is not a Workflow Language; instead, the Bioinformatics Applications in Isabl only define metadata-driven validation and logic to build commands that trigger pipelines written in any language.
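To illustrate that last point, here is a minimal sketch of what metadata-driven validation and command building could look like. It is plain Python and deliberately does not use the isabl-cli API: the function names, the metadata keys, and the `run_pipeline.sh` script are all hypothetical.

```python
import shlex
from typing import Dict, List


def validate_experiment(experiment: Dict[str, str]) -> None:
    """Reject experiments whose metadata does not fit this application."""
    name = experiment.get("identifier")
    if experiment.get("technique") != "WGS":
        raise ValueError(f"{name}: only WGS experiments are supported")
    if not experiment.get("raw_data"):
        raise ValueError(f"{name}: no raw data registered")


def build_command(experiment: Dict[str, str], outdir: str) -> str:
    """Turn validated metadata into a shell command for an external pipeline."""
    validate_experiment(experiment)
    args: List[str] = [
        "bash", "run_pipeline.sh",  # hypothetical pipeline; could be written in any language
        "--input", experiment["raw_data"],
        "--assembly", experiment["assembly"],
        "--outdir", outdir,
    ]
    # Quote each argument so paths with spaces survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)


experiment = {
    "identifier": "EXP-1",
    "technique": "WGS",
    "raw_data": "/data/EXP-1.fastq.gz",
    "assembly": "GRCh37",
}
print(build_command(experiment, "/results/EXP-1"))
```

The pipeline itself stays outside this code. The application layer only decides whether the metadata is acceptable and what command to issue, which is why the underlying workflow can be written in any language.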