{ "cells": [ { "cell_type": "markdown", "metadata": { "nbsphinx": "hidden" }, "source": [ "# Vitessce Data Preparation Tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Export data to local files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Import dependencies\n", "\n", "We need to import the classes and functions that we will be using from the corresponding packages." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import json\n", "from urllib.parse import quote_plus\n", "from os.path import join, isfile, isdir\n", "from urllib.request import urlretrieve\n", "from anndata import read_h5ad\n", "import scanpy as sc\n", "\n", "from vitessce import (\n", " VitessceWidget,\n", " VitessceConfig,\n", " Component as cm,\n", " CoordinationType as ct,\n", " AnnDataWrapper,\n", ")\n", "from vitessce.data_utils import (\n", " optimize_adata,\n", " VAR_CHUNK_SIZE,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Download and process data\n", "\n", "For this example, we need to download a dataset from the COVID-19 Cell Atlas https://www.covid19cellatlas.org/index.healthy.html#habib17." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "adata_filepath = join(\"data\", \"habib17.processed.h5ad\")\n", "if not isfile(adata_filepath):\n", " os.makedirs(\"data\", exist_ok=True)\n", " urlretrieve('https://covid19.cog.sanger.ac.uk/habib17.processed.h5ad', adata_filepath)\n", "\n", "adata = read_h5ad(adata_filepath)\n", "top_dispersion = adata.var[\"dispersions_norm\"][\n", " sorted(\n", " range(len(adata.var[\"dispersions_norm\"])),\n", " key=lambda k: adata.var[\"dispersions_norm\"][k],\n", " )[-51:][0]\n", "]\n", "adata.var[\"top_highly_variable\"] = (\n", " adata.var[\"dispersions_norm\"] > top_dispersion\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "zarr_filepath = join(\"data\", \"habib17.processed.zarr\")\n", "if not isdir(zarr_filepath):\n", " adata = optimize_adata(\n", " adata,\n", " obs_cols=[\"CellType\"],\n", " obsm_keys=[\"X_umap\"],\n", " var_cols=[\"top_highly_variable\"],\n", " optimize_X=True,\n", " )\n", " adata.write_zarr(zarr_filepath, chunks=[adata.shape[0], VAR_CHUNK_SIZE])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Create the Vitessce configuration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set up the configuration by adding the views and datasets of interest." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vc = VitessceConfig(schema_version=\"1.0.15\", name='Habib et al', description='COVID-19 Healthy Donor Brain')\n", "dataset = vc.add_dataset(name='Brain').add_object(AnnDataWrapper(\n", " adata_path=zarr_filepath,\n", " obs_embedding_paths=[\"obsm/X_umap\"],\n", " obs_embedding_names=[\"UMAP\"],\n", " obs_set_paths=[\"obs/CellType\"],\n", " obs_set_names=[\"Cell Type\"],\n", " obs_feature_matrix_path=\"X\",\n", " feature_filter_path=\"var/top_highly_variable\"\n", "))\n", "scatterplot = vc.add_view(cm.SCATTERPLOT, dataset=dataset, mapping=\"X_umap\")\n", "cell_sets = vc.add_view(cm.OBS_SETS, dataset=dataset)\n", "genes = vc.add_view(cm.FEATURE_LIST, dataset=dataset)\n", "heatmap = vc.add_view(cm.HEATMAP, dataset=dataset)\n", "vc.layout((scatterplot | (cell_sets / genes)) / heatmap);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Export files to a local directory\n", "\n", "The `.export(to='files')` method on the view config instance will export files to the specified directory `out_dir`. The `base_url` parameter is required so that the file URLs in the view config point to the location where you ultimately intend to serve the files." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config_dict = vc.export(to='files', base_url='http://localhost:3000', out_dir='./test')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Serve the files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that the files have been saved to the `./test` directory, they can be served by any static web server.\n", "\n", "If you would like to serve the files locally, we recommend [http-server](https://github.com/http-party/http-server) which can be installed with NPM or Homebrew:\n", "```sh\n", "cd test\n", "http-server ./ --cors -p 3000\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. View on vitessce.io\n", "\n", "The returned view config dict can be converted to a URL, and if the files are served on the internet (rather than locally), this URL can be used to share the interactive visualizations with colleagues." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vitessce_url = \"http://vitessce.io/?url=data:,\" + quote_plus(json.dumps(config_dict))\n", "import webbrowser\n", "webbrowser.open(vitessce_url)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.0" } }, "nbformat": 4, "nbformat_minor": 4 }