Data preparation APIs
Dataset wrapper classes provide functionality for adding in-memory or local data objects to datasets when rendering Vitessce as a Jupyter widget.
We provide default wrapper class implementations for data formats used by popular single-cell and imaging packages.
To write your own custom wrapper class, create a subclass of the AbstractWrapper class, implementing the getter functions for the data types that can be derived from your object.
vitessce.wrappers
- class vitessce.wrappers.AbstractWrapper(**kwargs)[source]
An abstract class that can be extended when implementing custom dataset object wrapper classes.
Abstract constructor to be inherited by dataset wrapper classes.
- Parameters
out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.
- auto_view_config(vc)[source]
Auto view configuration is intended to be used internally by the VitessceConfig.from_object method. Each subclass of AbstractWrapper may implement this method which takes in a VitessceConfig instance and modifies it by adding datasets, visualization components, and view coordinations. Implementations of this method may create an opinionated view config based on inferred use cases.
- Parameters
vc (VitessceConfig) – The view config instance.
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return a value.
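The pattern described for convert_and_save can be sketched without Vitessce installed. The class below is a stand-in that only mimics the file_def_creators and routes attributes; it is not a subclass of the real AbstractWrapper, and the file name cells.csv, the fileType value, and the URL layout are assumptions made for illustration.

```python
class SketchWrapper:
    """Stand-in mimicking the attributes described above (not the real AbstractWrapper)."""

    def __init__(self):
        self.file_def_creators = []  # functions mapping a base URL to a file definition
        self.routes = []             # would hold server routes when wrapping local data

    def convert_and_save(self, dataset_uid, obj_i, base_dir=None):
        # Hypothetical URL layout: <base_url>/<dataset_uid>/<obj_i>/cells.csv
        def make_file_def(base_url):
            return {
                "fileType": "obsEmbedding.csv",  # assumed value, for illustration only
                "url": f"{base_url}/{dataset_uid}/{obj_i}/cells.csv",
            }

        self.file_def_creators.append(make_file_def)
        # Nothing is returned; the results live on the instance.


wrapper = SketchWrapper()
wrapper.convert_and_save("A", 0)

# Later, once the base URL is known, each creator is called to build a file definition.
file_defs = [create("http://localhost:8000") for create in wrapper.file_def_creators]
print(file_defs[0]["url"])
```

Deferring URL construction to a function means the same wrapper can presumably be reused whether the data ends up served locally or from a remote host.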
- get_local_dir_route(dataset_uid, obj_i, local_dir_path, local_dir_uid)[source]
Obtain the Mount for some local directory
- Parameters
dataset_uid (str) – A dataset unique identifier for the Mount
obj_i (str) – An index of the current vitessce.wrappers.AbstractWrapper among all other wrappers in the view config
local_dir_path (str) – The path to the local directory to serve.
local_dir_uid (str) – The UID to include as the route path suffix.
- Returns
A starlette Mount of the local_dir_path
- Return type
list[starlette.routing.Mount]
- class vitessce.wrappers.AnnDataWrapper(adata_path=None, adata_url=None, obs_feature_matrix_path=None, feature_filter_path=None, initial_feature_filter_path=None, obs_set_paths=None, obs_set_names=None, obs_locations_path=None, obs_segmentations_path=None, obs_embedding_paths=None, obs_embedding_names=None, obs_embedding_dims=None, request_init=None, feature_labels_path=None, obs_labels_path=None, convert_to_dense=True, coordination_values=None, obs_labels_paths=None, obs_labels_names=None, **kwargs)[source]
Wrap an AnnData object by creating an instance of the AnnDataWrapper class.
- Parameters
adata_path (str) – A path to an AnnData object written to a Zarr store containing single-cell experiment data.
adata_url (str) – A remote URL pointing to a Zarr-backed AnnData store.
obs_feature_matrix_path (str) – Location of the expression (cell x gene) matrix, like X or obsm/highly_variable_genes_subset
feature_filter_path (str) – A string like var/highly_variable used in conjunction with obs_feature_matrix_path if obs_feature_matrix_path points to a subset of X of the full var list.
initial_feature_filter_path (str) – A string like var/highly_variable used in conjunction with obs_feature_matrix_path if obs_feature_matrix_path points to a subset of X of the full var list.
obs_set_paths (list[str]) – Column names like ['obs/louvain', 'obs/cellType'] for showing cell sets
obs_set_names (list[str]) – Names to display in place of those in obs_set_paths, like ['Louvain', 'Cell Type']
obs_locations_path (str) – Column name in obsm that contains centroid coordinates for displaying centroids in the spatial viewer
obs_segmentations_path (str) – Column name in obsm that contains polygonal coordinates for displaying outlines in the spatial viewer
obs_embedding_paths (list[str]) – Column names like ['obsm/X_umap', 'obsm/X_pca'] for showing scatterplots
obs_embedding_names (list[str]) – Overriding names like ['UMAP', 'PCA'] for displaying above scatterplots
obs_embedding_dims (list[str]) – Dimensions along which to get data for the scatterplot, like [[0, 1], [4, 5]] where [0, 1] is just the normal x and y but [4, 5] could be comparing the fifth and sixth principal components (indices are zero-based), for example.
request_init (dict) – Options to be passed along with every fetch request from the browser, like { "headers": { "Authorization": "Bearer dsfjalsdfa1431" } }
feature_labels_path (str) – The name of a column containing feature labels (e.g., alternate gene symbols), instead of the default index in var of the AnnData store.
obs_labels_path (str) – (DEPRECATED) The name of a column containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store. Use obs_labels_paths and obs_labels_names instead. This arg will be removed in a future release.
obs_labels_paths (list[str]) – The names of columns containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store.
obs_labels_names (list[str]) – The optional display names of columns containing observation labels (e.g., alternate cell IDs), instead of the default index in obs of the AnnData store.
convert_to_dense (bool) – Whether or not to convert X to dense in the Zarr store (dense is faster but takes more disk space).
coordination_values (dict or None) – Coordination values for the file definition.
**kwargs – Keyword arguments inherited from AbstractWrapper
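As a rough illustration of how these parameters fit together, the sketch below assembles a set of keyword arguments and shows what one obs_embedding_dims pair selects. The store path and column names are hypothetical, the random-free stand-in array replaces a real adata.obsm entry, and the one-to-one pairing of the path and name lists is an assumption inferred from the descriptions above. The actual constructor call is left commented out because it requires Vitessce and a Zarr store on disk.

```python
import numpy as np

# Hypothetical store path and column names; adjust to match your AnnData object.
anndata_wrapper_kwargs = {
    "adata_path": "my_dataset.h5ad.zarr",
    "obs_feature_matrix_path": "X",
    "obs_set_paths": ["obs/louvain", "obs/cellType"],
    "obs_set_names": ["Louvain", "Cell Type"],
    "obs_embedding_paths": ["obsm/X_umap", "obsm/X_pca"],
    "obs_embedding_names": ["UMAP", "PCA"],
    "obs_embedding_dims": [[0, 1], [4, 5]],
}

# Assumed convention: each list of paths lines up one-to-one with its list of names.
for prefix in ("obs_set", "obs_embedding"):
    paths = anndata_wrapper_kwargs[f"{prefix}_paths"]
    names = anndata_wrapper_kwargs[f"{prefix}_names"]
    assert len(paths) == len(names)

# What a dims pair selects: two zero-based columns of the embedding array.
pca = np.arange(50 * 10, dtype=float).reshape(50, 10)  # stand-in for adata.obsm["X_pca"]
xy = pca[:, anndata_wrapper_kwargs["obs_embedding_dims"][1]]
print(xy.shape)  # (50, 2)

# With Vitessce installed and the store on disk, the wrapper would be created as:
# from vitessce.wrappers import AnnDataWrapper
# wrapper = AnnDataWrapper(**anndata_wrapper_kwargs)
```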
- auto_view_config(vc)[source]
Auto view configuration is intended to be used internally by the VitessceConfig.from_object method. Each subclass of AbstractWrapper may implement this method which takes in a VitessceConfig instance and modifies it by adding datasets, visualization components, and view coordinations. Implementations of this method may create an opinionated view config based on inferred use cases.
- Parameters
vc (VitessceConfig) – The view config instance.
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return a value.
- class vitessce.wrappers.CsvWrapper(csv_path=None, csv_url=None, data_type=None, options=None, coordination_values=None, **kwargs)[source]
Wrap a CSV file by creating an instance of the CsvWrapper class.
- Parameters
data_type (str) – The data type of the information contained in the file.
csv_path (str) – A local filepath to a CSV file.
csv_url (str) – A remote URL of a CSV file.
options (dict) – The file options.
coordination_values (dict) – The coordination values.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return a value.
- class vitessce.wrappers.MultiImageWrapper(image_wrappers, use_physical_size_scaling=False, **kwargs)[source]
Wrap multiple imaging datasets by creating an instance of the MultiImageWrapper class.
- Parameters
image_wrappers (list) – A list of imaging wrapper classes (only OmeTiffWrapper is currently supported).
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return a value.
- class vitessce.wrappers.MultivecZarrWrapper(zarr_path=None, zarr_url=None, **kwargs)[source]
Abstract constructor to be inherited by dataset wrapper classes.
- Parameters
out_dir (str) – The path to a local directory used for data processing outputs. By default, a temporary directory is used.
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return a value.
- class vitessce.wrappers.OmeTiffWrapper(img_path=None, offsets_path=None, img_url=None, offsets_url=None, name='', transformation_matrix=None, is_bitmask=False, **kwargs)[source]
Wrap an OME-TIFF file by creating an instance of the OmeTiffWrapper class.
- Parameters
img_path (str) – A local filepath to an OME-TIFF file.
offsets_path (str) – A local filepath to an offsets.json file.
img_url (str) – A remote URL of an OME-TIFF file.
offsets_url (str) – A remote URL of an offsets.json file.
name (str) – The display name for this OME-TIFF within Vitessce.
transformation_matrix (list[number]) – A column-major ordered matrix for transforming this image (see http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/#homogeneous-coordinates for more information).
is_bitmask (bool) – Whether or not this image is a bitmask.
**kwargs – Keyword arguments inherited from AbstractWrapper
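The column-major ordering of transformation_matrix is easy to get wrong, so a small sketch may help. It builds a 4x4 homogeneous transform (a scale plus a translation) in the usual row-by-row notation and flattens it column by column with NumPy. The scale and shift values are hypothetical, and passing the result as a flat 16-element list is an assumption based on the list[number] type given above.

```python
import numpy as np

scale_x, scale_y = 2.0, 2.0     # hypothetical scale factors
shift_x, shift_y = 100.0, 50.0  # hypothetical translation

# The transform as it is usually written on paper, row by row.
matrix = np.array([
    [scale_x, 0.0,     0.0, shift_x],
    [0.0,     scale_y, 0.0, shift_y],
    [0.0,     0.0,     1.0, 0.0],
    [0.0,     0.0,     0.0, 1.0],
])

# Column-major flattening: walk down each column in turn (Fortran order).
transformation_matrix = matrix.flatten(order="F").tolist()

# The translation ends up at the start of the fourth column.
print(transformation_matrix[12:14])  # [100.0, 50.0]
```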
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return a value.
- class vitessce.wrappers.OmeZarrWrapper(img_path=None, img_url=None, name='', is_bitmask=False, **kwargs)[source]
Wrap an OME-NGFF Zarr store by creating an instance of the OmeZarrWrapper class.
- Parameters
img_path (str) – A local filepath to an OME-NGFF Zarr store.
img_url (str) – A remote URL of an OME-NGFF Zarr store.
**kwargs – Keyword arguments inherited from AbstractWrapper
- convert_and_save(dataset_uid, obj_i, base_dir=None)[source]
Fill in the file_def_creators array. Each function added to this list should take in a base URL and generate a Vitessce file definition. If this wrapper is wrapping local data, then create routes and fill in the routes array. This method should not return a value.
vitessce.export
- vitessce.export.export_to_files(config, base_url, out_dir='.')[source]
- Parameters
config (VitessceConfig) – The Vitessce view config to export to files.
out_dir (str) – The path to the output directory. By default, the current directory.
base_url (str) – The URL on which the files will be served.
- Returns
The config as a dict, with URLs filled in.
- Return type
dict
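Because export_to_files returns a plain dict, one plausible follow-up step is to save that dict as JSON next to the exported files. The sketch below uses a stand-in dict rather than a real return value; its structure is abbreviated, and the file name vitessce.json, the URLs, and the dataset names are hypothetical.

```python
import json
import os
import tempfile

# Stand-in for the dict returned by export_to_files (abbreviated, values hypothetical).
config_dict = {
    "name": "My config",
    "datasets": [
        {
            "uid": "A",
            "name": "My dataset",
            "files": [
                {"fileType": "anndata.zarr", "url": "http://localhost:3000/A/0/my_dataset.zarr"},
            ],
        },
    ],
}

# A temporary directory stands in for the out_dir passed to export_to_files.
out_dir = tempfile.mkdtemp()
config_path = os.path.join(out_dir, "vitessce.json")  # hypothetical file name

with open(config_path, "w") as f:
    json.dump(config_dict, f, indent=2)

# Reading the file back confirms that the dict is JSON-serializable as written.
with open(config_path) as f:
    reloaded = json.load(f)
```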
- vitessce.export.export_to_s3(config, s3, bucket_name, prefix='')[source]
- Parameters
config (VitessceConfig) – The Vitessce view config to export to S3.
s3 (boto3.resource) – A boto3 S3 resource object with permission to upload to the specified bucket.
bucket_name (str) – The name of the bucket to which to upload.
prefix (str) – The prefix path for the bucket keys (think subdirectory).
- Returns
The config as a dict, with S3 URLs filled in.
- Return type
dict
vitessce.data_utils
- vitessce.data_utils.ome.multiplex_img_to_ome_tiff(img_arr, channel_names, output_path, axes='CYX')[source]
Convert a multiplexed image to OME-TIFF.
- vitessce.data_utils.ome.multiplex_img_to_ome_zarr(img_arr, channel_names, output_path, img_name='Image', chunks=(1, 256, 256), axes='cyx', channel_colors=None)[source]
Convert a multiplexed image to OME-Zarr v0.3.
- Parameters
img_arr (np.array) – The image as a 3D, 4D, or 5D array.
channel_names (list[str]) – A list of channel names to include in the omero.channels[].label NGFF metadata field.
output_path (str) – The path to save the Zarr store.
img_name (str) – The name of the image to include in the omero.name NGFF metadata field.
chunks (tuple[int]) – The chunk sizes of each axis. By default, (1, 256, 256).
axes (str) – The array axis ordering. By default, "cyx".
channel_colors (dict or None) – Dict mapping channel names to color strings to use for the omero.channels[].color NGFF metadata field. If provided, keys should match channel_names. By default, None to use “FFFFFF” for all channels.
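To make the expected inputs concrete, the sketch below prepares arguments that match the parameter descriptions above: a 3D array in "cyx" order, one channel name per entry along the first axis, and a color dict keyed by those names. The channel names, colors, image size, and output path are hypothetical, and the random pixels stand in for real image data. The conversion call itself is commented out because it requires Vitessce.

```python
import numpy as np

channel_names = ["DAPI", "CD3", "CD20"]  # hypothetical channel names

# Random stand-in image with axes "cyx": (channels, height, width).
rng = np.random.default_rng(0)
img_arr = rng.integers(0, 65535, size=(len(channel_names), 512, 512), dtype=np.uint16)

# Hex color strings without a leading "#", in the style of the "FFFFFF" default.
channel_colors = {"DAPI": "0000FF", "CD3": "00FF00", "CD20": "FF0000"}

# Sanity checks before converting: one name per channel, one color per name.
assert img_arr.shape[0] == len(channel_names)
assert set(channel_colors) == set(channel_names)

# With Vitessce installed, the conversion would look like:
# from vitessce.data_utils.ome import multiplex_img_to_ome_zarr
# multiplex_img_to_ome_zarr(img_arr, channel_names, "multiplex.ome.zarr",
#                           channel_colors=channel_colors)
```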
- vitessce.data_utils.ome.rgb_img_to_ome_tiff(img_arr, output_path, img_name='Image', axes='CYX')[source]
Convert an RGB image to OME-TIFF.
- vitessce.data_utils.ome.rgb_img_to_ome_zarr(img_arr, output_path, img_name='Image', chunks=(1, 256, 256), axes='cyx')[source]
Convert an RGB image to OME-Zarr v0.3.
- Parameters
img_arr (np.array) – The image as a 3D array.
output_path (str) – The path to save the Zarr store.
img_name (str) – The name of the image to include in the omero.name NGFF metadata field.
chunks (tuple[int]) – The chunk sizes of each axis. By default, (1, 256, 256).
axes (str) – The array axis ordering. By default, "cyx".
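Many image readers return RGB data as (height, width, 3), whereas the default axes value here is "cyx". A minimal sketch of the reordering, using a synthetic image in place of a real one:

```python
import numpy as np

# Stand-in for an image loaded as (height, width, 3), i.e. "yxc" order.
rgb_yxc = np.zeros((480, 640, 3), dtype=np.uint8)
rgb_yxc[..., 0] = 255  # pure red

# Move the channel axis to the front to obtain "cyx" order.
rgb_cyx = np.moveaxis(rgb_yxc, -1, 0)
print(rgb_cyx.shape)  # (3, 480, 640)
```

Alternatively, the axes parameter could presumably be set to describe the array's existing ordering, in which case no reordering would be needed.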
- vitessce.data_utils.anndata.cast_arr(arr)[source]
Try to cast an array to a dtype that takes up less space.
- Parameters
arr (np.array) – The array to cast.
- Returns
The new array.
- Return type
np.array
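The idea of casting to a smaller dtype can be sketched as follows. This is not the implementation used by cast_arr; it is an illustrative stand-in (the name downcast_sketch is hypothetical) that picks the smallest integer dtype covering the value range and only narrows float64 to float32 when no precision is lost.

```python
import numpy as np


def downcast_sketch(arr):
    """Illustrative stand-in: return arr in a smaller dtype when that is lossless."""
    arr = np.asarray(arr)
    if arr.size == 0:
        return arr
    if np.issubdtype(arr.dtype, np.integer):
        lo, hi = int(arr.min()), int(arr.max())
        # Try candidates from smallest to largest and keep the first that fits.
        for candidate in (np.uint8, np.int8, np.uint16, np.int16, np.uint32, np.int32):
            info = np.iinfo(candidate)
            if info.min <= lo and hi <= info.max:
                return arr.astype(candidate)
        return arr
    if arr.dtype == np.float64:
        as_float32 = arr.astype(np.float32)
        # Only narrow when the round trip reproduces the original values exactly.
        if np.array_equal(as_float32.astype(np.float64), arr):
            return as_float32
    return arr


small = downcast_sketch(np.array([0, 3, 250], dtype=np.int64))
print(small.dtype)  # uint8
```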
- vitessce.data_utils.anndata.optimize_adata(adata, obs_cols=None, obsm_keys=None, var_cols=None, varm_keys=None, layer_keys=None, remove_X=False, optimize_X=False, to_dense_X=False, to_sparse_X=False)[source]
Given an AnnData object, optimize for usage with Vitessce and return a new object.
- Parameters
adata (anndata.AnnData) – The AnnData object to optimize.
obs_cols (list[str] or None) – Columns of adata.obs to optimize. Columns not specified will not be included in the returned object.
var_cols (list[str] or None) – Columns of adata.var to optimize. Columns not specified will not be included in the returned object.
obsm_keys (list[str] or None) – Arrays within adata.obsm to optimize. Keys not specified will not be included in the returned object.
varm_keys (list[str] or None) – Arrays within adata.varm to optimize. Keys not specified will not be included in the returned object.
layer_keys (list[str] or None) – Arrays within adata.layers to optimize. Keys not specified will not be included in the returned object.
remove_X (bool) – Should the returned object have its X matrix set to None? By default, False.
optimize_X (bool) – Should the returned object run optimize_arr on adata.X? By default, False.
to_dense_X (bool) – Should adata.X be cast to a dense array in the returned object? By default, False.
to_sparse_X (bool) – Should adata.X be cast to a sparse array in the returned object? By default, False.
- Returns
The new AnnData object.
- Return type
anndata.AnnData
- vitessce.data_utils.anndata.optimize_arr(arr)[source]
Try to cast an array to a dtype that takes up less space, and convert to dense.
- Parameters
arr (np.array) – The array to cast and convert.
- Returns
The new array.
- Return type
np.array
- vitessce.data_utils.anndata.sort_var_axis(adata_X, orig_var_index, full_var_index=None)[source]
Sort the var index by performing hierarchical clustering.
- Parameters
adata_X (np.array) – The matrix to use for clustering. For example, adata.X
orig_var_index (pandas.Index) – The original var index. For example, adata.var.index
full_var_index (pandas.Index or None) – Pass the full adata.var.index to append the var values excluded from sorting, if adata_X and orig_var_index are a subset of the full adata.X matrix. By default, None.
- Returns
The sorted elements of the var index.
- Return type
- vitessce.data_utils.anndata.to_dense(arr)[source]
Convert a sparse array to dense.
- Parameters
arr (np.array) – The array to convert.
- Returns
The converted array (or the original array if it was already dense).
- Return type
np.array
- vitessce.data_utils.anndata.to_diamond(x, y, r)[source]
Convert an (x, y) coordinate to a polygon (diamond) with a given radius.
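Geometrically, a diamond of radius r around a point has one vertex at distance r in each axis direction. The sketch below shows that construction; the function name, the vertex order, and the list-of-lists return format are assumptions and may differ from what to_diamond actually returns.

```python
def to_diamond_sketch(x, y, r):
    """Return the four vertices of a diamond centered on (x, y) with radius r."""
    return [
        [x, y + r],  # one vertical extreme
        [x + r, y],  # one horizontal extreme
        [x, y - r],  # opposite vertical extreme
        [x - r, y],  # opposite horizontal extreme
    ]


vertices = to_diamond_sketch(10, 20, 5)
print(vertices)  # [[10, 25], [15, 20], [10, 15], [5, 20]]
```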
- vitessce.data_utils.anndata.to_memory(arr)[source]
Try to load a backed AnnData array into memory.
- Parameters
arr (np.array) – The array to load.
- Returns
The loaded array.
- Return type
np.array