BD5 data format

What is BD5?

BD5 is an open binary format for representing quantitative data of biological dynamics. It is based on HDF5 (https://support.hdfgroup.org/HDF5/), a data model, library, and file format for storing and managing data that is developed by the HDF Group. Because a BD5 file is an HDF5 file, all the open source tools available for accessing and retrieving HDF5 data are also applicable to BD5.


More details can be found in the paper below:

  • Koji Kyoda, Kenneth H. L. Ho, Yukako Tohsato, Hiroya Itoga, Shuichi Onami (2020)
    BD5: An open HDF5-based data format to represent quantitative biological dynamics data, PLOS ONE, Volume 15, Issue 8, August 2020, Pages e0237468, https://doi.org/10.1371/journal.pone.0237468.
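
Since a BD5 file is an HDF5 file, a generic HDF5 library should be enough to inspect one. The following is a minimal sketch using the h5py package (one of several possible tools, chosen here for illustration). The helper name list_hdf5_contents, the file name demo.h5 and the example record are hypothetical; the throwaway file only exists so that the sketch runs on its own.

```python
import os
import tempfile

import h5py
import numpy as np


def list_hdf5_contents(path):
    """Return (name, kind) pairs for every group and dataset in an HDF5 file."""
    entries = []

    def visit(name, obj):
        kind = "dataset" if isinstance(obj, h5py.Dataset) else "group"
        entries.append((name, kind))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return entries


# Throwaway file with made-up content; a real BD5 file would be opened
# in exactly the same way.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_path, "w") as f:
    record = np.array([(0, b"nucleus")], dtype=[("oID", "i4"), ("name", "S16")])
    f.create_dataset("data/objectDef", data=record)

contents = list_hdf5_contents(demo_path)
for name, kind in contents:
    print(kind, name)
```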

Summary of BD5

BD5 has one container (an HDF5 group) named data. It includes

  • scaleUnit dataset for the definition of the spatial and time scales and units,
  • objectDef dataset for the definition of biological objects,
  • featureDef dataset for the definition of features of interest,
  • numbered groups (0, 1, ..., n), one for each time point in the time series,
  • trackInfo dataset for the information that tracks one object to another.
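
The layout listed above can be sketched with h5py and numpy as follows. This is only an illustration under stated assumptions: the file name, all example values, the string lengths, the field order, and the time field name "t" are hypothetical choices, and the feature group is left empty.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical output path used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "layout_sketch.h5")

scale_unit = np.array(
    [(b"3D+T", 0.1, 0.1, 0.5, b"micrometer", 1.0, b"minute")],
    dtype=[("dimension", "S8"), ("xScale", "f8"), ("yScale", "f8"),
           ("zScale", "f8"), ("sUnit", "S16"), ("tScale", "f8"), ("tUnit", "S16")],
)
object_def = np.array([(0, b"nucleus")], dtype=[("oID", "i4"), ("name", "S16")])
feature_def = np.array([(0, b"volume")], dtype=[("fID", "i4"), ("name", "S16")])
track_info = np.array([(b"o000001", b"o001001")], dtype=[("from", "S16"), ("to", "S16")])

# Row layout for a "point" entity; the field name "t" is an assumption.
point_dtype = np.dtype([("ID", "S16"), ("t", "f8"), ("entity", "S8"),
                        ("x", "f8"), ("y", "f8"), ("z", "f8"), ("label", "S16")])

with h5py.File(path, "w") as f:
    data = f.create_group("data")
    data.create_dataset("scaleUnit", data=scale_unit)
    data.create_dataset("objectDef", data=object_def)
    data.create_dataset("featureDef", data=feature_def)
    data.create_dataset("trackInfo", data=track_info)

    # One numbered group per time point, each holding object and feature groups.
    for t in range(2):
        time_group = data.create_group(str(t))
        objects = time_group.create_group("object")
        time_group.create_group("feature")
        rows = np.array(
            [(f"o{t:03d}001".encode("ascii"), float(t), b"point", 1.0, 2.0, 3.0, b"cell")],
            dtype=point_dtype,
        )
        # Dataset "0" refers to oID 0 in objectDef.
        objects.create_dataset("0", data=rows)

    top_level = sorted(data.keys())
    per_time_point = sorted(data["0"].keys())

print(top_level)
print(per_time_point)
```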

In the scaleUnit dataset, the spatial and time scales and units are defined.

  • dimension should be one of "0D", "1D", "2D", "3D", "0D+T", "1D+T", "2D+T", or "3D+T".
  • The fields are dimension, followed by xScale, yScale, zScale and sUnit.
  • In the case of time series data, tScale and tUnit should also be defined.
  • dimension, sUnit and tUnit must be string datatype; xScale, yScale, zScale and tScale can be one of the datatypes int, uint, float, double and string.
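
As an illustration of these rules, the sketch below builds a one-row scaleUnit record as a numpy structured array. The helper name make_scale_unit, the string lengths, and the choice of double for the scale fields are assumptions made for this example; the text above also permits int, uint, float and string for the scales.

```python
import numpy as np

VALID_DIMENSIONS = {"0D", "1D", "2D", "3D", "0D+T", "1D+T", "2D+T", "3D+T"}


def make_scale_unit(dimension, x_scale, y_scale, z_scale, s_unit,
                    t_scale=None, t_unit=None):
    """Return a one-row structured array describing scales and units."""
    if dimension not in VALID_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension!r}")

    # String lengths (S8, S16) are arbitrary choices for this sketch.
    fields = [("dimension", "S8"), ("xScale", "f8"), ("yScale", "f8"),
              ("zScale", "f8"), ("sUnit", "S16")]
    values = [dimension.encode("ascii"), x_scale, y_scale, z_scale,
              s_unit.encode("ascii")]

    # Time series data additionally carries tScale and tUnit.
    if dimension.endswith("+T"):
        if t_scale is None or t_unit is None:
            raise ValueError("time series data needs tScale and tUnit")
        fields += [("tScale", "f8"), ("tUnit", "S16")]
        values += [t_scale, t_unit.encode("ascii")]

    return np.array([tuple(values)], dtype=np.dtype(fields))


scale_unit = make_scale_unit("3D+T", 0.1, 0.1, 0.5, "micrometer", 1.0, "minute")
print(scale_unit.dtype.names)
```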

In the objectDef dataset, biological objects of interest are defined.

  • oID must be int datatype, and name must be string datatype.

In the featureDef dataset, features of interest are defined.

  • fID must be int datatype, and name must be string datatype.
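
The two definition datasets can be pictured as small lookup tables. The sketch below represents them as numpy structured arrays and turns them into dictionaries; the example names ("nucleus", "membrane", "volume") and the string length are made up for illustration.

```python
import numpy as np

# oID/fID are int and name is string, as required above.
object_def = np.array(
    [(0, b"nucleus"), (1, b"membrane")],
    dtype=[("oID", "i4"), ("name", "S32")],
)
feature_def = np.array(
    [(0, b"volume")],
    dtype=[("fID", "i4"), ("name", "S32")],
)

# Map each reference number to its human-readable name.
object_names = {int(row["oID"]): row["name"].decode("ascii") for row in object_def}
feature_names = {int(row["fID"]): row["name"].decode("ascii") for row in feature_def}

print(object_names)
print(feature_names)
```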


Each numbered group contains two groups, object and feature.

  • Each object group has numbered dataset(s) corresponding to the reference number of the biological object predefined in the objectDef dataset.
  • Each row of a numbered dataset includes the ID of the object and the spatiotemporal information of the object.
  • BD5 allows five entities: "point", "circle", "sphere", "line" and "face".
    • point: each row includes time and xyz (or xy) coordinates.
    • circle or sphere: each row includes time, xyz (or xy) coordinates, and radius.
    • line or face: each row includes time, xyz (or xy) coordinates, and sID.
      • The sID is the ID of a sequence of xyz coordinates. The set of lines or surfaces formed by connecting the xyz coordinates that share the same sID represents the spatial information of the object.
  • ID, entity and label must be string datatype. x, y, z and radius can be one of the datatypes int, uint, float, double and string. sID must be int.
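
To make the per-entity row layouts concrete, here is a sketch that builds a numpy dtype for each entity and groups the coordinates of a line object by sID. The helper name object_dtype, the field order, the time field name "t", the string lengths and the example rows are assumptions for illustration only.

```python
import numpy as np

# Extra per-row fields for each entity, following the list above.
ENTITY_EXTRA = {
    "point": [],
    "circle": [("radius", "f8")],
    "sphere": [("radius", "f8")],
    "line": [("sID", "i4")],
    "face": [("sID", "i4")],
}


def object_dtype(entity, spatial_dims=3):
    """Build a row dtype for one entity (field order is an assumption)."""
    if entity not in ENTITY_EXTRA:
        raise ValueError(f"unknown entity: {entity!r}")
    if spatial_dims not in (2, 3):
        raise ValueError("spatial_dims must be 2 (xy) or 3 (xyz)")
    fields = [("ID", "S16"), ("t", "f8"), ("entity", "S8")]
    fields += [(axis, "f8") for axis in "xyz"[:spatial_dims]]
    fields += ENTITY_EXTRA[entity]
    fields.append(("label", "S16"))
    return np.dtype(fields)


# Hypothetical 2D line object made of two coordinate sequences (sID 0 and 1).
rows = np.array(
    [
        (b"o000001", 0.0, b"line", 0.0, 0.0, 0, b"cell"),
        (b"o000001", 0.0, b"line", 1.0, 0.0, 0, b"cell"),
        (b"o000001", 0.0, b"line", 1.0, 1.0, 0, b"cell"),
        (b"o000001", 0.0, b"line", 5.0, 5.0, 1, b"cell"),
        (b"o000001", 0.0, b"line", 6.0, 5.0, 1, b"cell"),
    ],
    dtype=object_dtype("line", spatial_dims=2),
)

# Coordinates sharing an sID belong to the same connected sequence.
sequences = {}
for row in rows:
    point = (float(row["x"]), float(row["y"]))
    sequences.setdefault(int(row["sID"]), []).append(point)

print(sequences)
```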


In the trackInfo dataset, the information of tracking of one object to another is described. Each row includes "from" and "to", which correspond to the IDs of the linked objects at neighboring time points.

  • from and to must be string.
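
Once the from/to rows are read, following an object through time is a graph traversal. The sketch below uses plain Python on made-up IDs; treating several rows with the same "from" as a branching event (for example a division) is an assumption of this example, not something stated above.

```python
from collections import defaultdict

# Hypothetical (from, to) rows as they might be read from trackInfo.
track_info = [
    ("o000001", "o001001"),
    ("o001001", "o002001"),
    ("o001001", "o002002"),
]


def build_successors(rows):
    """Map each 'from' ID to the list of 'to' IDs at the next time point."""
    successors = defaultdict(list)
    for source, target in rows:
        successors[source].append(target)
    return dict(successors)


def descendants(successors, start):
    """Collect every ID reachable from start by following from -> to links."""
    found = []
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in successors.get(current, []):
            found.append(nxt)
            stack.append(nxt)
    return found


successors = build_successors(track_info)
lineage = descendants(successors, "o000001")
print(sorted(lineage))
```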


Sample Programs and Files

All programs are available at https://github.com/openssbd/BD5_samples/.

Program code, followed by its description:

  • BD5write_numpy.ipynb: example of writing line, face, sphere, point and circle objects, given as numpy arrays, to BD5 files.
  • BD5write_circle_detail.ipynb: example of reading image segmentation data stored in a TIF image file and writing it as a circle in BD5 format (with detailed explanation).
  • BD5write_circle.ipynb: example of reading image segmentation data stored in a TIF image file and writing it as a circle in BD5 format.
  • BD5write_point.ipynb: example of reading image segmentation data stored in a TIF image file and writing it as a point in BD5 format.
  • BD5write_line.ipynb: example of reading image segmentation data stored in a TIF image file and writing it as a line in BD5 format.
  • BD5write_timeseries.ipynb: example of reading ROI data for two time points stored in TIF image files, tracking the objects over the two time points, and writing the result to a BD5 file.
  • BD5write_trackinfo_point.ipynb: example of reading a BD5 file with time series information, tracking the objects over two time points, and appending the trackInfo information to a BD5 file.
  • CSVread_BD5write.ipynb: example of reading a BD5 file and writing out a CSV file, and of reading a CSV file and writing out a BD5 file.
  • BD5read_displacement.ipynb: example of reading a BD5 file and calculating the displacement of a labeled object.
  • BD5read_count.ipynb: example of reading a BD5 file and counting the number of objects at each time point.