What is BD5?
BD5 is an open binary format for representing quantitative data of biological dynamics. It is based on HDF5 (https://support.hdfgroup.org/HDF5/), which is developed by the HDF Group open source consortium. HDF5 is a data model, library, and file format for storing and managing data. Therefore, all the open source tools available for accessing and retrieving HDF5 data are also applicable to BD5.
More details can be found in the paper below:
- Koji Kyoda, Kenneth H. L. Ho, Yukako Tohsato, Hiroya Itoga, Shuichi Onami (2020)
BD5: An open HDF5-based data format to represent quantitative biological dynamics data, PLOS ONE, Volume 15, Issue 8, August 2020, Pages e0237468, https://doi.org/10.1371/journal.pone.0237468.
Summary of BD5
BD5 has one container (an HDF5 group) named data. It includes
- scaleUnit dataset for the definition of spatial and time scale and unit,
- objectDef dataset for the definition of biological objects,
- featureDef dataset for the definition of features of interest,
- numbered groups (0, 1, ..., n), one for each time point in the time series,
- trackInfo dataset for the information of tracking of one object to another.
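
The layout listed above can be sketched with h5py and numpy. This is a minimal illustration, not the reference writer: the group and dataset names come from the list, while the field values, string lengths, ID format, and file name are assumptions made for the example.

```python
import h5py
import numpy as np

# Field names follow the text; values and string lengths are illustrative assumptions.
scale_unit = np.array(
    [(b"3D+T", 0.1, 0.1, 0.5, b"micrometer", 1.0, b"minute")],
    dtype=[("dimension", "S8"), ("xScale", "f8"), ("yScale", "f8"),
           ("zScale", "f8"), ("sUnit", "S16"), ("tScale", "f8"), ("tUnit", "S16")],
)
object_def = np.array([(0, b"nucleus")], dtype=[("oID", "i4"), ("name", "S128")])
feature_def = np.array([(0, b"volume")], dtype=[("fID", "i4"), ("name", "S128")])
# The ID format "o000001" is hypothetical.
track_info = np.array([(b"o000001", b"o000002")],
                      dtype=[("from", "S32"), ("to", "S32")])

# In-memory HDF5 file (core driver without backing store): nothing is written to disk.
with h5py.File("bd5_layout_sketch.h5", "w", driver="core", backing_store=False) as f:
    data = f.create_group("data")
    data.create_dataset("scaleUnit", data=scale_unit)
    data.create_dataset("objectDef", data=object_def)
    data.create_dataset("featureDef", data=feature_def)
    for t in range(2):  # two time points, numbered 0 and 1
        step = data.create_group(str(t))
        step.create_group("object")
        step.create_group("feature")
    data.create_dataset("trackInfo", data=track_info)

    # Collect every group and dataset path in the file.
    paths = []
    f.visit(paths.append)

paths = sorted(paths)
for p in paths:
    print(p)
```

Because BD5 is plain HDF5, the same `visit` call should also work for inspecting an existing BD5 file opened in read mode.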
In the scaleUnit dataset, the spatial and time scales and units are defined.
- dimension should be described by "0D", "1D", "2D", "3D", "0D+T", "1D+T", "2D+T", or "3D+T".
- Each entry consists of dimension, followed by xScale, yScale, zScale, and sUnit.
- In the case of time series data, tScale and tUnit should be defined.
- dimension, sUnit, and tUnit must be of string datatype, and xScale, yScale, zScale, and tScale can be one of the following datatypes: int, uint, float, double, and string.
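
The rules above can be expressed as a small check. This is a hedged sketch in plain Python: the function name `check_scale_unit` and the dict representation of a scaleUnit entry are hypothetical, and only the constraints stated in the list are tested.

```python
ALLOWED_DIMENSIONS = {"0D", "1D", "2D", "3D", "0D+T", "1D+T", "2D+T", "3D+T"}


def check_scale_unit(record):
    """Return a list of problems found in a scaleUnit-like dict (empty if none)."""
    problems = []
    dimension = record.get("dimension")
    if dimension not in ALLOWED_DIMENSIONS:
        problems.append(f"unexpected dimension: {dimension!r}")
    # Time series data additionally needs tScale and tUnit.
    is_time_series = isinstance(dimension, str) and dimension.endswith("+T")
    required = ["sUnit"] + (["tScale", "tUnit"] if is_time_series else [])
    for key in required:
        if key not in record:
            problems.append(f"{key} is missing")
    # dimension, sUnit, and tUnit must be strings.
    for key in ("dimension", "sUnit", "tUnit"):
        if key in record and not isinstance(record[key], str):
            problems.append(f"{key} must be a string")
    return problems


ok = {"dimension": "3D+T", "xScale": 0.1, "yScale": 0.1, "zScale": 0.5,
      "sUnit": "micrometer", "tScale": 1.0, "tUnit": "minute"}
incomplete = {"dimension": "2D+T", "xScale": 1, "yScale": 1, "sUnit": "pixel"}

print(check_scale_unit(ok))
print(check_scale_unit(incomplete))
```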
In the objectDef dataset, the biological objects of interest are defined.
- oID must be of int datatype, and name must be of string datatype.
In the featureDef dataset, the features of interest are defined.
- fID must be of int datatype, and name must be of string datatype.
Each numbered group consists of two groups, object and feature groups.
- Each object group has numbered dataset(s) corresponding to the reference number of the biological object predefined in the objectDef dataset.
- Each row of a numbered object includes the ID of the object and the spatiotemporal information of the object.
- BD5 allows five entities, "point", "circle", "sphere", "line" and "face".
- point: each row includes time and xyz (or xy) coordinates.
- circle or sphere: each row includes time, xyz (or xy) coordinates, and radius.
- line or face: each row includes time, xyz (or xy) coordinates, and sID.
- The sID is the ID of a sequence of xyz coordinates. A set of regions or surfaces connected by the xyz coordinates that share the same sID represents the spatial information of the object.
- ID, entity, and label must be of string datatype. x, y, z, and radius can be one of the following datatypes: int, uint, float, double, and string. sID must be int.
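
To illustrate how sID ties coordinates together for the line and face entities, the following sketch groups rows into one coordinate sequence per (ID, sID) pair. The column order (ID, t, entity, x, y, sID), the ID format, and the helper name `group_by_sid` are assumptions made for this example, not part of the specification.

```python
from collections import defaultdict

# Hypothetical rows of a 2D "line" object dataset: (ID, t, entity, x, y, sID).
rows = [
    ("o000001", 0, "line", 1.0, 1.0, 0),
    ("o000001", 0, "line", 2.0, 1.5, 0),
    ("o000001", 0, "line", 3.0, 1.0, 0),
    ("o000001", 0, "line", 5.0, 5.0, 1),
    ("o000001", 0, "line", 6.0, 5.5, 1),
]


def group_by_sid(rows):
    """Collect xy coordinates into one ordered sequence per (ID, sID) pair."""
    sequences = defaultdict(list)
    for obj_id, _t, _entity, x, y, sid in rows:
        sequences[(obj_id, sid)].append((x, y))
    return dict(sequences)


polylines = group_by_sid(rows)
for key, coords in polylines.items():
    print(key, coords)
```

Here one object is described by two separate coordinate sequences, which is how rows with different sID values under the same ID would be read in this sketch.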
In the trackInfo dataset, the tracking of one object to another is described. Each row includes "from" and "to", corresponding to the IDs of the object at neighboring time points.
- from and to must be string.
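
As a rough illustration of how the from/to pairs can be used, the sketch below follows the links from a starting ID to reconstruct a track. The ID values and the helper `follow_track` are hypothetical, and the code assumes each "from" ID has at most one "to" ID, which would not hold for events such as cell division.

```python
# Hypothetical (from, to) pairs as they might appear in a trackInfo dataset.
track_info = [
    ("o000001", "o000004"),
    ("o000004", "o000007"),
    ("o000002", "o000005"),
]


def follow_track(start_id, pairs):
    """Follow 'from' -> 'to' links from start_id until no successor exists."""
    successor = dict(pairs)  # assumes at most one 'to' per 'from'
    track = [start_id]
    seen = {start_id}
    while track[-1] in successor:
        nxt = successor[track[-1]]
        if nxt in seen:  # guard against accidental cycles in the input
            break
        track.append(nxt)
        seen.add(nxt)
    return track


print(follow_track("o000001", track_info))
print(follow_track("o000002", track_info))
```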
Sample Programs and Files
All programs are available at https://github.com/openssbd/BD5_samples/.
Description | Program code |
---|---|
Example of writing line, face, sphere, point and circle objects as numpy arrays to BD5 files | BD5write_numpy.ipynb |
Example of reading image segmentation data stored in a TIF image file and writing it as a circle in BD5 format (with detailed explanation) | BD5write_circle_detail.ipynb |
Example of reading image segmentation data stored in a TIF image file and writing it as a circle in BD5 format | BD5write_circle.ipynb |
Example of reading image segmentation data stored in a TIF image file and writing it as a point in BD5 format | BD5write_point.ipynb |
Example of reading image segmentation data stored in a TIF image file and writing it as a line in BD5 format | BD5write_line.ipynb |
Example of reading two time-series ROI data sets stored in TIF image files, tracking the objects over the two time points, and writing the result to a BD5 file | BD5write_timeseries.ipynb |
Example of reading a BD5 file with time series information, calculating the tracking of objects over two time points, and appending the trackInfo information to a BD5 file | BD5write_trackinfo_point.ipynb |
Example of reading a BD5 file and writing out a CSV file, and of reading a CSV file and writing out a BD5 file | CSVread_BD5write.ipynb |
Example of reading a BD5 file and calculating the displacement of a labeled object | BD5read_displacement.ipynb |
Example of reading a BD5 file and counting the number of objects at each time point | BD5read_count.ipynb |