Hdf5JavaLib is a pure Java library for reading HDF5 files, released as version 0.2.1. This guide shows how to read datasets in the root group, using examples from the org.hdf5javalib.examples.read package. The library reads HDF5 files generated by the C++ HDF5 library, supporting compound datasets, multi-dimensional data, and various datatypes. New in version 0.2.1: enhanced sequential and parallel streaming, array flattening, slicing, reducing along axes, filtering with coordinate lists, and custom type converters for compound datasets.
Add Hdf5JavaLib to your project via Maven Central:
<dependency>
    <groupId>org.hdf5javalib</groupId>
    <artifactId>hdf5javalib</artifactId>
    <version>0.2.1</version>
</dependency>
Alternatively, build and install from source:
git clone https://github.com/karlnicholas/Hdf5JavaLib.git
cd Hdf5JavaLib
mvn install
The sample HDF5 files (compound_example.h5, twenty_datasets.h5, ascii_dataset.h5, utf8_dataset.h5, dimensions.h5, array_datasets.h5, dsgroup.h5, scalar.h5, weatherdata.h5, tictactoe_4d_state.h5, all_types_separate.h5, vlen_types_example.h5) are used in the examples below. They are located in src/test/resources. Each example opens its file through a SeekableByteChannel.
Hdf5JavaLib supports reading datasets in the root group (/), including:
- Compound datasets (e.g., /CompoundData with fixed_point, floating_point, and other members). Supports mapping to custom Java classes or records with custom type converters.
- Fixed-point data (readable as BigInteger, BigDecimal, Integer, Long, HdfFixedPoint, String).
- Floating-point data (readable as Float, Double, HdfFloatPoint, String).
- String data (readable as HdfString, String).
- Time data (readable as Long, BigInteger, HdfTime, String).
- Bitfield data (readable as BitSet, HdfBitField, String).
- Compound data (readable as HdfCompound, custom classes/records, String).
- Opaque data (readable as byte[], HdfOpaque, String).
- Reference data (readable as byte[], HdfReference, String).
- Enum data (readable as HdfEnum, String).
- Array data (readable as HdfArray, HdfData[], String).
- Variable-length data (readable as HdfVariableLength, Object, String).
- Multi-dimensional data, with flattening, slicing, and filtering via FlattenedArrayUtils.
Use the HdfFileReader class (in org.hdf5javalib.hdfjava) to read HDF5 files. The TypedDataSource class provides typed data access (scalar, vector, matrix, flattened) and streaming, and HdfDisplayUtils simplifies data display. The org.hdf5javalib.examples.read package demonstrates reading various datasets with advanced processing.
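In outline, every example follows the same pattern. Here is a minimal sketch, assuming a file data.h5 containing a vector dataset /temperature of doubles (both names are hypothetical):
// Minimal reading pattern: open a channel, parse the file structure,
// then stream a dataset through a TypedDataSource.
Path path = Paths.get("data.h5"); // hypothetical file
try (SeekableByteChannel channel = Files.newByteChannel(path, StandardOpenOption.READ)) {
    HdfFileReader reader = new HdfFileReader(channel).readFile();
    try (HdfDataset dataset = reader.getDataset("/temperature").orElseThrow()) {
        new TypedDataSource<>(channel, reader, dataset, Double.class)
                .streamVector()
                .forEach(System.out::println);
    }
}
The examples below elaborate on this pattern.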
This example reads a compound dataset (/CompoundData) from an HDF5 file, mapping it to a custom Java record with fields like fixed-point, floating-point, time, string, bitfield, opaque, nested compound, reference, enum, array, and variable-length data. It demonstrates streaming and counting rows.
package org.hdf5javalib.examples.read;

import org.hdf5javalib.dataclass.*;
import org.hdf5javalib.datasource.TypedDataSource;
import org.hdf5javalib.hdfjava.HdfDataFile;
import org.hdf5javalib.hdfjava.HdfDataset;
import org.hdf5javalib.hdfjava.HdfFileReader;

import java.io.IOException;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.BitSet;
import java.util.concurrent.atomic.AtomicInteger;

public class CompoundRead {
    private static final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(CompoundRead.class);

    public static void main(String[] args) {
        new CompoundRead().run();
    }

    private void run() {
        try {
            Path filePath = getResourcePath("compound_example.h5");
            try (SeekableByteChannel channel = Files.newByteChannel(filePath, StandardOpenOption.READ)) {
                HdfFileReader reader = new HdfFileReader(channel).readFile();
                try (HdfDataset dataSet = reader.getDataset("/CompoundData").orElseThrow()) {
                    displayData(channel, dataSet, reader);
                }
                log.debug("File BTree: {} ", reader.getBTree());
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private Path getResourcePath(String fileName) {
        String resourcePath = getClass().getClassLoader().getResource(fileName).getPath();
        // Strip the leading slash that URL.getPath() leaves on Windows paths.
        if (System.getProperty("os.name").toLowerCase().contains("windows") && resourcePath.startsWith("/")) {
            resourcePath = resourcePath.substring(1);
        }
        return Paths.get(resourcePath);
    }

    public record Record(
            Integer fixed_point,               // int32_t fixed_point
            Float floating_point,              // float floating_point
            Long time,                         // uint64_t time (Class 2 Time)
            String string,                     // char string[16]
            BitSet bit_field,                  // uint8_t bit_field
            HdfOpaque opaque,                  // uint8_t opaque[4]
            Compound compound,                 // nested struct compound
            HdfReference reference,            // hobj_ref_t reference
            HdfEnum enumerated,                // int enumerated (LOW, MEDIUM, HIGH)
            HdfArray array,                    // int array[3]
            HdfVariableLength variable_length  // hvl_t variable_length
    ) {
        public record Compound(
                Integer nested_int,    // int16_t nested_int
                Double nested_double   // double nested_double
        ) {
        }

        public enum Level {
            LOW(0), MEDIUM(1), HIGH(2);

            private final int value;

            Level(int value) {
                this.value = value;
            }

            public int getValue() {
                return value;
            }
        }
    }

    public void displayData(SeekableByteChannel seekableByteChannel, HdfDataset dataSet, HdfDataFile hdfDataFile) throws IOException {
        System.out.println("Ten Rows:");
        new TypedDataSource<>(seekableByteChannel, hdfDataFile, dataSet, HdfCompound.class)
                .streamVector()
                .limit(10)
                .forEach(c -> System.out.println("Row: " + c.getMembers()));

        AtomicInteger atomicInteger = new AtomicInteger(0);
        new TypedDataSource<>(seekableByteChannel, hdfDataFile, dataSet, HdfCompound.class)
                .streamVector()
                .forEach(c -> {
                    c.getMembers().toString(); // touch each row's members; the result is intentionally unused
                    atomicInteger.incrementAndGet();
                });
        System.out.println("DONE: " + atomicInteger.get());
    }
}
Output (example):
Ten Rows:
Row: [fixed_point=0, floating_point=0.0, time=0, string=string0, bit_field=00000001, opaque=opaque0, compound=[nested_int=0, nested_double=0.0], reference=reference0, enumerated=LOW, array=[0, 0, 0], variable_length=[0, 1, 2, 3, 4]]
Row: [fixed_point=1, floating_point=1.0, time=1, string=string1, bit_field=00000010, opaque=opaque1, compound=[nested_int=1, nested_double=1.0], reference=reference1, enumerated=MEDIUM, array=[1, 1, 1], variable_length=[1, 2, 3, 4, 5]]
...
DONE: 10000
Note: Replace getResourcePath with your own file loading logic (e.g., Files.newByteChannel(Paths.get("path/to/compound_example.h5"), StandardOpenOption.READ)) for custom HDF5 files.
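To materialize rows as the Record type above instead of HdfCompound, a converter can be registered first. The following is only a sketch using the CompoundDatatype.addConverter API shown later in SeparateTypesRead; it maps two fields and elides the rest, and it assumes the converter mechanism accepts records:
// Sketch: register a converter for Record, then stream rows as Record.
// Member names ("fixed_point", "floating_point") match the dataset's compound members.
CompoundDatatype.addConverter(Record.class, (bytes, compoundType) -> {
    Map<String, HdfCompoundMember> byName = compoundType.getInstance(HdfCompound.class, bytes)
            .getMembers().stream()
            .collect(Collectors.toMap(m -> m.getDatatype().getName(), m -> m));
    return new Record(
            byName.get("fixed_point").getInstance(Integer.class),
            byName.get("floating_point").getInstance(Float.class),
            null, null, null, null, null, null, null, null, null); // remaining fields elided
});
new TypedDataSource<>(channel, reader, dataSet, Record.class)
        .streamVector()
        .limit(3)
        .forEach(System.out::println);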
This example reads multiple scalar datasets from twenty_datasets.h5, displaying each value as a Long.
package org.hdf5javalib.examples.read;

import org.hdf5javalib.hdfjava.HdfDataset;
import org.hdf5javalib.hdfjava.HdfFileReader;
import org.hdf5javalib.utils.HdfDisplayUtils;

import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

import static org.hdf5javalib.utils.HdfReadUtils.getResourcePath;

public class TwentyScalarRead {
    private static final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(TwentyScalarRead.class);

    public static void main(String[] args) throws Exception {
        new TwentyScalarRead().run();
    }

    private void run() throws Exception {
        Path filePath = getResourcePath("twenty_datasets.h5");
        try (SeekableByteChannel channel = Files.newByteChannel(filePath, StandardOpenOption.READ)) {
            HdfFileReader reader = new HdfFileReader(channel).readFile();
            for (HdfDataset dataSet : reader.getDatasets()) {
                try (HdfDataset ds = dataSet) {
                    HdfDisplayUtils.displayScalarData(channel, ds, Long.class, reader);
                }
            }
            log.debug("Superblock: {} ", reader.getSuperblock());
        }
    }
}
Output (example):
Dataset0: 123
Dataset1: 456
...
Dataset19: 789
Note: Use your own HDF5 file path if not using resources.
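HdfDisplayUtils prints each value; to obtain a value programmatically, TypedDataSource can be used directly. A sketch, reusing the channel, ds, and reader variables from the loop above:
// Sketch: read the scalar value instead of printing it.
Long value = new TypedDataSource<>(channel, reader, ds, Long.class).readScalar();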
This example reads vectors of ASCII and UTF-8 strings from ascii_dataset.h5 and utf8_dataset.h5.
package org.hdf5javalib.examples.read;

import org.hdf5javalib.hdfjava.HdfDataset;
import org.hdf5javalib.hdfjava.HdfFileReader;
import org.hdf5javalib.utils.HdfDisplayUtils;

import java.io.FileInputStream;
import java.nio.channels.FileChannel;
import java.util.Objects;

public class StringRead {
    public static void main(String[] args) throws Exception {
        new StringRead().run();
    }

    private void run() throws Exception {
        String filePath = Objects.requireNonNull(StringRead.class.getResource("/ascii_dataset.h5")).getFile();
        try (FileInputStream fis = new FileInputStream(filePath)) {
            FileChannel channel = fis.getChannel();
            HdfFileReader reader = new HdfFileReader(channel).readFile();
            try (HdfDataset dataSet = reader.getDataset("/strings").orElseThrow()) {
                HdfDisplayUtils.displayVectorData(channel, dataSet, String.class, reader);
            }
        }

        filePath = Objects.requireNonNull(StringRead.class.getResource("/utf8_dataset.h5")).getFile();
        try (FileInputStream fis = new FileInputStream(filePath)) {
            FileChannel channel = fis.getChannel();
            HdfFileReader reader = new HdfFileReader(channel).readFile();
            try (HdfDataset dataSet = reader.getDataset("/strings").orElseThrow()) {
                HdfDisplayUtils.displayVectorData(channel, dataSet, String.class, reader);
            }
        }
    }
}
Output (example):
["Hello", "World", "HDF5"]
["UTF-8 String1", "UTF-8 String2"]
Note: Replace resource loading with FileChannel.open(Paths.get("path/to/file.h5")) for custom files.
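If you need the strings in code rather than printed, a TypedDataSource works here as well. A sketch, reusing the example's channel, reader, and dataSet:
// Sketch: stream the string vector directly.
new TypedDataSource<>(channel, reader, dataSet, String.class)
        .streamVector()
        .forEach(System.out::println);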
This example reads scalar, 1D, 2D, and array datasets from array_datasets.h5, displaying their data.
package org.hdf5javalib.examples.read;

import org.hdf5javalib.hdfjava.HdfDataset;
import org.hdf5javalib.hdfjava.HdfFileReader;

import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

import static org.hdf5javalib.utils.HdfDisplayUtils.displayData;
import static org.hdf5javalib.utils.HdfReadUtils.getResourcePath;

public class DimensionsRead {
    private static final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(DimensionsRead.class);

    public static void main(String[] args) {
        new DimensionsRead().run();
    }

    private void run() {
        try {
            Path filePath = getResourcePath("array_datasets.h5");
            try (SeekableByteChannel channel = Files.newByteChannel(filePath, StandardOpenOption.READ)) {
                HdfFileReader reader = new HdfFileReader(channel).readFile();
                log.debug("File BTree: {} ", reader.getBTree());
                for (HdfDataset dataSet : reader.getDatasets()) {
                    displayData(channel, dataSet, reader);
                }
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
Output (example):
Scalar dataset: 3.14
Vector dataset: [1.0, 2.0, 3.0]
Matrix dataset:
1.0 2.0
3.0 4.0
Note: Adapt getResourcePath for your file system.
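displayData hides the typed access; to pull the matrix out yourself, something like the following should work. A sketch, assuming the 2D dataset holds doubles and reusing the loop's channel, reader, and dataSet:
// Sketch: read the 2D dataset as a Double matrix.
Double[][] matrix = new TypedDataSource<>(channel, reader, dataSet, Double.class).readMatrix();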
This example reads fixed-point data from scalar, matrix, and 4D datasets, demonstrating streaming, flattening, slicing, and filtering.
package org.hdf5javalib.examples.read;

import org.hdf5javalib.datasource.TypedDataSource;
import org.hdf5javalib.hdfjava.HdfDataFile;
import org.hdf5javalib.hdfjava.HdfDataset;
import org.hdf5javalib.hdfjava.HdfFileReader;
import org.hdf5javalib.utils.FlattenedArrayUtils;

import java.io.IOException;
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import static org.hdf5javalib.utils.HdfReadUtils.getResourcePath;

public class FixedPointRead {
    private static final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(FixedPointRead.class);

    public static void main(String[] args) throws Exception {
        new FixedPointRead().run();
    }

    void run() throws Exception {
        Path filePath = getResourcePath("scalar.h5");
        try (SeekableByteChannel channel = Files.newByteChannel(filePath, StandardOpenOption.READ)) {
            HdfFileReader reader = new HdfFileReader(channel).readFile();
            log.debug("File BTree: {} ", reader.getBTree());
            tryScalarDataSpliterator(channel, reader, reader.getDatasets().get(0));
        }

        filePath = getResourcePath("weatherdata.h5");
        try (SeekableByteChannel channel = Files.newByteChannel(filePath, StandardOpenOption.READ)) {
            HdfFileReader reader = new HdfFileReader(channel).readFile();
            tryMatrixSpliterator(channel, reader, reader.getDataset("/weatherdata").orElseThrow());
        }

        filePath = getResourcePath("tictactoe_4d_state.h5");
        try (SeekableByteChannel channel = Files.newByteChannel(filePath, StandardOpenOption.READ)) {
            HdfFileReader reader = new HdfFileReader(channel).readFile();
            display4DData(channel, reader, reader.getDataset("/game").orElseThrow());
        }
    }

    void tryScalarDataSpliterator(SeekableByteChannel channel, HdfDataFile hdfDataFile, HdfDataset dataSet) throws IOException {
        TypedDataSource<BigInteger> dataSource = new TypedDataSource<>(channel, hdfDataFile, dataSet, BigInteger.class);
        BigInteger allData = dataSource.readScalar();
        System.out.println("Scalar dataset name = " + dataSet.getObjectName());
        System.out.println("Scalar readAll stats = " + Stream.of(allData)
                .collect(Collectors.summarizingInt(BigInteger::intValue)));
        System.out.println("Scalar streaming list = " + dataSource.streamScalar().toList());
        System.out.println("Scalar parallelStreaming list = " + dataSource.parallelStreamScalar().toList());
    }

    void tryMatrixSpliterator(SeekableByteChannel fileChannel, HdfDataFile hdfDataFile, HdfDataset dataSet) throws IOException {
        TypedDataSource<BigDecimal> dataSource = new TypedDataSource<>(fileChannel, hdfDataFile, dataSet, BigDecimal.class);
        BigDecimal[][] allData = dataSource.readMatrix();
        System.out.println("Matrix readAll() = ");
        for (BigDecimal[] allDatum : allData) {
            for (BigDecimal bigDecimal : allDatum) {
                System.out.print(bigDecimal.setScale(2, RoundingMode.HALF_UP) + " ");
            }
            System.out.println();
        }
    }

    void display4DData(SeekableByteChannel fileChannel, HdfDataFile hdfDataFile, HdfDataset dataSet) throws IOException {
        TypedDataSource<Integer> dataSource = new TypedDataSource<>(fileChannel, hdfDataFile, dataSet, Integer.class);
        int[] shape = dataSource.getShape();
        // Slice the 4D data at index 0 of the last axis, yielding a 3D cube.
        Integer[][][] step0 = (Integer[][][]) FlattenedArrayUtils.sliceStream(
                dataSource.streamFlattened(), dataSource.getShape(),
                new int[][]{{}, {}, {}, {0}}, Integer.class);
        System.out.println("Step 0:");
        for (int x = 0; x < shape[0]; x++) {
            for (int y = 0; y < shape[1]; y++) {
                for (int z = 0; z < shape[2]; z++) {
                    Integer value = step0[x][y][z];
                    System.out.printf("(%d %d %d) %s%n", x, y, z, value);
                }
            }
        }
    }
}
Output (example):
Scalar dataset name = /scalar
Scalar readAll stats = IntSummaryStatistics{count=1, sum=42, min=42, average=42.000000, max=42}
Scalar streaming list = [42]
Scalar parallelStreaming list = [42]
Matrix readAll() =
1.00 2.00
3.00 4.00
Step 0:
(0 0 0) 0
(0 0 1) 1
...
Note: Use your own file paths for scalar.h5, weatherdata.h5, and tictactoe_4d_state.h5.
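Beyond sliceStream, the flattened stream composes with ordinary Java stream operations. A sketch counting occupied cells in the 4D tic-tac-toe data, placed inside the last try block of run() (this uses plain java.util.stream filtering, not the library's coordinate-list filters):
// Sketch: filter the flattened 4D data with standard stream operations.
long occupied = new TypedDataSource<>(channel, reader, dataSet, Integer.class)
        .streamFlattened()
        .filter(v -> v != 0)
        .count();
System.out.println("Occupied cells: " + occupied);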
This example reads datasets for various datatypes (fixed-point, float, time, string, etc.) from all_types_separate.h5, using native HDF5 types, Java types, and a custom compound converter.
package org.hdf5javalib.examples.read;

import org.hdf5javalib.dataclass.*;
import org.hdf5javalib.datatype.CompoundDatatype;
import org.hdf5javalib.hdfjava.HdfDataset;
import org.hdf5javalib.hdfjava.HdfFileReader;
import org.hdf5javalib.utils.HdfDisplayUtils;

import java.io.FileInputStream;
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.channels.FileChannel;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class SeparateTypesRead {
    private static final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(SeparateTypesRead.class);

    public static void main(String[] args) throws Exception {
        new SeparateTypesRead().run();
    }

    private void run() throws Exception {
        String filePath = Objects.requireNonNull(this.getClass().getResource("/all_types_separate.h5")).getFile();
        try (FileInputStream fis = new FileInputStream(filePath)) {
            FileChannel channel = fis.getChannel();
            HdfFileReader reader = new HdfFileReader(channel).readFile();

            try (HdfDataset dataSet = reader.getDataset("/fixed_point").orElseThrow()) {
                HdfDisplayUtils.displayScalarData(channel, dataSet, HdfFixedPoint.class, reader);
                HdfDisplayUtils.displayScalarData(channel, dataSet, Integer.class, reader);
                HdfDisplayUtils.displayScalarData(channel, dataSet, Long.class, reader);
                HdfDisplayUtils.displayScalarData(channel, dataSet, BigInteger.class, reader);
                HdfDisplayUtils.displayScalarData(channel, dataSet, BigDecimal.class, reader);
                HdfDisplayUtils.displayScalarData(channel, dataSet, String.class, reader);
            }

            try (HdfDataset dataSet = reader.getDataset("/float").orElseThrow()) {
                HdfDisplayUtils.displayScalarData(channel, dataSet, HdfFloatPoint.class, reader);
                HdfDisplayUtils.displayScalarData(channel, dataSet, Float.class, reader);
                HdfDisplayUtils.displayScalarData(channel, dataSet, Double.class, reader);
                HdfDisplayUtils.displayScalarData(channel, dataSet, String.class, reader);
            }

            try (HdfDataset dataSet = reader.getDataset("/compound").orElseThrow()) {
                HdfDisplayUtils.displayScalarData(channel, dataSet, HdfCompound.class, reader);
                // Register a converter so the compound can be read as CustomCompound.
                CompoundDatatype.addConverter(CustomCompound.class, (bytes, compoundDataType) -> {
                    Map<String, HdfCompoundMember> nameToMember = compoundDataType.getInstance(HdfCompound.class, bytes)
                            .getMembers()
                            .stream()
                            .collect(Collectors.toMap(m -> m.getDatatype().getName(), m -> m));
                    return CustomCompound.builder()
                            .name("Name")
                            .someShort(nameToMember.get("a").getInstance(Short.class))
                            .someDouble(nameToMember.get("b").getInstance(Double.class))
                            .build();
                });
                HdfDisplayUtils.displayScalarData(channel, dataSet, CustomCompound.class, reader);
            }

            log.info("Superblock: {}", reader.getSuperblock());
        }
    }

    public static class Compound {
        private Short a;
        private Double b;

        public Short getA() { return a; }
        public void setA(Short a) { this.a = a; }
        public Double getB() { return b; }
        public void setB(Double b) { this.b = b; }
    }

    public static class CustomCompound {
        private String name;
        private Short someShort;
        private Double someDouble;

        private CustomCompound() {}

        public static Builder builder() { return new Builder(); }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public Short getSomeShort() { return someShort; }
        public void setSomeShort(Short someShort) { this.someShort = someShort; }
        public Double getSomeDouble() { return someDouble; }
        public void setSomeDouble(Double someDouble) { this.someDouble = someDouble; }

        public static class Builder {
            private final CustomCompound instance = new CustomCompound();

            public Builder name(String name) {
                instance.setName(name);
                return this;
            }

            public Builder someShort(Short someShort) {
                instance.setSomeShort(someShort);
                return this;
            }

            public Builder someDouble(Double someDouble) {
                instance.setSomeDouble(someDouble);
                return this;
            }

            public CustomCompound build() { return instance; }
        }
    }
}
Output (example):
HdfFixedPoint: 42
Integer: 42
Long: 42
BigInteger: 42
BigDecimal: 42
String: 42
HdfFloatPoint: 3.14
Float: 3.14
Double: 3.14
String: 3.14
HdfCompound: [a=10, b=20.5]
CustomCompound: Name=Name, someShort=10, someDouble=20.5
Note: Only a subset of datatypes is shown for brevity.
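Without registering a converter, compound members can also be walked generically using the member accessors from the example above. A sketch, reusing the example's channel, reader, and the /compound dataSet (TypedDataSource would need to be imported):
// Sketch: inspect compound members by name without a custom converter.
HdfCompound compound = new TypedDataSource<>(channel, reader, dataSet, HdfCompound.class).readScalar();
compound.getMembers().forEach(m ->
        System.out.println(m.getDatatype().getName() + " = " + m.getInstance(String.class)));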
This example reads variable-length datasets from vlen_types_example.h5, displaying them as HdfVariableLength, String, and Object.
package org.hdf5javalib.examples.read;

import org.hdf5javalib.dataclass.HdfVariableLength;
import org.hdf5javalib.hdfjava.HdfDataset;
import org.hdf5javalib.hdfjava.HdfFileReader;
import org.hdf5javalib.utils.HdfDisplayUtils;

import java.io.FileInputStream;
import java.nio.channels.FileChannel;
import java.util.Objects;

public class VLenTypesRead {
    private static final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(VLenTypesRead.class);

    public static void main(String[] args) throws Exception {
        new VLenTypesRead().run();
    }

    private void run() throws Exception {
        String filePath = Objects.requireNonNull(this.getClass().getResource("/vlen_types_example.h5")).getFile();
        try (FileInputStream fis = new FileInputStream(filePath)) {
            FileChannel channel = fis.getChannel();
            HdfFileReader reader = new HdfFileReader(channel).readFile();
            for (HdfDataset dataSet : reader.getDatasets()) {
                try (HdfDataset ds = dataSet) {
                    System.out.println("Dataset name: " + ds.getObjectName());
                    HdfDisplayUtils.displayScalarData(channel, ds, HdfVariableLength.class, reader);
                    HdfDisplayUtils.displayScalarData(channel, ds, String.class, reader);
                    HdfDisplayUtils.displayScalarData(channel, ds, Object.class, reader);
                }
            }
            log.info("Superblock: {}", reader.getSuperblock());
        }
    }
}
Output (example):
Dataset name: /vlen_dataset
HdfVariableLength: [1, 2, 3, 4]
String: [1, 2, 3, 4]
Object: [1, 2, 3, 4]
Note: Use FileInputStream or Files.newByteChannel for your own files.
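Variable-length values can also be fetched programmatically. A sketch, reusing the loop's channel, reader, and ds:
// Sketch: read the variable-length value as a generic Object.
Object values = new TypedDataSource<>(channel, reader, ds, Object.class).readScalar();
System.out.println(values);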
To run the examples:
1. Add the dependency above to your pom.xml.
2. Place HDF5 files in src/main/resources or reference them by file system path.
3. Compile and run an example:
mvn compile
java -cp target/classes org.hdf5javalib.examples.read.CompoundRead
The sample files used throughout this guide live in src/test/resources; mvn test runs the bundled tests. Use Files.newByteChannel or FileChannel.open to create a SeekableByteChannel for your own files.
Help improve Hdf5JavaLib by reporting issues at GitHub Issues.
Visit https://www.hdf5javalib.org for updates and resources.