Standalone Cloudera-compatible Hive Metastore for local development, implemented in Kotlin and aligned to the Cloudera Spark stack.
Add the JUnit 5 dependency:
dependencies {
    testImplementation("org.openprojectx.cloudera.hms:junit5:<version>")
}

Use ClouderaHiveMetastoreTest.kt:
@ClouderaHiveMetastoreTest(
    postgresImage = "postgres:14",
    schemaSqlPath = "/hive-schema-3.1.3000.postgres.sql",
    logLevel = "DEBUG",
)
class MyMetastoreTest

Supported annotation attributes:
- postgresImage: overrides the PostgreSQL Testcontainers image
- schemaSqlPath: accepts either a filesystem path or a classpath resource path
- logLevel: configures the generated HMS server Log4j 2 root level
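
The dual behaviour of schemaSqlPath can be pictured with a small sketch. This is not the junit5 module's code: `openSchemaSql` is a hypothetical helper, and the lookup order (filesystem first, then classpath) is an assumption made for illustration.

```kotlin
import java.io.InputStream
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper, not part of the junit5 module's documented API.
// Assumption: an existing file wins over a classpath resource with the same name.
fun openSchemaSql(schemaSqlPath: String): InputStream? {
    val file: Path = Paths.get(schemaSqlPath)
    if (Files.isRegularFile(file)) {
        return Files.newInputStream(file)
    }
    // ClassLoader lookups take no leading slash, so strip it before falling back.
    // Returns null when neither a file nor a resource matches.
    return Thread.currentThread().contextClassLoader
        .getResourceAsStream(schemaSqlPath.removePrefix("/"))
}
```

Under these assumptions, a value such as "/hive-schema-3.1.3000.postgres.sql" would resolve to a classpath resource unless a file with that absolute path happens to exist.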
Add the Testcontainers dependency:
dependencies {
    testImplementation("org.openprojectx.cloudera.hms:testcontainers:<version>")
}

Use the default image:
val metastore = ClouderaHiveMetastoreContainer()
    .withDatabaseName("metastore_db")
    .withDatabaseUser("hive")
    .withDatabasePassword("hive-password")

Use a custom image explicitly:
val metastore = ClouderaHiveMetastoreContainer.withImage("my-registry/cloudera-hms:test")
    .withDatabaseName("metastore_db")
    .withDatabaseUser("hive")
    .withDatabasePassword("hive-password")

Or set CLOUDERA_HMS_TEST_IMAGE and keep using the default constructor.
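
To illustrate how an environment override like CLOUDERA_HMS_TEST_IMAGE typically interacts with a built-in default, here is a minimal sketch. The real lookup lives in ClouderaHiveMetastoreContainer.kt and may differ; `resolveTestImage` and its `fallbackImage` parameter are hypothetical names, and treating a blank value as unset is an assumption.

```kotlin
// Hypothetical sketch; the actual resolution logic is in ClouderaHiveMetastoreContainer.kt.
val testImageEnvName = "CLOUDERA_HMS_TEST_IMAGE"

// Returns the image named by the environment variable, or fallbackImage when the
// variable is unset or blank (the blank handling is an assumption).
fun resolveTestImage(
    fallbackImage: String,
    env: Map<String, String> = System.getenv(),
): String {
    val fromEnv = env[testImageEnvName]?.trim()
    return if (fromEnv.isNullOrEmpty()) fallbackImage else fromEnv
}
```

Passing the environment as a map keeps the function easy to exercise in a unit test without mutating the real process environment.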
Typical JUnit 5 and Testcontainers usage:
import org.junit.jupiter.api.Test
import org.testcontainers.junit.jupiter.Container
import org.testcontainers.junit.jupiter.Testcontainers

@Testcontainers
class MyMetastoreContainerTest {
    @Container
    private val metastore = ClouderaHiveMetastoreContainer()
        .withDatabaseName("metastore_db")
        .withDatabaseUser("hive")
        .withDatabasePassword("hive-password")

    @Test
    fun testMetastore() {
        val thriftUri = metastore.thriftUri()
        // Create a HiveMetaStoreClient or your application client here.
    }
}

Start a local PostgreSQL if you want to run the metastore against a local database:
docker compose up -d

Build the shaded runtime:
GRADLE_USER_HOME=/data/.gradle ./gradlew :runtime:shadowJar

Build the container image:
CLOUDERA_HMS_BASE_IMAGE=your-registry/postgres-jdk17:tag GRADLE_USER_HOME=/data/.gradle ./gradlew :image:jibDockerBuild

For contributor-oriented build commands, version alignment, and verification notes, see CONTRIBUTING.md.
- core: metastore runtime, PostgreSQL schema bootstrap, and configuration helpers
- runtime: shaded standalone runtime jar for launching the metastore
- image: Jib-based container image assembly for a combined PostgreSQL plus Hive metastore runtime
- junit5: annotation-driven JUnit 5 support that provisions PostgreSQL and starts the metastore for tests
- hms-tck-core: reusable Java 11-compatible Hive metastore TCK contract and assertions
- hms-tck: Java 17 TCK implementations for the in-process core and shaded runtime executions
- testcontainers: JDK 11 Testcontainers wrapper for the built metastore image
- spark: Spark-facing TCKs that validate Spark SQL and Iceberg against the metastore
The runtime expects these JVM system properties:
- cloudera.hms.host
- cloudera.hms.port
- cloudera.hms.warehouse.dir
- cloudera.hms.jdbc.url
- cloudera.hms.jdbc.user
- cloudera.hms.jdbc.password
Optional properties:
- cloudera.hms.jdbc.driver
- cloudera.hms.initialize-schema
- cloudera.hms.schema.resource
- cloudera.hms.schema.file
cloudera.hms.schema.file takes precedence when you want to supply your own schema SQL.
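
When launching the shaded runtime from a script or test harness, it can help to check the required properties before building the `java` command line. The following is a minimal sketch under stated assumptions: the helper names are hypothetical and not part of the project, and only the six required keys come from the list above.

```kotlin
// Required keys, copied from the list of required JVM system properties above.
val requiredHmsProperties = listOf(
    "cloudera.hms.host",
    "cloudera.hms.port",
    "cloudera.hms.warehouse.dir",
    "cloudera.hms.jdbc.url",
    "cloudera.hms.jdbc.user",
    "cloudera.hms.jdbc.password",
)

// Hypothetical helper: returns the required keys that are absent or blank.
fun missingHmsProperties(settings: Map<String, String>): List<String> =
    requiredHmsProperties.filter { settings[it].isNullOrBlank() }

// Hypothetical helper: renders settings as -Dkey=value arguments,
// failing fast when a required property is missing.
fun toJvmArguments(settings: Map<String, String>): List<String> {
    val missing = missingHmsProperties(settings)
    require(missing.isEmpty()) { "Missing required properties: $missing" }
    return settings.map { (key, value) -> "-D$key=$value" }
}
```

The resulting list can be placed before `-jar` when invoking the runtime jar; optional properties such as cloudera.hms.schema.file can be added to the same map.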
When starting the server through the Kotlin API, ClouderaHiveMetastoreConfig.kt also exposes:
- extraConfiguration for arbitrary Hive or Hadoop properties that need to exist inside the metastore JVM
- logLevel for generated HMS server logging
- logConfigFile for a complete custom Log4j 2 properties file
Default configuration is defined in ClouderaHiveMetastoreConfig.kt. Server bootstrap happens in HiveMetastoreServerMain.kt.
The junit5 module provides ClouderaHiveMetastoreTest.kt, which starts PostgreSQL plus a metastore process for a test class.
The testcontainers module wraps the built metastore image for integration tests on JDK 11+. The main entry point is ClouderaHiveMetastoreContainer.kt. The module also reuses the shared TCK from hms-tck-core for its own integration coverage.
The image module builds a runnable container image with Jib. The image expects a base image that already includes:
- PostgreSQL
- JDK 17
- the standard PostgreSQL container entrypoint at /usr/local/bin/docker-entrypoint.sh
Build configuration is environment-variable driven:
- CLOUDERA_HMS_BASE_IMAGE
- CLOUDERA_HMS_IMAGE
- CLOUDERA_HMS_IMAGE_TAGS
Runtime configuration is environment-variable driven. The image supports:
- POSTGRES_DB
- POSTGRES_USER
- POSTGRES_PASSWORD
- POSTGRES_PORT
- HMS_HOST
- HMS_PORT
- HMS_WAREHOUSE_DIR
- HMS_JDBC_URL
- HMS_JDBC_USER
- HMS_JDBC_PASSWORD
- HMS_JDBC_DRIVER
- HMS_INITIALIZE_SCHEMA
- HMS_SCHEMA_RESOURCE
- HMS_SCHEMA_FILE
- HMS_EXTRA_CONFIG_FILE
- HMS_EXTRA_CONF
- HMS_LOG_LEVEL
- JAVA_OPTS
Extra Hive or Hadoop properties can be passed either as newline-delimited HMS_EXTRA_CONF entries or as individual HMS_CONF_* environment variables, where _ maps to . and __ maps to -.
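
The HMS_CONF_* naming rule can be sketched as a small function. This is an illustration only, not the image's actual entrypoint logic: the helper names are hypothetical, and it assumes the letter case after the prefix is preserved, since the rule above only describes the `_` and `__` substitutions.

```kotlin
val hmsConfPrefix = "HMS_CONF_"

// Hypothetical helper: converts one HMS_CONF_* variable name into a property key,
// or returns null when the prefix is absent. Case is left unchanged (assumption).
fun hmsConfNameToKey(name: String): String? {
    if (!name.startsWith(hmsConfPrefix)) return null
    val raw = name.removePrefix(hmsConfPrefix)
    // Replace double underscores first so they are not consumed as two dots.
    return raw.replace("__", "-").replace("_", ".")
}

// Hypothetical helper: collects every HMS_CONF_* entry from an environment map.
fun collectHmsConf(env: Map<String, String>): Map<String, String> =
    env.mapNotNull { (name, value) -> hmsConfNameToKey(name)?.let { it to value } }.toMap()
```

For example, under these assumptions `HMS_CONF_hive_metastore_uris` would yield the key `hive.metastore.uris`, and a made-up name such as `HMS_CONF_example_some__flag` would yield `example.some-flag`.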