OceanBase Database

KV - V4.3.5

© OceanBase 2026. All rights reserved

Cloud Service AgreementPrivacy PolicySecurity
Contact Us
Document Feedback
  1. Documentation Center
  2. OceanBase Database
  3. KV
  4. V4.3.5
iconOceanBase Database
KV - V 4.3.5
SQL
KV
  • V 4.3.5

Java development guide for OBKV-Table

Last Updated: 2025-12-17
What is on this page

  • Recommendations
  • Batch operations
  • Indexes
  • Partitioning strategy
  • Partitioning key design
  • Design the number of partitions


This topic outlines key considerations and best practices for developing with OBKV-Table. Before you begin, make sure you understand your business use cases and the product positioning of OBKV-Table.

Note

OBKV-Table supports only Java clients.

Recommendations

Batch operations

If the data in a batch is not logically related, avoid spanning multiple partitions with a single batch. The OBKV-Table client provides a partitioning interface. When using batch operations, calculate the partition ID first, then group and submit batches for each partition separately. This approach enables single-node transactions with minimal performance overhead and simplifies retry logic.
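The grouping step described above can be sketched as follows. The real client exposes a partition-calculation interface, so `partitionIdOf` here is a hypothetical stand-in (a simple hash), and rowkeys are plain strings for brevity:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchGrouping {
    // Hypothetical stand-in for the client's partition-calculation
    // interface: map a rowkey to a partition ID.
    static long partitionIdOf(String rowkey, int partitionCount) {
        return Math.floorMod(rowkey.hashCode(), partitionCount);
    }

    // Group rowkeys by partition ID so that each group can be submitted
    // as one batch against a single partition (a single-node transaction).
    static Map<Long, List<String>> groupByPartition(List<String> rowkeys, int partitionCount) {
        Map<Long, List<String>> groups = new HashMap<>();
        for (String key : rowkeys) {
            groups.computeIfAbsent(partitionIdOf(key, partitionCount), p -> new ArrayList<>())
                  .add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("order-1", "order-2", "order-3", "order-4");
        Map<Long, List<String>> groups = groupByPartition(keys, 8);
        // Each map entry would be submitted as one single-partition batch.
        int total = groups.values().stream().mapToInt(List::size).sum();
        System.out.println(total); // prints 4: every row is covered exactly once
    }
}
```

Submitting one batch per map entry, instead of one batch for all keys, is what keeps each server-side transaction on a single node.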

Indexes

Suggestions on using indexes:

  • Avoid global indexes unless necessary. Maintaining global indexes during insert, update, or delete operations on the primary table incurs considerable distributed transaction overhead.
  • Although local indexes are less costly to maintain than global indexes, keep their number under control. More indexes mean higher maintenance costs.

Partitioning strategy

Here we use key partitioning as an example.

Partitioning key design

The primary goal when designing partitioning keys in OBKV-Table is to ensure that requests for your business scenario do not cross partitions. Some guidelines for choosing partition keys:

  • The partitioning key is typically the primary key or a prefix of the primary key.

    • If the partitioning key is the primary key: For example, in an order scenario, you might set orderid as the primary key. If queries are always exact lookups by orderid, use orderid as the partitioning key for efficient access to all or part of the information for a specific order.
    • If the partitioning key is a primary key prefix: For example, in an order scenario, you might use a composite primary key of userid and orderid. If queries commonly scan by userid to retrieve all or some orders for a user, use userid as the partitioning key.
  • Queries in your business scenarios should always specify the partitioning key.
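As an illustration of the primary-key-prefix case, a MySQL-mode table definition might look like the following. The table and column names are invented for the example, and 97 is just a sample partition count:

```sql
-- Orders are scanned by userid, so userid (a prefix of the primary key)
-- is the partitioning key.
CREATE TABLE orders (
  userid  BIGINT NOT NULL,
  orderid BIGINT NOT NULL,
  amount  DECIMAL(10, 2),
  PRIMARY KEY (userid, orderid)
)
PARTITION BY KEY(userid) PARTITIONS 97;
```

With this layout, a query that supplies userid touches exactly one partition, while a query without it must fan out to all 97.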

Design the number of partitions

Why the number of partitions matters

  • Changing the number of partitions requires modifying the data model (the table schema or DDL) and redistributing data, which is costly.
  • Avoid having too few partitions: The database's balancing algorithm ensures that the difference in partition counts between any two nodes does not exceed one. For example, with 3 nodes and 7 partitions, the distribution could be <3,2,2>, resulting in a partition imbalance of (3-2)/3 = 33%. Assuming no data skew, the number of partitions is nearly proportional to both storage and access volume, and partition imbalance will also lead to uneven resource usage across nodes. Therefore, too few partitions is not recommended.
  • Avoid having too many partitions: You can think of each partition as a mini-database, with multiple mini-databases within a single database. Each mini-database manages its own storage and data access independently. This structure means that excessive mini-databases can prevent optimal sharing of storage space, and scan operations may span multiple databases, causing some performance loss. Batch operations may also be harder to optimize for maximum performance at the storage layer. So, having too many partitions is also not recommended.
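The 3-node, 7-partition imbalance figure above can be reproduced with a small calculation. The balancer is modeled here as a simple even spread, which is an assumption for illustration only:

```java
public class PartitionBalance {
    // Spread `partitions` across `nodes` as evenly as possible:
    // some nodes get ceil(p/n) partitions, the rest get floor(p/n).
    static int[] distribute(int partitions, int nodes) {
        int[] counts = new int[nodes];
        for (int p = 0; p < partitions; p++) {
            counts[p % nodes]++;
        }
        return counts;
    }

    // Imbalance = (max - min) / max, as in the 33% example above.
    static double imbalance(int[] counts) {
        int max = Integer.MIN_VALUE, min = Integer.MAX_VALUE;
        for (int c : counts) {
            max = Math.max(max, c);
            min = Math.min(min, c);
        }
        return (max - min) / (double) max;
    }

    public static void main(String[] args) {
        int[] counts = distribute(7, 3); // {3, 2, 2}
        System.out.printf("%.0f%%%n", imbalance(counts) * 100); // prints 33%
    }
}
```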

How partition count affects performance

Generally, the number of partitions does not significantly impact performance. However, for tasks such as major compaction, migration, or backup—which operate at the partition level—partition count does matter. Having too few partitions makes each partition very large, which can increase task execution time and require more resources.

How to choose partition count

  • For capacity planning, we recommend keeping the data size per partition per replica below 100 GB. When selecting a partition count, choose a prime number such as 23, 59, 97, 193, 389, or 997. For example, with three replicas and 24 TB of total storage, each replica holds 8 TB of data, so you need at least 8,000 GB / 100 GB = 80 partitions. Based on the recommended values, choose 97.
  • Changing the partition count requires DDL changes, which incur some overhead. Plan your partitioning strategy in advance to avoid frequent changes. For example, if your current storage is 24 TB but you expect it to grow to 50 TB within a year, design for 50 TB: roughly 16.7 TB per replica requires at least 167 partitions, so choose 193.
  • Do not use too many partitions; generally, do not exceed 997.
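The capacity arithmetic above can be captured in a small helper. `choosePartitions` and the decimal GB-per-TB conversion are assumptions for illustration, not a client API:

```java
public class PartitionPlanner {
    static final int[] RECOMMENDED = {23, 59, 97, 193, 389, 997};

    // Pick the smallest recommended prime that keeps each partition's
    // per-replica data below `maxGbPerPartition` (100 GB in the guideline).
    static int choosePartitions(double totalDataGb, int replicas, double maxGbPerPartition) {
        double perReplicaGb = totalDataGb / replicas;
        int minPartitions = (int) Math.ceil(perReplicaGb / maxGbPerPartition);
        for (int p : RECOMMENDED) {
            if (p >= minPartitions) {
                return p;
            }
        }
        return RECOMMENDED[RECOMMENDED.length - 1]; // cap at 997
    }

    public static void main(String[] args) {
        // 24 TB total, three replicas -> 8 TB per replica -> at least 80 partitions.
        System.out.println(choosePartitions(24_000, 3, 100)); // prints 97
    }
}
```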

How to detect data skew across partitions

Key partitioning currently relies on the MurmurHash algorithm, which hashes the partitioning key and assigns data to the appropriate partition. Except in extreme cases—such as 100 million rows with 10 million sharing the same partitioning key—data skew is typically not an issue. For example, with 100 million rows and only a few thousand per partitioning key, hashing generally distributes data evenly. With large datasets, the randomness of the hash function ensures data is distributed fairly evenly across partitions.
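A quick way to sanity-check skew is to hash a sample of keys and build a per-partition histogram. `String.hashCode()` stands in for the server's MurmurHash here, so the exact distribution differs, but the shape of the check is the same:

```java
import java.util.HashMap;
import java.util.Map;

public class SkewCheck {
    // Stand-in hash: the server uses MurmurHash for key partitioning;
    // String.hashCode() is used here only for illustration.
    static int partitionOf(String key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Count sample rows per partition to spot skew.
    static Map<Integer, Integer> histogram(int rows, int partitions) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = 0; i < rows; i++) {
            counts.merge(partitionOf("user-" + i, partitions), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> counts = histogram(100_000, 8);
        // With many distinct keys, every partition receives a sizeable share.
        System.out.println(counts.size()); // prints 8
    }
}
```

If one bucket in such a histogram dwarfs the others, the partitioning key has too few distinct values (or a few hot values) and should be reconsidered.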
