OceanBase Database

KV - V4.3.5

Data model

Last Updated: 2025-09-08 11:41:13
What is on this page
HBase model
Table
Row
Column family
Column qualifier
Timestamp
Cell
OBKV-HBase data model
Database object types
Implementation of the OBKV-HBase model
Application scenarios of the HBase model
Schemaless
Partial update
Wide table with sparse columns
Multi-version
TTL


HBase model

The HBase model differs greatly from the relational model. The following figure presents the HBase model as a table to make it easier to understand.

[Figure: hbase-data-model]

Table

Tables in HBase are similar to those in relational databases.

Row

Rows in HBase are similar to those in relational databases, but the two data models differ in the following ways:

  • In relational databases, a rowkey corresponds to a row of physical data, and all rows share the same schema. HBase is a schemaless database where the schemas of rows can be independent of each other.
  • HBase supports multiple data versions. You can specify the data version of a row when you query data. In the following example, the K column stores the primary key of an HBase table. A row whose primary key is defaultKey1 in the HBase model is stored as five rows in OBKV. The row has four columns and data in the defaultColumn3 column has two versions.
+--------------+----------------+----------------+--------------+
| K            | Q              | T              | V            |
+--------------+----------------+----------------+--------------+
| defaultKey1  | defaultColumn3 | -1715698988310 | defaultValue |
| defaultKey1  | defaultColumn6 | -1715698988310 | defaultValue |
| defaultKey1  | defaultColumn9 | -1715698988310 | defaultValue |
| defaultKey1  | defaultColumn1 | -1715698965577 | defaultValue |
| defaultKey1  | defaultColumn3 | -1715698965577 | defaultValue |
+--------------+----------------+----------------+--------------+
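
The flattening described above can be sketched in Python as follows. This is a minimal in-memory illustration, not OBKV code: the `flatten_row` helper and the dictionary layout are assumptions made for this example, and the result is sorted by (K, Q, T), so its order differs from the display order in the preceding table.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory form of one HBase row:
# column qualifier -> {timestamp in milliseconds: value}.
HBaseRow = Dict[bytes, Dict[int, bytes]]


def flatten_row(rowkey: bytes, row: HBaseRow) -> List[Tuple[bytes, bytes, int, bytes]]:
    """Return one (K, Q, T, V) tuple per cell version, with T negated."""
    cells = [
        (rowkey, qualifier, -ts, value)
        for qualifier, versions in row.items()
        for ts, value in versions.items()
    ]
    # Order the tuples by (K, Q, T); a negated T puts newer versions first.
    return sorted(cells, key=lambda c: (c[0], c[1], c[2]))


row = {
    b"defaultColumn1": {1715698965577: b"defaultValue"},
    b"defaultColumn3": {
        1715698988310: b"defaultValue",
        1715698965577: b"defaultValue",
    },
    b"defaultColumn6": {1715698988310: b"defaultValue"},
    b"defaultColumn9": {1715698988310: b"defaultValue"},
}

stored = flatten_row(b"defaultKey1", row)
for k, q, t, v in stored:
    print(k.decode(), q.decode(), t, v.decode())
```

One logical row with four columns, one of which has two versions, yields five stored tuples.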

Column family

Relational databases have no concept of a column family, although it resembles vertical partitioning in some relational databases. A column family is a collection of columns whose values are physically stored together.
The HBase model supports very wide tables. Grouping values that are frequently accessed together into one column family can significantly improve performance.

Column qualifier

A relational database has a fixed schema, which means that all rows have the same number of columns and the columns in the same position of rows are of the same data type. HBase is a schemaless database where each row can have a different number of columns and different column types.
In the following example, two HBase rows are inserted into OBKV. The defaultKey1 row has three columns and the defaultKey11 row has two columns.

+--------------+----------------+----------------+--------------+
| K            | Q              | T              | V            |
+--------------+----------------+----------------+--------------+
| defaultKey1  | defaultColumn3 | -1715698988310 | defaultValue |
| defaultKey1  | defaultColumn6 | -1715698988310 | defaultValue |
| defaultKey1  | defaultColumn9 | -1715698988310 | defaultValue |
| defaultKey11 | defaultColumn1 | -1715698965577 | defaultValue |
| defaultKey11 | defaultColumn2 | -1715698965577 | defaultValue |
+--------------+----------------+----------------+--------------+
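
As a rough illustration of the example above, the following Python sketch groups the stored (K, Q, T, V) rows by rowkey and shows that each HBase row carries its own set of column qualifiers. The `qualifiers_by_rowkey` helper is a name introduced for this example only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Cell = Tuple[bytes, bytes, int, bytes]  # (K, Q, T, V)

# The five stored rows from the example above.
stored_rows = [
    (b"defaultKey1", b"defaultColumn3", -1715698988310, b"defaultValue"),
    (b"defaultKey1", b"defaultColumn6", -1715698988310, b"defaultValue"),
    (b"defaultKey1", b"defaultColumn9", -1715698988310, b"defaultValue"),
    (b"defaultKey11", b"defaultColumn1", -1715698965577, b"defaultValue"),
    (b"defaultKey11", b"defaultColumn2", -1715698965577, b"defaultValue"),
]


def qualifiers_by_rowkey(rows: Iterable[Cell]) -> Dict[bytes, Set[bytes]]:
    """Collect the distinct column qualifiers present for each rowkey."""
    result: Dict[bytes, Set[bytes]] = defaultdict(set)
    for k, q, _t, _v in rows:
        result[k].add(q)
    return dict(result)


columns = qualifiers_by_rowkey(stored_rows)
for key in sorted(columns):
    print(key.decode(), sorted(q.decode() for q in columns[key]))
```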

Timestamp

Each value stored in HBase has a version number, which is identified by a timestamp. You can explicitly specify a timestamp when you write data. If you do not specify a timestamp, the timestamp when data is inserted into the storage engine is used by default.

In the following example, the T column indicates the timestamp. Because the most recently inserted data must sort first, the model layer of OBKV-HBase stores the negated value of each timestamp.

+-------------+----------------+----------------+--------------+
| K           | Q              | T              | V            |
+-------------+----------------+----------------+--------------+
| defaultKey1 | defaultColumn1 | -1715672642104 | defaultValue |
+-------------+----------------+----------------+--------------+
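
The effect of storing negated timestamps can be sketched in Python as follows. The helper names are assumptions made for this example, and the client clock is used as a stand-in for the default timestamp, which in OBKV-HBase is the time at which data is inserted into the storage engine.

```python
import time
from typing import Optional


def to_stored_t(timestamp_ms: Optional[int] = None) -> int:
    """Return the value kept in the T column: the negated timestamp."""
    if timestamp_ms is None:
        # Stand-in for the default timestamp; an assumption of this sketch.
        timestamp_ms = int(time.time() * 1000)
    return -timestamp_ms


def to_hbase_timestamp(stored_t: int) -> int:
    """Recover the HBase timestamp from a stored T value."""
    return -stored_t


versions_ms = [1715698965577, 1715698988310, 1715672642104]

# An ascending sort on the stored values lists the newest version first.
stored = sorted(to_stored_t(ts) for ts in versions_ms)
newest_first = [to_hbase_timestamp(t) for t in stored]
print(newest_first)
```

With negated values, a plain ascending scan returns the latest version of a column first, without a descending sort.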

Cell

In HBase, a value of a specific version corresponds to a cell. A cell can be uniquely identified by a tuple of table, row, column family, column qualifier, and timestamp.

As shown in the preceding example, each row stored in OBKV is a cell in HBase.

OBKV-HBase data model

Database object types

An OBKV cluster is actually an OceanBase cluster for a MySQL tenant.

Database objects in OceanBase Database's MySQL mode include tables, views, indexes, and partitions.

In MySQL mode, a user can connect to a database and access and manage database objects after being granted the required privileges. A database is a collection of database objects used for privilege management and namespace-based resource isolation.

The following table briefly describes the database objects in OceanBase Database's MySQL mode.

For more information about database objects, see Overview of database objects.

| Object type | Description |
|-------------|-------------|
| Database | A collection of database objects used for privilege management and namespace-based resource isolation. |
| Table | The most basic storage unit in a database. A table is organized into rows and columns. |
| Index | An index sorts data of one or more columns. Indexes provide fast access to specified rows in a table. For example, you can search for information about an employee by last name based on the created index. An index helps you improve query efficiency. |
| Partition | OceanBase Database can split the data of a table into different groups based on some rules. Data in the same group is stored in the same physical area. A table whose data is split into different groups is called a partitioned table. Tables in OceanBase Database are horizontally partitioned. Each partition contains some data rows. Partitioning methods are classified into HASH partitioning, RANGE partitioning, LIST partitioning, and others based on the mapping relationships between data and partitions. Each partition can be divided into several subpartitions from different dimensions. For example, you can partition the transaction table into several HASH partitions based on the user ID. Then, you can partition each HASH partition into several RANGE partitions based on the transaction time. |
| Table group | A collection of tables. It is a logical concept. By default, data is randomly distributed to the tables in a table group. By defining a table group, you can control the physical closeness among a group of tables. |

Implementation of the OBKV-HBase model

This section takes the following table creation statement as an example to describe the implementation of the OBKV-HBase data model.

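-- The table group represents the HBase table htable1.
CREATE TABLEGROUP htable1;

-- The table htable1$family1 stores the column family family1 of htable1.
CREATE TABLE htable1$family1 (
  K varbinary(1024),              -- rowkey
  Q varbinary(256),               -- column qualifier
  T bigint,                       -- timestamp
  V varbinary(1048576) NOT NULL,  -- value, up to 1 MB
  PRIMARY KEY (K, Q, T))
TABLEGROUP = htable1
PARTITION BY KEY(K) PARTITIONS 97;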

OBKV-HBase uses the following mapping strategies to implement the HBase model:

  • OBKV-HBase maps a table in HBase to a table group in OceanBase Database.
  • OBKV-HBase maps a column family in HBase to a normal table in OceanBase Database.

Assume that HBase has a table named htable1 with a column family named family1. You need to create a table group named htable1 and a normal table named htable1$family1 in OceanBase Database, and bind the table with the table group. A normal table is named in the format of TableGroupName$FamilyName.
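
The naming rule can be sketched in Python as follows. Both helpers are hypothetical names introduced for illustration. Rejecting names that already contain a dollar sign is an assumption of this sketch that keeps the mapping reversible; it is not a documented OBKV-HBase rule.

```python
from typing import Tuple


def physical_table_name(hbase_table: str, family: str) -> str:
    """Build the OceanBase table name in the TableGroupName$FamilyName format."""
    if not hbase_table or not family:
        raise ValueError("table name and column family name must be non-empty")
    if "$" in hbase_table or "$" in family:
        # Assumption of this sketch: avoid ambiguous names.
        raise ValueError("names must not contain '$'")
    return f"{hbase_table}${family}"


def split_physical_table_name(name: str) -> Tuple[str, str]:
    """Split TableGroupName$FamilyName back into (table group, column family)."""
    table, sep, family = name.partition("$")
    if not sep or not table or not family:
        raise ValueError(f"not in TableGroupName$FamilyName format: {name!r}")
    return table, family


print(physical_table_name("htable1", "family1"))
print(split_physical_table_name("htable1$family1"))
```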

Notice

At present, OBKV-HBase does not support multiple column families. Each table group can be bound to only one table.

Although OBKV-HBase supports the schemaless feature of HBase, any data model implemented on OceanBase Database has a schema. The preceding table creation statement defines the physical storage model of OBKV-HBase.

In the preceding statement:

  • htable1 specifies the name of the HBase table. You can use a custom name.
  • family1 specifies the name of the column family. You can use a custom name.

    Note

    The table name and column family name are joined by using a dollar sign ($) as the table name in OceanBase Database.

  • The K column stores the rowkeys of the HBase table.
  • The Q column stores column qualifiers.
  • The T column stores timestamps, which are the number of milliseconds since 1970-01-01 UTC.
  • The V column stores values of the varbinary type with a maximum length of 1 MB. If 1 MB is insufficient, you can use the longblob type instead.
  • The K, Q, and T columns comprise a composite primary key to identify a cell in the HBase model.

Notice

The column names are fixed to K, Q, T, and V, and cannot be modified.

In the relational table created by using the preceding statement in OceanBase Database, data from multiple columns of a single row in the HBase table is stored in adjacent rows. Each of the rows actually stores a cell, which is a <row, column family, column qualifier, timestamp, value> tuple, in the HBase table.

Application scenarios of the HBase model

The HBase model offers advantages over the relational model in some scenarios.

Schemaless

Schemas must be defined in conventional relational databases. Although you can change a schema after you define it, such a change often requires rebuilding data, which can place a heavy load on the database.

HBase is a schemaless database that stores the primary key, column names, and column values. It does not require each row to have the same column definitions.

For example, a business system whose upstream is Hive usually chooses HBase to store the data generated by offline computing in Hive. Such a system does not choose a relational database such as MySQL because the schema in a Hive task changes dynamically, and it is impractical to perform DDL operations on the MySQL database every day to keep up with the schema changes.

Partial update

To update data in a conventional relational database, you need to query existing records in the database first. In other words, each update involves at least one read operation and one write operation on the database.

HBase splits one row into multiple cells for independent storage and provides an API with the semantics of PUT (overwrite). This way, you can update a single cell without the need to query data.

For example, in a business system for financial risk control, some characteristics of some users are modified after big data computing every day. You can use HBase in this scenario to achieve optimal update performance because only the updated cells need to be put into HBase. In addition, a database built on the LSM-tree architecture is write-friendly.
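
The partial update pattern can be sketched with a small in-memory store in Python. The `CellStore` class is a hypothetical illustration of PUT (overwrite) semantics on cells keyed by (K, Q, T); it is not the OBKV-HBase client API.

```python
from typing import Dict, Optional, Tuple


class CellStore:
    """Toy store that keeps one value per (K, Q, T) key, with T negated."""

    def __init__(self) -> None:
        self._cells: Dict[Tuple[bytes, bytes, int], bytes] = {}

    def put(self, rowkey: bytes, qualifier: bytes, timestamp_ms: int, value: bytes) -> None:
        # Blind write: the previous value is never read before writing.
        self._cells[(rowkey, qualifier, -timestamp_ms)] = value

    def get_latest(self, rowkey: bytes, qualifier: bytes) -> Optional[bytes]:
        candidates = [
            (t, v)
            for (k, q, t), v in self._cells.items()
            if k == rowkey and q == qualifier
        ]
        if not candidates:
            return None
        # The smallest stored T is the newest version.
        return min(candidates, key=lambda c: c[0])[1]


store = CellStore()
# Initial load of two characteristics for one user.
store.put(b"user1", b"age_band", 1000, b"30-39")
store.put(b"user1", b"risk_score", 1000, b"0.20")
# The daily job updates only the characteristic that changed.
store.put(b"user1", b"risk_score", 2000, b"0.35")

print(store.get_latest(b"user1", b"age_band"))
print(store.get_latest(b"user1", b"risk_score"))
```

Only the changed cell is written on the second day; the untouched cell keeps its earlier value without being read or rewritten.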

Wide table with sparse columns

Consider an app that profiles users in many dimensions, which can change frequently. A specific user of this app usually matches only a few profiling dimensions. Described in the relational model, this is a typical case of a wide table with sparse columns. Conventional relational databases are strongly schema-oriented and cannot efficiently handle wide tables with sparse columns. For example, the database needs to maintain many column values in each row and even has to maintain an empty mark for a column without a value.

HBase is suitable for wide tables with sparse columns. HBase stores only non-empty columns and is not limited by a schema, contributing to high efficiency in storage, writes, and queries.
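
The storage difference can be illustrated with simple arithmetic in Python. The profile data and the figure of 1,000 profiling dimensions are made up for this sketch, and the model ignores per-cell overhead such as the repeated rowkey and qualifier.

```python
from typing import Dict


def relational_slots(num_rows: int, num_columns: int) -> int:
    """Column slots a fixed-schema table maintains, including empty ones."""
    return num_rows * num_columns


def cells_stored(profiles: Dict[str, Dict[str, str]]) -> int:
    """Cells the HBase model stores: one per non-empty column."""
    return sum(len(columns) for columns in profiles.values())


# Hypothetical profiles: each user matches only a few dimensions.
profiles = {
    "user1": {"likes_sports": "1", "city_tier": "1"},
    "user2": {"owns_car": "1"},
}
all_dimensions = 1000  # hypothetical total number of profiling dimensions

print(relational_slots(len(profiles), all_dimensions))  # 2000
print(cells_stored(profiles))  # 3
```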

Multi-version

Multi-version support is useful in scenarios such as risk control, where several recent login or consumption records of a user are often required. Business personnel usually need to query multiple versions of a user's records to assist decision making, while keeping only a certain number of versions in the database. In HBase, they only need to specify the number of versions when they create the table and when they query data. In a relational database, however, they need to redesign the schema, for example by adding a timestamp column, to store multiple versions of data, and keeping only a fixed number of versions is difficult to maintain.
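
Both operations, reading the most recent versions and keeping only a fixed number of them, can be sketched in Python over cells keyed by (K, Q, T) with a negated T. The function names and sample data are assumptions made for this example and do not reflect how OBKV-HBase implements version limits.

```python
from typing import Dict, List, Tuple

CellKey = Tuple[bytes, bytes, int]  # (K, Q, T) with T = negated timestamp


def latest_versions(cells: Dict[CellKey, bytes], rowkey: bytes,
                    qualifier: bytes, max_versions: int) -> List[Tuple[int, bytes]]:
    """Return up to max_versions (timestamp, value) pairs, newest first."""
    stored_ts = sorted(t for (k, q, t) in cells if k == rowkey and q == qualifier)
    return [(-t, cells[(rowkey, qualifier, t)]) for t in stored_ts[:max_versions]]


def trim_versions(cells: Dict[CellKey, bytes], max_versions: int) -> int:
    """Delete all but the newest max_versions of each column; return the count removed."""
    by_column: Dict[Tuple[bytes, bytes], List[int]] = {}
    for (k, q, t) in cells:
        by_column.setdefault((k, q), []).append(t)
    removed = 0
    for (k, q), stored_ts in by_column.items():
        # Ascending stored T means newest first, so the tail holds the oldest.
        for t in sorted(stored_ts)[max_versions:]:
            del cells[(k, q, t)]
            removed += 1
    return removed


# Four login records of one user, written at timestamps 1000 to 4000.
cells = {
    (b"user1", b"login", -1000): b"a",
    (b"user1", b"login", -2000): b"b",
    (b"user1", b"login", -3000): b"c",
    (b"user1", b"login", -4000): b"d",
}

print(latest_versions(cells, b"user1", b"login", 2))
print(trim_versions(cells, 3), len(cells))
```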

TTL

HBase supports deleting expired data based on table-level or cell-level time-to-live (TTL). It provides high resource utilization and deletion efficiency. In a relational database, you can implement TTL through an external component. However, expired data is invisible to business personnel, the impact of deleting expired data on online business is uncontrollable, and the external component needs to consume resources of the business system and makes the business architecture more complex.
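
The semantics of table-level TTL can be sketched in Python as follows. The `expire_cells` function is a hypothetical illustration of which cells count as expired. It says nothing about how OBKV-HBase schedules or executes the deletion.

```python
from typing import Dict, Tuple

CellKey = Tuple[bytes, bytes, int]  # (K, Q, T) with T = negated timestamp


def expire_cells(cells: Dict[CellKey, bytes], ttl_ms: int, now_ms: int) -> int:
    """Delete cells older than ttl_ms at time now_ms; return the count removed."""
    expired = [
        key for key in cells
        if now_ms - (-key[2]) > ttl_ms  # -key[2] recovers the write timestamp
    ]
    for key in expired:
        del cells[key]
    return len(expired)


ttl_cells = {
    (b"user1", b"login", -1000): b"old",
    (b"user1", b"login", -5000): b"recent",
}

# At time 6000 with a TTL of 2000 ms, only the cell written at 1000 has expired.
print(expire_cells(ttl_cells, ttl_ms=2000, now_ms=6000))
print(sorted(ttl_cells))
```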
