Data processing tools serve as the bridge between raw data and OceanBase Database. This topic introduces some of the mainstream tools.
OceanBase Developer Center (ODC)
OceanBase Developer Center (ODC) is a graphical database development tool and a collaborative platform for data development and production change management. ODC is available in two editions: desktop and web. The desktop edition focuses on database development capabilities and supports Windows, macOS, and Linux. It is lightweight and easy to deploy. The web edition provides the same development capabilities as the desktop edition and adds management and collaboration features, with a focus on the security, compliance, and efficiency of database changes.
For more information about how to use ODC, see OceanBase Developer Center (ODC).
ETL tools
Flink
Flink is an open-source framework for large-scale data processing and analysis. Flink CDC is a component that captures database change events on the Flink platform. Together, they provide a powerful real-time data processing and analysis solution.
For more information about how to use Flink, see Use Flink CDC to synchronize data from a MySQL database to OceanBase Database and Use Flink CDC to migrate data from OceanBase Database to a MySQL database.
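To illustrate what "capturing change events" means in practice, the following minimal Python sketch replays a stream of change events against a target table held in memory. It is a conceptual illustration only, not Flink CDC code: the event layout (`op`, `key`, `row`) and the `apply_change` helper are hypothetical names chosen for this example, and the dictionary stands in for the target database.

```python
# Conceptual sketch of change data capture (CDC) replay.
# The event format and helper below are assumptions made for illustration;
# they are not the Flink CDC event schema or API.

def apply_change(target, event):
    """Apply one change event to an in-memory target keyed by primary key."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # Inserts and updates both leave the latest row image in the target.
        target[key] = event["row"]
    elif op == "delete":
        # Deleting a key that is already absent is treated as a no-op.
        target.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


# A hypothetical stream of changes captured from a source table.
events = [
    {"op": "insert", "key": 1, "row": {"id": 1, "name": "a"}},
    {"op": "insert", "key": 2, "row": {"id": 2, "name": "b"}},
    {"op": "update", "key": 1, "row": {"id": 1, "name": "a2"}},
    {"op": "delete", "key": 2},
]

target = {}
for event in events:
    apply_change(target, event)

print(target)
```

Applying events in the order they were captured is what keeps the target consistent with the source; a real pipeline also has to handle ordering, retries, and schema changes, which this sketch leaves out.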
dbt (data build tool)
dbt (data build tool) is an open-source data transformation tool that materializes SQL SELECT statements as tables or views. With dbt-oceanbase, you can use dbt to analyze data in OceanBase Database.
For more information about how to use dbt, see Analyze data in OceanBase Database by using dbt.
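The idea of turning SQL into tables or views can be sketched in a few lines of Python. This is a conceptual illustration only, not dbt or dbt-oceanbase code: it uses the standard-library `sqlite3` module as a stand-in database, and the `materialize` helper, the `orders` table, and the `customer_totals` model name are hypothetical.

```python
import sqlite3


def materialize(conn, name, select_sql, materialized="view"):
    """Wrap a SELECT statement in CREATE VIEW/TABLE ... AS (illustrative helper)."""
    if materialized not in ("view", "table"):
        raise ValueError("materialized must be 'view' or 'table'")
    kind = materialized.upper()
    # Rebuild the object on every run so the result reflects the current SELECT.
    conn.execute(f"DROP {kind} IF EXISTS {name}")
    conn.execute(f"CREATE {kind} {name} AS {select_sql}")


# In-memory stand-in for the database that holds the raw data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 10.0), (2, "bob", 5.0), (3, "alice", 7.5)],
)

# The "model" is just a SELECT statement; the helper decides how to persist it.
model_sql = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
materialize(conn, "customer_totals", model_sql, materialized="view")

rows = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(rows)
```

Keeping the transformation as a plain SELECT and choosing the materialization separately is the design point this sketch tries to convey: the same query can back a view or a table without being rewritten.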
DataWorks
DataWorks is a data development and governance platform on Alibaba Cloud. It provides unified end-to-end processing capabilities for data warehouses, data lakes, and hybrid data lake and warehouse solutions based on big data engines such as MaxCompute, Hologres, EMR, AnalyticDB, and CDP. DataWorks offers various features, including data integration, data development, data modeling, and data analysis.
For more information about how to use DataWorks, see DataWorks.
AWS Glue
AWS Glue is a serverless data integration service that helps you easily discover, prepare, move, and integrate data from multiple sources to support analysis, machine learning, and application development. It provides comprehensive data integration capabilities, including data discovery, modern ETL, data cleaning and transformation, and centralized cataloging, all integrated into a single service. AWS Glue does not require you to manage infrastructure. It supports various workloads such as ETL, ELT, and streaming, and can scale on demand to handle any data size or type.
For more information about how to use AWS Glue, see Migrate data from OceanBase Database by using AWS Glue.
Tool selection recommendations
| Tool | Core advantages | Scenarios |
|---|---|---|
| ODC | A dedicated management tool for OceanBase Database, deeply integrated with the ecosystem | Development and O&M of OceanBase Database |
| Flink | Real-time stream processing and high-throughput batch processing | Stream data monitoring, real-time report generation |
| dbt | Standardized data modeling, traceable analysis processes | Modeling of analytical data warehouses (such as OceanBase Database) |
| DataWorks | Full-stack support for the Alibaba Cloud ecosystem, visual development | Cloud-native data lakes and offline analysis |
| AWS Glue | Serverless ETL, seamless integration with the AWS ecosystem | Building data lakes and batch processing on AWS Cloud |
