Add databricks support #737
Conversation
- Convert Snowflake TPC-H DDL to Databricks SQL syntax
- Add primary keys and foreign key constraints for all 8 tables
- Include detailed table and column descriptions based on the TPC-H specification
- Order statements so constraint dependencies are created first
- Support testing of the Databricks driver integration

Tables: REGION, NATION, PART, SUPPLIER, CUSTOMER, PARTSUPP, ORDERS, LINEITEM
Total: 65 columns with full business context documentation

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
…onships

- Add Databricks SQL driver integration to main.go with the databricks-sql-go dependency
- Create a dedicated Databricks driver in drivers/databricks/ with full schema analysis
- Implement AnalyzeDatabricks() for proper DSN handling without dburl parsing
- Support tables, views, columns, constraints, and complete foreign key relationships
- Use the REFERENTIAL_CONSTRAINTS and KEY_COLUMN_USAGE system tables for accurate FK mapping
- Handle Databricks three-level naming (catalog.schema.table) and query parameters
- Fall back gracefully for information schema features that may not be available
- Follow existing driver patterns and conventions for consistency with other databases
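The "DSN handling without dburl parsing" point boils down to translating a tbls-style URL into the host-first DSN shape that databricks-sql-go accepts. The sketch below is a hypothetical illustration of that translation — the function name `buildInternalDSN`, the `HTTPPath` query parameter, and both DSN layouts are assumptions for illustration, not the PR's exact formats:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildInternalDSN is a hypothetical sketch of the kind of translation the
// driver performs: take a tbls-style URL carrying catalog/schema in the path
// and the warehouse HTTP path as a query parameter (all assumed formats),
// and emit a token-first DSN plus the three-level-naming components.
func buildInternalDSN(tblsDSN string) (dsn, catalog, schema string, err error) {
	u, err := url.Parse(tblsDSN)
	if err != nil {
		return "", "", "", err
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) != 2 {
		return "", "", "", fmt.Errorf("expected /catalog/schema in path, got %q", u.Path)
	}
	catalog, schema = parts[0], parts[1]
	token := ""
	if u.User != nil {
		token, _ = u.User.Password()
	}
	httpPath := u.Query().Get("HTTPPath")
	dsn = fmt.Sprintf("token:%s@%s%s", token, u.Host, httpPath)
	return dsn, catalog, schema, nil
}

func main() {
	dsn, cat, sch, err := buildInternalDSN(
		"databricks://token:dapi123@example.cloud.databricks.com:443/main/tpch?HTTPPath=/sql/1.0/warehouses/abc")
	if err != nil {
		panic(err)
	}
	fmt.Println(dsn)      // token:dapi123@example.cloud.databricks.com:443/sql/1.0/warehouses/abc
	fmt.Println(cat, sch) // main tpch
}
```

Splitting catalog and schema out of the URL path is also what lets the driver qualify every system-table query with Databricks' three-level `catalog.schema.table` naming.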
Resolves issues where table-specific markdown files and the schema.json output were missing foreign key relationships and constraint metadata. The driver now properly populates:

- Column ParentRelations/ChildRelations for foreign key relationships
- The Constraint.Columns field with column names from key_column_usage
- Constraint.ReferencedTable/ReferencedColumns for FK constraints
- Complete constraint definitions with proper SQL formatting

This ensures that:

- Table markdown files display relationships correctly in their relations sections
- Third-party tools using schema.json have complete relationship data
- Foreign key constraints show proper column mappings and references
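The fix hinges on wiring each FK relation into the columns on both sides, which is what the relations sections and schema.json are rendered from. A self-contained sketch of that idea, using minimal stand-ins for the tbls schema types (the real types live in the tbls schema package and carry more fields):

```go
package main

import "fmt"

// Minimal stand-ins for the tbls schema types the commit message refers to;
// they mirror only the fields discussed above.
type Column struct {
	Name            string
	ParentRelations []*Relation
	ChildRelations  []*Relation
}

type Table struct {
	Name    string
	Columns []*Column
}

type Relation struct {
	Table         *Table // child table holding the FK
	Columns       []*Column
	ParentTable   *Table
	ParentColumns []*Column
}

// linkRelation wires one FK relation into the columns on both sides, which
// is what makes relationship data appear in markdown output and schema.json.
// A sketch of the idea, not the PR's exact code.
func linkRelation(r *Relation) {
	for _, c := range r.Columns {
		c.ParentRelations = append(c.ParentRelations, r)
	}
	for _, c := range r.ParentColumns {
		c.ChildRelations = append(c.ChildRelations, r)
	}
}

func main() {
	nationKey := &Column{Name: "n_nationkey"}
	nation := &Table{Name: "nation", Columns: []*Column{nationKey}}
	custNationKey := &Column{Name: "c_nationkey"}
	customer := &Table{Name: "customer", Columns: []*Column{custNationKey}}

	// customer.c_nationkey references nation.n_nationkey
	r := &Relation{
		Table: customer, Columns: []*Column{custNationKey},
		ParentTable: nation, ParentColumns: []*Column{nationKey},
	}
	linkRelation(r)
	fmt.Println(len(custNationKey.ParentRelations), len(nationKey.ChildRelations)) // 1 1
}
```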
…mprehensive tests

Major improvements to the Databricks driver constraint handling:

- Replace N+1 queries with a single SQL aggregation query using COLLECT_LIST()
- Remove the intermediate constraintData struct for cleaner code flow
- Add a parseArrayString() helper to handle the Databricks array string format
- Implement comprehensive unit tests (32 test cases total):
  - TestParseArrayString: 10 test cases covering array parsing edge cases
  - TestBuildConstraintDefinition: 22 test cases covering all constraint types
- Apply standard Go formatting with go fmt

Performance improvements:

- A single query per table instead of 1 + N queries for constraints
- Direct SQL aggregation eliminates application-side grouping logic
- Identical functionality with significantly improved performance
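Aggregating with COLLECT_LIST() means each constraint's columns arrive as a single array value, which the driver then has to turn back into a `[]string` when scanned — hence the parseArrayString() helper. The sketch below assumes a simple bracketed, comma-separated rendering (optionally quoted elements); the PR's actual parsing may handle more formats:

```go
package main

import (
	"fmt"
	"strings"
)

// parseArrayString is a sketch of the helper described above: it turns a
// Databricks array rendered as a string (e.g. "[c_custkey,c_name]", as a
// COLLECT_LIST value might look when scanned into a string) back into a
// []string. The bracketed comma-separated format is an assumption here.
func parseArrayString(s string) []string {
	s = strings.TrimSpace(s)
	s = strings.TrimPrefix(s, "[")
	s = strings.TrimSuffix(s, "]")
	if s == "" {
		return nil
	}
	var out []string
	for _, p := range strings.Split(s, ",") {
		p = strings.TrimSpace(p)
		p = strings.Trim(p, `"`) // some renderings quote each element
		if p != "" {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	fmt.Println(parseArrayString("[c_custkey, c_name]")) // [c_custkey c_name]
	fmt.Println(parseArrayString("[]"))                  // []
}
```

Doing the grouping in SQL this way is what turns the 1 + N query pattern into a single query per table: the database returns one row per constraint, already aggregated, instead of the driver looping over key_column_usage rows.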
k1LoW
left a comment
README.md (Outdated)

Flagged lines:

**Databricks:**
**Personal Access Token (PAT) Authentication:**
To ensure consistency with other data sources, I would appreciate it if you could either discontinue the use of bold formatting or add explanatory comments within the code blocks.
Co-authored-by: Ken’ichiro Oyama <[email protected]>
Thanks! Both formatting changes made 👍
k1LoW
left a comment
Looks GREAT!! Thank you!!
Firstly, thanks for the tool!

This PR adds support for Databricks SQL, based on the other driver implementations. DBx does support primary/foreign keys, although they're not enforced. As they're still extremely useful for documentation, I've pulled them through via the driver.
Some unit testing is included, but as a cloud platform, Databricks doesn't make integration testing hugely easy. I have copied the Snowflake TPC-H SF1 DDL in `./testdata`, converted it to DBx SQL syntax, and used it to carry out manual testing. The output of this testing is stored at `sample/databricks`. Databricks has recently released a free version which can be spun up with just an email (no payment details or cloud infra required), so if the maintainers wanted to, it should be relatively straightforward to set up e2e testing.

Two auth mechanisms are supported, depending on whether a user or a service principal is being used to execute the commands. This means that the DSN I've landed on for tbls use isn't exactly the same as the internal DSN used by the DBx Go library, but this felt like an acceptable tradeoff for a clearer interface.
Finally, while testing locally I wanted to avoid accidentally committing my own DBx creds, so I have added `tbls.yml` to the `.gitignore` file. I don't know if that's desirable or not; feel free to remove it if not.

I think I've got everywhere that needs updating for a new database updated, but just point me to anywhere I've missed!