Quickstart Guide

Get your first lineage graph running in 30 seconds with our one-line quickstart, or follow the full walkthrough to connect to your Confluent Cloud environment.

🚀 Fastest Start (Demo Mode)

Zero setup, zero credentials, zero waiting:

curl -fsSL https://raw.githubusercontent.com/takabayashi/lineage-bridge/main/scripts/quickstart.sh | bash

This automatically:

✓ Installs dependencies (uv + LineageBridge)
✓ Launches the UI with a sample lineage graph
✓ Opens your browser to http://localhost:8501

Or use Docker:

docker run -p 8501:8501 ghcr.io/takabayashi/lineage-bridge:latest

Then open http://localhost:8501 and click Load Demo Graph.

No Confluent Account Needed

Demo mode uses a pre-built sample graph. Perfect for:

Evaluating LineageBridge before connecting to Confluent
Learning the UI and graph navigation
Sharing demos without exposing credentials

Ready to connect to your real Confluent environment? Continue below.

📋 Connecting to Confluent Cloud

There are two ways to connect:

Path A: Guided Setup (Recommended)

The welcome dialog appears automatically when you launch LineageBridge without credentials:

graph LR
    A[Run quickstart] --> B[UI opens]
    B --> C[Welcome dialog appears]
    C --> D{Choose option}
    D -->|Save & Connect| E[Enter credentials]
    D -->|Skip for Now| F[Add later via sidebar]
    D -->|Load Demo| G[Explore sample graph]
    E --> H[Auto-saved to .env]
    H --> I[Discover environments]
    I --> J[Extract lineage]

    style A fill:#e1f5ff
    style E fill:#90ee90
    style H fill:#90ee90
    style I fill:#90ee90
    style J fill:#90ee90

Steps:

Run the quickstart (if you haven't already)
Welcome dialog appears
Click "Save & Connect"
Enter your Cloud API key and secret
Credentials are saved automatically to .env
Start extracting!

No .env Editing Required

The welcome dialog writes your credentials to .env for you, so no manual configuration is needed.

Path B: Manual Installation

For development or when you prefer manual control:

# Clone and install
git clone https://github.com/takabayashi/lineage-bridge.git
cd lineage-bridge
make install

# Launch UI
make ui

The welcome dialog will appear and guide you through credential setup. Alternatively, create a .env file in the project root manually:

LINEAGE_BRIDGE_CONFLUENT_CLOUD_API_KEY=your-cloud-api-key
LINEAGE_BRIDGE_CONFLUENT_CLOUD_API_SECRET=your-cloud-api-secret
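
If you create .env by hand, a quick check can catch a missing or empty variable before you launch the UI. The following is a minimal sketch, not part of LineageBridge: the two variable names come from the example above, while `parse_env` and `missing_keys` are hypothetical helpers that only handle plain KEY=VALUE lines.

```python
from pathlib import Path

# Variable names taken from the .env example in this guide.
REQUIRED_KEYS = (
    "LINEAGE_BRIDGE_CONFLUENT_CLOUD_API_KEY",
    "LINEAGE_BRIDGE_CONFLUENT_CLOUD_API_SECRET",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list[str]:
    """Return the required variables that are absent or empty."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumes you run this from the project root
    text = env_path.read_text() if env_path.exists() else ""
    print("Missing:", missing_keys(text) or "none")
```

An empty result means both variables are present; it does not confirm that the key itself is valid.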

Where Do I Get API Keys?

Create them in the Confluent Cloud Console or via CLI:

confluent api-key create --resource cloud

Optional: Add data catalog credentials for Unity Catalog, AWS Glue, or Google Data Lineage integration.

🎯 Your First Extraction

Once credentials are configured (via welcome dialog or .env):

Step 1: Discover Your Environment

The UI automatically discovers all your Confluent Cloud environments and clusters.

```bash
LINEAGE_BRIDGE_GCP_PROJECT_ID=my-gcp-project
LINEAGE_BRIDGE_GCP_LOCATION=us-central1
```

Then authenticate: `gcloud auth application-default login`

## Step 3: Launch the UI

Time to fire up the Streamlit interface:

=== "Make (Easiest)"

    ```bash
    make ui
    ```

=== "uv"

    ```bash
    uv run streamlit run lineage_bridge/ui/app.py
    ```

=== "Direct"

    ```bash
    streamlit run lineage_bridge/ui/app.py
    ```

Your browser should automatically open to [http://localhost:8501](http://localhost:8501).

!!! success "What You'll See"
    On first launch, you'll see the LineageBridge welcome screen showing your connection status. If everything's green, you're ready to extract!

## Step 4: Extract Lineage

Now for the fun part—let's extract your lineage!

### Select Your Environment and Cluster

Look at the sidebar:

1. Find the **Environment** dropdown and select your Confluent Cloud environment
2. Select a **Cluster** from the dropdown that appears
3. All done—you're ready to extract!

### What Gets Extracted?

The sidebar shows all available extractors (all enabled by default):

| Extractor | What It Does |
|-----------|--------------|
| **Kafka Topics & Consumer Groups** | Core topic inventory and consumption patterns |
| **Connect** | Source and sink connectors (where data enters/exits Kafka) |
| **ksqlDB** | Persistent queries and stream transformations |
| **Flink** | Flink SQL jobs and transformations |
| **Schema Registry** | Schema definitions and associations |
| **Stream Catalog** | Tags and business metadata |
| **Tableflow** | Topic-to-table mappings (links to data catalogs) |
| **Metrics** | Throughput and lag metrics (the slowest extractor; safe to disable if speed matters) |

For your first run, keep them all enabled to see the full picture. If extraction feels slow, disable Metrics first.

### Run the Extraction

1. Click the big **Extract Lineage** button at the bottom of the sidebar
2. Watch the progress indicators—each extractor reports its status
3. Wait for it to complete (typically 10-30 seconds for a small cluster)

!!! tip "What's Happening Behind the Scenes"
    LineageBridge is calling Confluent Cloud APIs to discover topics, connectors, schemas, and transformations. Then it stitches everything together into a graph showing how data flows through your system.
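
To make the "stitching" idea concrete, here is a toy sketch of how per-service records could be turned into graph edges. It is an illustration only and not LineageBridge's actual implementation: the `stitch` function and the record fields (`name`, `direction`, `topics`, `inputs`, `outputs`) are all hypothetical.

```python
def stitch(connectors: list[dict], queries: list[dict]) -> list[tuple[str, str]]:
    """Turn per-service records into (source, target) edges."""
    edges = []
    for connector in connectors:
        for topic in connector["topics"]:
            if connector["direction"] == "source":
                edges.append((connector["name"], topic))  # connector writes to topic
            else:
                edges.append((topic, connector["name"]))  # sink reads from topic
    for query in queries:
        # A transformation reads its input topics and writes its output topics.
        edges += [(topic, query["name"]) for topic in query["inputs"]]
        edges += [(query["name"], topic) for topic in query["outputs"]]
    return edges


# Hypothetical records in the assumed shape.
CONNECTORS = [
    {"name": "pg-source", "direction": "source", "topics": ["orders"]},
    {"name": "s3-sink", "direction": "sink", "topics": ["enriched"]},
]
QUERIES = [{"name": "enrich", "inputs": ["orders"], "outputs": ["enriched"]}]

for source, target in stitch(CONNECTORS, QUERIES):
    print(f"{source} -> {target}")
```

The key point is that topics act as the join points: a connector and a query never reference each other directly, but both reference the same topic name.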

### View Your Graph

Once extraction completes, you'll see:

```mermaid
graph LR
    A[Source Connector] -->|produces to| B[orders_topic]
    B -->|consumed by| C[ksqlDB Query]
    C -->|produces to| D[enriched_orders]
    D -->|materialized to| E[UC Table]

    style A fill:#4a90e2
    style B fill:#4a90e2
    style C fill:#4a90e2
    style D fill:#4a90e2
    style E fill:#ff9500
```

An interactive graph visualization in the main panel
Node colors showing different systems (Confluent = blue, Databricks = orange, AWS = yellow)
Edges showing data flow relationships (who produces to whom)

Explore the Graph

Try these interactions:

Drag nodes around to rearrange the layout
Scroll to zoom in and out
Click a node to see full details in the right panel
Shift+drag to select multiple nodes
Search by name in the sidebar search box

Step 5: Inspect Lineage Details

Click on any node to see its full details in the right panel.

Kafka Topic

Click a topic to see:

Overview

Type: KAFKA_TOPIC
Cluster: Which cluster it lives in
Environment: Confluent Cloud environment ID

Attributes

Partition count
Replication factor
Retention policy
Cleanup policy

Lineage

Incoming Edges: What produces to this topic (connectors, Flink jobs, applications)
Outgoing Edges: What consumes from this topic (consumer groups, ksqlDB, connectors)

Deep Link

Click View in Confluent Cloud to open this topic in the Confluent Cloud UI.

Connector

Click a connector to see:

Type: CONNECTOR (source or sink)
Class: The connector plugin (e.g., PostgresSource, S3Sink)
Status: Running, paused, or failed
Configuration: All connector settings
Connected Topics: Which topics it reads from or writes to

Transformation (ksqlDB or Flink)

Click a transformation to see:

SQL Statement: The actual query creating the transformation
Input Topics: Where the data comes from
Output Topics: Where the transformed data goes
Type: Whether it's a ksqlDB persistent query or Flink SQL statement

Export Your Graph

Want to save or share your lineage? Easy:

Click the Export Graph button in the sidebar
The graph saves as lineage_graph.json in your project directory
Share this file with teammates or load it later for offline viewing

The JSON export includes:

All nodes with full metadata
All edges showing relationships
Extraction timestamp
Source environment and cluster info
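
Once you have an export, a few lines of Python can give a quick inventory of it. This is a hedged sketch: the guide does not document the JSON schema, so the top-level `nodes` and `edges` keys and the per-node `type` field are assumptions, and `load_graph` and `summarize` are hypothetical helpers. Check a real lineage_graph.json and adjust the key names if they differ.

```python
import json
from collections import Counter


def load_graph(path: str) -> dict:
    """Read an exported graph file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize(graph: dict) -> dict:
    """Count nodes per type and total edges."""
    nodes = graph.get("nodes", [])  # assumed top-level key
    edges = graph.get("edges", [])  # assumed top-level key
    by_type = Counter(node.get("type", "UNKNOWN") for node in nodes)
    return {"nodes": len(nodes), "edges": len(edges), "by_type": dict(by_type)}


# Hypothetical sample in the assumed shape; a real export may differ.
SAMPLE = {
    "nodes": [
        {"id": "orders_topic", "type": "KAFKA_TOPIC"},
        {"id": "pg-source", "type": "CONNECTOR"},
    ],
    "edges": [{"source": "pg-source", "target": "orders_topic"}],
}

print(summarize(SAMPLE))
```

With a real export, `summarize(load_graph("lineage_graph.json"))` would produce the same kind of summary, provided the assumed keys match.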

Jump to Confluent Cloud

Every node includes a deep link to its resource in Confluent Cloud:

Click any node in the graph
In the detail panel, find the View in Confluent Cloud link
Click it—your browser opens directly to that resource

This is super handy when you spot something interesting in the lineage and want to dig deeper in the Confluent UI.

Next Steps

Congratulations! You've extracted and visualized your first lineage graph. Here's what to explore next:

Learn More About the UI

Streamlit UI Guide - Full UI features and controls
Graph Visualization - Advanced graph interactions
Change Detection - Auto-refresh on changes

Use CLI Tools

Try the command-line extraction:

# Extract and save to JSON
uv run lineage-bridge-extract

# Run the change-detection watcher
uv run lineage-bridge-watch

See the CLI Tools Guide for details.

Integrate with Data Catalogs

Connect your lineage to external catalogs such as Unity Catalog, AWS Glue, or Google Data Lineage.

Automate with Docker

Deploy LineageBridge as a service:

# Run UI in Docker
make docker-ui

# Run extraction in Docker
make docker-extract

# Run watcher in Docker
make docker-watch

See Docker Deployment for details.

Explore the REST API

LineageBridge includes a REST API for programmatic access:

# Start the API server
make api

Then visit http://localhost:8000/docs for interactive API documentation.

See the API Reference for details.

Common Scenarios

Scenario 1: Multi-Cluster Lineage

If you have multiple clusters:

Extract lineage from the first cluster (select in UI)
Change the cluster dropdown to the next cluster
Click Extract Lineage again
The graph now shows combined lineage from both clusters
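
If you prefer to combine two exported graphs outside the UI, the idea can be sketched as a union of nodes and edges. This is an assumption-heavy illustration rather than how LineageBridge merges graphs internally: `merge_graphs` is a hypothetical helper, and the `nodes`, `edges`, `id`, `source`, and `target` field names are assumed, not taken from a documented schema.

```python
def merge_graphs(first: dict, second: dict) -> dict:
    """Union two graphs, keeping one copy of each node id and each edge."""
    nodes = {}
    for graph in (first, second):
        for node in graph.get("nodes", []):
            nodes.setdefault(node["id"], node)  # first occurrence wins

    edges, seen = [], set()
    for graph in (first, second):
        for edge in graph.get("edges", []):
            key = (edge["source"], edge["target"])
            if key not in seen:
                seen.add(key)
                edges.append(edge)

    return {"nodes": list(nodes.values()), "edges": edges}
```

Nodes that appear in both exports (for example, a topic shared through cluster linking) are kept once, so the combined graph does not show duplicates.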

Scenario 2: Focus on a Specific Topic

To explore lineage for a single topic:

Use the Search box in the sidebar
Type the topic name (e.g., orders)
Click the topic in search results
The graph highlights the topic and its neighbors
Click the topic to see full details

Scenario 3: Finding Data Flow Paths

To trace data from source to destination:

Find your source connector in the graph
Follow the edges to see which topic it writes to
From that topic, see what consumes (ksqlDB, Flink, or sink connector)
Continue following edges to trace the full pipeline
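
The same tracing can be done programmatically on an edge list with a breadth-first search. The sketch below is illustrative: `downstream` is a hypothetical helper, the `source` and `target` field names are assumptions about the export format, and the toy edges mirror the example pipeline shown earlier in this guide.

```python
from collections import defaultdict, deque


def downstream(edges: list[dict], start: str) -> list[str]:
    """Return node ids reachable from start, in breadth-first order."""
    adjacency = defaultdict(list)
    for edge in edges:
        adjacency[edge["source"]].append(edge["target"])  # assumed field names

    seen, order, queue = {start}, [], deque([start])
    while queue:
        current = queue.popleft()
        for nxt in adjacency[current]:
            if nxt not in seen:  # guards against cycles in the graph
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Toy edges mirroring the example pipeline in this guide.
EDGES = [
    {"source": "source-connector", "target": "orders_topic"},
    {"source": "orders_topic", "target": "ksqldb-query"},
    {"source": "ksqldb-query", "target": "enriched_orders"},
    {"source": "enriched_orders", "target": "uc-table"},
]

print(downstream(EDGES, "orders_topic"))
# prints ['ksqldb-query', 'enriched_orders', 'uc-table']
```

Reversing each edge before the search gives the upstream view instead, which answers "where does this table's data come from?".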

Troubleshooting

Running into issues? Here are the most common problems and how to fix them.

No Graph Appears

If extraction completes but you see a blank graph:

Check the extraction log in the sidebar—look for red error messages
Verify you selected the correct environment and cluster
Make sure your API key has read permissions (not write-only)
Try extracting with just the Kafka Topics extractor first

Authentication Errors

Getting "401 Unauthorized" errors?

Double-check your API key and secret in .env (no typos!)
Verify the key hasn't expired (check in Confluent Cloud)
Make sure you're using a cloud-level key, not a cluster-scoped key
Test the credentials against the Confluent Cloud API directly to rule out a LineageBridge configuration issue
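
One way to test a key outside LineageBridge is to call the Confluent Cloud API yourself. The sketch below uses only the Python standard library and assumes the environments listing endpoint at `https://api.confluent.cloud/org/v2/environments` with HTTP Basic authentication; `basic_auth_header` and `check_credentials` are hypothetical helper names. Verify the endpoint against the current Confluent Cloud API reference before relying on it.

```python
import base64
import urllib.error
import urllib.request

# Assumed endpoint: lists the environments visible to a Cloud API key.
ENVIRONMENTS_URL = "https://api.confluent.cloud/org/v2/environments"


def basic_auth_header(api_key: str, api_secret: str) -> str:
    """Build an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return f"Basic {token}"


def check_credentials(api_key: str, api_secret: str) -> int:
    """Return the HTTP status code of an authenticated environments request."""
    request = urllib.request.Request(
        ENVIRONMENTS_URL,
        headers={"Authorization": basic_auth_header(api_key, api_secret)},
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 401 means the key or secret was rejected
```

A 200 from `check_credentials(key, secret)` suggests the key works and the problem lies in how LineageBridge reads it; a 401 points to the key or secret itself.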

Empty Graph

Graph loads but has zero nodes?

Verify your cluster actually has topics (check in Confluent Cloud UI)
Try running just the Kafka Topics extractor to isolate the issue
Check the logs for API errors or permissions issues
Make sure you selected the right environment and cluster

Extraction is Slow

Taking forever to extract?

Disable the Metrics extractor—it's by far the slowest
Extract one cluster at a time instead of multiple
For large clusters (100+ topics), extraction can take a few minutes—this is normal
Check your internet connection speed

Common Mistakes

Wrong API Key Scope

The most common mistake: using a cluster-scoped API key instead of a cloud-level key. LineageBridge needs cloud-level access to discover environments and services.

Missing .env File

Make sure your .env file is in the project root directory (same level as pyproject.toml), not in a subdirectory.
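
To confirm where the .env file is expected, you can walk upward from your current directory until you find pyproject.toml. This is a small sketch with hypothetical helper names (`find_project_root`, `env_file_status`); it only checks file locations and does not reflect how LineageBridge itself loads configuration.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Walk upward from start until a directory containing pyproject.toml is found."""
    for directory in (start, *start.parents):
        if (directory / "pyproject.toml").is_file():
            return directory
    return None


def env_file_status(start: Path) -> str:
    """Report whether .env sits next to pyproject.toml."""
    root = find_project_root(start.resolve())
    if root is None:
        return "no pyproject.toml found above this directory"
    env_path = root / ".env"
    return f"found {env_path}" if env_path.is_file() else f"missing {env_path}"


if __name__ == "__main__":
    print(env_file_status(Path.cwd()))
```

If the script reports a missing file while you do have a .env somewhere else, move it next to pyproject.toml.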

Still Stuck?

Check the full Troubleshooting Guide or open an issue on GitHub with:

Your error messages (redact API keys!)
LineageBridge version (python -c "from lineage_bridge import __version__; print(__version__)")
What you were trying to do when it failed

What's Next?

You've completed the quickstart! You now have a working LineageBridge installation and understand the basics of extraction and visualization.

To go deeper:

User Guide - Comprehensive feature documentation
Architecture - How LineageBridge works under the hood
How-To Guides - Recipes for common tasks
Contributing - Help improve LineageBridge

Get Help

If you run into issues:

Check the Troubleshooting Guide
Search GitHub Issues
Open a new issue with your error logs and configuration (redact secrets!)

Happy lineage mapping!