Graph Visualization Guide

The LineageBridge graph visualization is an interactive directed acyclic graph (DAG) powered by vis.js. This guide covers graph interaction, filtering, search, and layout customization.

Overview

The graph displays lineage as a directed graph with:

Nodes — Resources (topics, connectors, queries, tables)
Edges — Relationships (produces, consumes, transforms, materializes)
Colors — System-based (Confluent blue, Databricks orange, AWS orange, External gray)
Layout — Sugiyama-style DAG layout with minimal edge crossings

Access the graph:

Launch the Streamlit UI: lineage-bridge-ui
Connect and extract lineage
The graph appears in the Lineage Graph tab

Graph Interaction

Mouse Controls

Action	Behavior
Drag	Move nodes around the canvas
Mouse wheel	Zoom in/out
Ctrl/Cmd + wheel	Zoom faster
Click node	Select node, open detail panel
Shift + drag	Region select (multi-select)
Double-click node	Center on node
Click canvas	Deselect all nodes

Keyboard Shortcuts

Key	Action
Esc	Clear selection
Arrow keys	Pan the canvas
+/-	Zoom in/out

Touch Controls (Mobile/Tablet)

Gesture	Behavior
Pinch	Zoom in/out
Two-finger drag	Pan the canvas
Tap node	Select node
Double-tap	Center on node

Node Types

Nodes are color-coded by system and node type:

Confluent Nodes

Type	Icon	Color	Description
Kafka Topic	🔷	Blue	Kafka topic
Connector	🔌	Green	Source or sink connector
ksqlDB Query	🔄	Purple	ksqlDB persistent query
Flink Job	⚡	Orange	Flink SQL statement
Tableflow Table	📊	Teal	Tableflow topic → table mapping
Schema	📝	Gray	Avro/Protobuf/JSON schema
Consumer Group	👥	Indigo	Kafka consumer group
External Dataset	🌐	Blue-gray	External system referenced by connector

Databricks Nodes

Type	Icon	Color	Description
UC Table	🏛	Orange	Unity Catalog table

AWS Nodes

Type	Icon	Color	Description
Glue Table	🗄	Orange	AWS Glue table

Node Labels

Node labels show the display name:

Kafka topics → topic name
Connectors → connector name
UC tables → catalog.schema.table
Glue tables → database.table

Edge Types

Edges represent relationships between nodes:

Type	Style	Color	Description
PRODUCES	Solid	Blue	Source produces data to destination
CONSUMES	Solid	Green	Source consumes data from destination
TRANSFORMS	Solid	Purple	Source transforms destination (ksqlDB/Flink)
MATERIALIZES	Solid	Orange	Source materializes to destination (Tableflow)
HAS_SCHEMA	Dashed	Gray	Topic has schema
MEMBER_OF	Dashed	Indigo	Consumer group member of topic

Edge direction: Arrows point from source to destination.

Example flow:

postgres-source (Connector) → orders (Topic) → order-enrichment (Flink) → orders_enriched (Topic)

Filtering

The sidebar provides multiple filtering options to focus on specific parts of the graph.

Environment Filter

Sidebar → Graph → Filters → Environment

Filter nodes by Confluent Cloud environment:

Environment: All

Or select a specific environment:

Environment: Production (env-abc123)

Only nodes from the selected environment are shown.

Cluster Filter

Sidebar → Graph → Filters → Cluster

Filter nodes by Kafka cluster:

Cluster: All

Or select a specific cluster:

Cluster: lkc-xyz789

Only nodes from the selected cluster are shown.

Node Type Filters

Sidebar → Graph → Filters

Toggle visibility by node type:

☑ Kafka Topics
☑ Connectors
☑ ksqlDB Queries
☑ Flink Jobs
☑ Tableflow Tables
☑ UC Tables
☑ Glue Tables
☑ Schemas
☑ External Datasets
☑ Consumer Groups

Uncheck to hide nodes of that type.

Use cases:

Hide schemas — Reduce clutter when you only care about data flow
Show only topics — Focus on Kafka topology
Show only UC tables — See lakehouse lineage

Search by Name

Sidebar → Graph → Filters → Search

Enter a substring to filter nodes by qualified name:

Search by name: orders

The graph shows only nodes whose qualified name contains "orders":

orders
orders_enriched
pending_orders

Search is case-insensitive and matches anywhere in the name.

Hide Disconnected Nodes

Sidebar → Graph → Filters

☑ Hide disconnected nodes

When enabled, hides nodes with no edges (no upstream or downstream connections).

Use case: Large graphs with many isolated nodes (common after filtering).

Focus Mode

Focus mode isolates a single node and its neighborhood.

How to Use

Click a node in the graph
Sidebar → Graph → Filters
Enable Focus on selected node
Adjust Hop radius (1-5 hops)

The graph now shows only:

The selected node
Nodes within N hops (upstream and downstream)

Example

Selected node: orders (Kafka Topic)

Hop radius: 2

Visible nodes:

1 hop upstream: postgres-source (Connector)
1 hop downstream: order-enrichment (Flink Job)
2 hops downstream: orders_enriched (Topic)
2 hops upstream: (none)

Use cases:

Trace upstream lineage for a table ("where does this data come from?")
Trace downstream impact for a topic ("who consumes this?")
Isolate a single pipeline for debugging

Disable Focus

Uncheck Focus on selected node to show the full graph again.

Multi-Select

Select multiple nodes at once:

Hold Shift
Drag a rectangle around nodes
All nodes in the rectangle are selected
Detail panel shows the first selected node

Use cases:

Group related nodes for visual analysis
Identify clusters of related resources

Layout

DAG Layout

The graph uses a Sugiyama-style hierarchical layout:

Nodes are arranged in layers (levels)
Edges point downward (mostly)
Edge crossings are minimized

The layout is computed server-side using networkx:

from networkx import topological_generations

for level, node_group in enumerate(topological_generations(graph)):
    # Arrange nodes in horizontal rows

Position Persistence

Node positions are saved in browser localStorage (keyed by graph version). Positions persist across:

Browser refreshes
UI restarts
Graph reloads (same topology)

When positions reset:

Graph topology changes (new nodes, removed nodes)
You manually reset layout

Manual Layout

Drag nodes to customize the layout. Positions are saved automatically.

Reset Layout

Sidebar → Graph → Filters

[🔄 Reset Layout]

This clears saved positions and re-computes the DAG layout.

Use when:

You've manually rearranged nodes and want to start over
Graph topology changed and positions are misaligned

Node Detail Panel

Click a node to open the detail panel on the right side.

Panel Contents

──────────────────────────────────
📋 orders
──────────────────────────────────

Type:        Kafka Topic
System:      CONFLUENT
Environment: Production (env-abc123)
Cluster:     lkc-xyz789

Attributes:
  partitions: 6
  replication_factor: 3
  retention_ms: 604800000

Upstream (2):
  → postgres-source (Connector)
  → customer-stream (ksqlDB Query)

Downstream (3):
  → orders-sink (Connector)
  → order-enrichment (Flink Job)
  → analytics-cg (Consumer Group)

🔗 View in Confluent Cloud

[✕ Close]

Sections

Section	Description
Type	Node type (Kafka Topic, Connector, UC Table, etc.)
System	System type (CONFLUENT, DATABRICKS, AWS, EXTERNAL)
Environment	Confluent Cloud environment ID and display name
Cluster	Kafka cluster ID
Attributes	Node-specific metadata (partitions, retention, config)
Upstream	Nodes that feed data to this node
Downstream	Nodes that consume data from this node
Deep Link	URL to view the resource in its native console

Deep Links

Deep links open the resource in its native console:

Node Type	Link Destination
Kafka Topic	Confluent Cloud topic page
Connector	Confluent Cloud connector page
ksqlDB Query	Confluent Cloud ksqlDB editor
Flink Job	Confluent Cloud Flink SQL workspace
UC Table	Databricks table explorer
Glue Table	AWS Glue console
Schema	Schema Registry subject page

Close Panel

Click Close or select another node to dismiss the panel.

Graph Stats

The stats bar above the graph shows:

Nodes: 142    Edges: 238    Node Types: 7    Environments: 2    Clusters: 3    Pipelines: 12

Metric	Description
Nodes	Total node count (filtered)
Edges	Total edge count (filtered)
Node Types	Distinct node types in the graph
Environments	Distinct Confluent Cloud environments
Clusters	Distinct Kafka clusters
Pipelines	End-to-end flows (source connector → topic → sink connector)

Pipelines are computed as paths from source connectors to sink connectors via topics.

Export & Import

Export JSON

Header → Export JSON

Downloads the current graph as JSON:

{
  "nodes": [...],
  "edges": [...]
}

Use cases:

Share graphs with teammates
Archive lineage snapshots
Load graphs in automation/CI/CD

Import JSON

Sidebar → Data → Load Data → Import JSON

Upload a previously exported graph:

[Choose File] lineage_graph.json

The graph loads automatically.

Tips & Tricks

Performance with Large Graphs

500+ nodes? Enable Hide Disconnected Nodes
Too many edges? Filter by node type
Slow rendering? Use environment/cluster filters

Identify Bottlenecks

Filter to show only Kafka Topics
Look for topics with many downstream consumers
Click the topic to see consumer list in detail panel

Trace Data Flow

Click a source connector
Enable Focus on selected node with hop radius 3+
See the full downstream pipeline

Find Unused Resources

Enable Hide disconnected nodes
Toggle off all node types except one (e.g., Topics)
Disconnected nodes = topics with no producers/consumers

Compare Environments

Extract lineage from two environments
Export JSON for each
Load in separate browser tabs
Compare topology visually

Troubleshooting

Graph is empty

Check:

Did extraction succeed? (Check sidebar → Extraction panel)
Are all node type filters enabled? (Sidebar → Graph → Filters)
Is the search box empty?
Are environment/cluster filters set to "All"?

Nodes overlap

Solutions:

Drag nodes manually to separate them
Click Reset Layout to re-compute positions
Enable Hide disconnected nodes to reduce clutter

Edges are hard to follow

Solutions:

Use Focus mode to isolate a single node
Filter by node type to reduce edge count
Click nodes to see upstream/downstream in detail panel

Positions reset after reload

Cause: Graph topology changed (new nodes, removed nodes)

Behavior: Positions persist only for identical graphs. If the set of node IDs changes, layout resets.

Can't find a specific node

Solution: Use Search by name (Sidebar → Graph → Filters → Search)

Graph is slow

Solutions:

Reduce node count via filters
Disable physics (already disabled in LineageBridge)
Use a smaller hop radius in Focus mode

Next Steps

Learn UI features: See Streamlit UI Guide
Automate extraction: See CLI Tools Reference
Set up auto-update: See Change Detection Guide