Contributing to LineageBridge
Thank you for your interest in contributing to LineageBridge! We welcome contributions of all kinds - bug reports, feature requests, documentation improvements, and code contributions.
Quick Start
- Read the Development Setup guide
- Check the Testing guide for running tests
- If adding new extractors, see Adding Extractors
- Submit your PR following the guidelines below
Code of Conduct
We are committed to providing a welcoming and inclusive environment. Please be respectful and constructive in all interactions.
How to Contribute
Reporting Bugs
If you find a bug, please open a GitHub issue with:
- Clear title: Describe the issue concisely
- Expected behavior: What you expected to happen
- Actual behavior: What actually happened
- Reproduction steps: Step-by-step instructions to reproduce the issue
- Environment details: Python version, OS, LineageBridge version
- Logs: Any relevant error messages or stack traces
Example:
Title: FlinkClient fails to parse TUMBLE windowing functions
Expected: Flink SQL with TUMBLE(TABLE ...) should extract table as source
Actual: Parser raises IndexError on TUMBLE clause
Steps: Run extraction on environment env-xyz with Flink statement flink-abc123
Environment: Python 3.11.5, macOS 14.2, LineageBridge 0.4.0
Logs: [paste stack trace]
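To fill in the "Environment" line of a report like the one above, the details can be collected with the standard library. This is a minimal sketch, not a tool shipped with LineageBridge: the function name `environment_report` is hypothetical, and the distribution name `lineage-bridge` is an assumption that may differ from the name the package is actually published under.

```python
import platform
from importlib import metadata


def environment_report(package: str = "lineage-bridge") -> str:
    """Return a one-line summary of the Python, OS, and package versions.

    The default distribution name is an assumption; pass the real one if it differs.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        version = "not installed"
    return (
        f"Environment: Python {platform.python_version()}, "
        f"{platform.system()} {platform.release()}, "
        f"{package} {version}"
    )


if __name__ == "__main__":
    print(environment_report())
```

Note that `platform.system()` reports the kernel name (for example `Darwin` on macOS), so you may want to add the marketing OS version by hand.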
Requesting Features
Feature requests are welcome! Please open an issue describing:
- Use case: What problem does this solve?
- Proposed solution: How should it work?
- Alternatives considered: Other approaches you've thought about
- Impact: Who would benefit from this feature?
Contributing Code
We follow a standard fork-and-pull-request workflow:
- Fork the repository on GitHub
- Clone your fork locally:
- Create a feature branch from main:
- Make your changes and commit with clear messages:
- Run tests and linting before pushing:
- Push to your fork:
- Open a Pull Request on GitHub
Pull Request Guidelines
Before Submitting
- Run tests: make test or uv run pytest tests/ -v
- Check linting: make lint or uv run ruff check .
- Format code: make format or uv run ruff format .
- Update documentation if adding features or changing behavior
- Add tests for new functionality
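The three commands in the checklist above can be chained so that the first failure stops the run. The following is a rough sketch of such a helper, assuming it is run from the repository root where the Makefile lives; the names `CHECKS` and `first_failure` are hypothetical and not part of the LineageBridge repository.

```python
from __future__ import annotations

import subprocess
from collections.abc import Callable, Sequence

# Commands taken from the checklist above; run them from the repository root.
CHECKS: list[list[str]] = [
    ["make", "test"],
    ["make", "lint"],
    ["make", "format"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def first_failure(
    checks: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = _run,
) -> Sequence[str] | None:
    """Run each command in order; return the first that exits non-zero, else None."""
    for cmd in checks:
        if run(cmd) != 0:
            return cmd
    return None
```

Calling `first_failure(CHECKS)` returns `None` when every check passes, or the failing command otherwise. The `run` parameter exists only so the helper can be exercised without invoking `make`.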
PR Description
Your PR description should include:
- What changed: Brief summary of the changes
- Why: Rationale and motivation
- How: Implementation approach (for complex changes)
- Testing: How you tested the changes
- Breaking changes: Note any API changes or migrations needed
Example:
## What
Add support for Google BigQuery as a catalog provider
## Why
Users want to materialize Kafka topics to BigQuery via Dataflow
and track lineage in BigQuery's native metadata system.
## How
- Implemented BigQueryProvider following the CatalogProvider protocol
- Added BigQuery client using google-cloud-bigquery SDK
- Set table descriptions and labels via BigQuery API
## Testing
- Unit tests with mocked BigQuery client
- Integration test against real BigQuery project (manual)
- Added fixture data in tests/fixtures/bigquery_tables.json
## Breaking Changes
None - purely additive.
Review Process
- All PRs require review before merging
- Reviewers may request changes - please respond constructively
- Once approved, a maintainer will merge your PR
- Keep PRs focused - one feature or fix per PR
Areas Needing Contribution
Here are some areas where we'd especially welcome contributions:
New Catalog Providers
- Google BigQuery: Materialize to BigQuery via Dataflow
- Snowflake: Support Snowpipe streaming
- Azure Data Explorer: Kafka Connect integration
- Iceberg REST Catalog: Direct lineage to Iceberg tables
See Adding Extractors for guidance.
Enhanced Extractors
- Schema Registry: Subject references, schema evolution tracking
- ksqlDB: Stream-table joins, windowing functions
- Flink: Complex window patterns, catalog integrations
- Tableflow: Multi-topic joins, transformation tracking
UI Improvements
- Graph layout algorithms (force-directed, hierarchical)
- Advanced filtering (by tags, metadata, date ranges)
- Export formats (SVG, PNG, Mermaid diagram)
- Diff view for lineage changes over time
Testing & Infrastructure
- Integration test coverage for all extractors
- Performance benchmarks for large graphs
- Docker Compose local development setup
- CI/CD pipeline improvements
Development Workflow
Setting Up Your Environment
See Development Setup for detailed instructions.
Quick version:
# Install dependencies
uv pip install -e ".[dev]"
# Run tests
make test
# Run UI locally
make ui
# Run extraction
make extract
Coding Standards
- Python 3.11+ required
- Type hints on all public functions
- Docstrings for public modules, classes, functions
- Ruff for linting and formatting (line length: 100 chars)
- pytest for tests (asyncio_mode = "auto")
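As an illustration of these standards, here is a small sketch showing type hints, docstrings, and an async test. The function `count_matching` is hypothetical and exists only for this example; it is not LineageBridge code. The test relies on pytest-asyncio being installed: with asyncio_mode = "auto", coroutine tests are collected without an explicit marker.

```python
"""Example module illustrating the coding standards (hypothetical, not LineageBridge code)."""

import asyncio


async def count_matching(statements: list[str], keyword: str = "FROM") -> int:
    """Count the statements that contain ``keyword``, ignoring case.

    Args:
        statements: SQL statements to inspect.
        keyword: Substring to look for in each statement.

    Returns:
        The number of statements containing the keyword.
    """
    await asyncio.sleep(0)  # Stand-in for real asynchronous I/O.
    needle = keyword.upper()
    return sum(1 for statement in statements if needle in statement.upper())


async def test_count_matching() -> None:
    """Async test; under asyncio_mode = "auto" no pytest.mark.asyncio marker is needed."""
    assert await count_matching(["SELECT * FROM orders", "SHOW TABLES"]) == 1
```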
Commit Messages
Write clear, descriptive commit messages:
- Use present tense: "Add feature" not "Added feature"
- Keep first line under 72 characters
- Add details in the body if needed
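The first two rules above lend themselves to a quick automated check. The sketch below is one possible approach, not a hook provided by the project: `check_commit_message` is a hypothetical name, and the tense check is a deliberately rough heuristic that only looks at whether the first word ends in "ed".

```python
def check_commit_message(message: str, max_subject_len: int = 72) -> list[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    problems: list[str] = []
    lines = message.splitlines()
    subject = lines[0].strip() if lines else ""
    if not subject:
        return ["subject line is empty"]

    # Rule: keep the first line under 72 characters.
    if len(subject) >= max_subject_len:
        problems.append(
            f"subject line has {len(subject)} characters; keep it under {max_subject_len}"
        )

    # Rule: use present tense. Rough heuristic: flags "Added" or "Fixed",
    # but also misfires on words such as "Embed" or "Seed".
    first_word = subject.split()[0]
    if first_word.lower().endswith("ed"):
        problems.append(f"'{first_word}' looks like past tense; use present tense")

    return problems
```

For example, `check_commit_message("Added feature")` reports one problem, while `check_commit_message("Add feature")` returns an empty list.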
Good:
Add HOP windowing support to FlinkClient
Extends the Flink SQL parser to recognize HOP window functions
and extract the source table from HOP(TABLE <name>, ...).
Avoid: past-tense or vague first lines such as "Added feature", and first lines of 72 characters or more.
Community & Support
- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: Questions and general discussion
- Pull Requests: Code contributions
License
By contributing to LineageBridge, you agree that your contributions will be licensed under the Apache License 2.0.
Ready to contribute? Start with the Development Setup guide, or browse open issues tagged with good-first-issue on GitHub.