Continuous Journey through Dagster - bugs and testing
Source: Dev.to
My Recent Contributions
Fixing ECS Pipes Client Execution
- Issue: Users encountered an IndexError when launching tasks using the PipesECSClient, causing pipelines to crash in ECS environments.
- Fix: Added proper exception handling and bounds checking to ensure the client launches tasks smoothly without crashing on index errors.
- [Issue #32936]
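To illustrate the kind of bounds check described above, here is a minimal sketch. It is not the actual Dagster patch. It assumes the response has the shape returned by the ECS run_task call (a "tasks" list and a "failures" list), and the names `TaskLaunchError` and `first_launched_task` are hypothetical.

```python
from typing import Any


class TaskLaunchError(RuntimeError):
    """Hypothetical error raised when a run_task-style response has no task."""


def first_launched_task(response: dict[str, Any]) -> dict[str, Any]:
    """Return the first launched task, or raise a descriptive error.

    Indexing response["tasks"][0] directly raises IndexError when ECS
    launches nothing; checking the list first lets us surface the reasons
    reported in the "failures" list instead.
    """
    tasks = response.get("tasks") or []
    if not tasks:
        failures = response.get("failures") or []
        reasons = ", ".join(
            f"{f.get('arn', 'unknown')}: {f.get('reason', 'no reason given')}"
            for f in failures
        ) or "no failure details returned"
        raise TaskLaunchError(f"ECS did not launch any task ({reasons})")
    return tasks[0]
```

With this guard, an empty "tasks" list produces an error message that names the failure reason rather than a bare IndexError.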
Resolving Asset Specs Mapping Dependencies
- Issue: Logic error in AssetsDefinition.map_asset_specs caused failures when adding dependencies while input definitions were already set.
- Fix: Adjusted core logic to correctly handle mapping of asset specs even when inputs are pre‑configured.
- [Issue #32913]
[WIP] Correcting Asset Sensor Event Processing
- Issue: Because of a race condition, the asset_sensor processed only the last materialization event when multiple partitions materialized simultaneously.
- Current Work: Modifying the sensor logic to capture and process every materialization event regardless of concurrency. The change needs a precise approach and careful testing.
- [Issue #32853]
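The idea of processing every event rather than only the latest one can be sketched in plain Python. This is a simplified model, not Dagster's sensor code: `MaterializationEvent`, `events_since`, and the integer cursor are hypothetical stand-ins for the event log records and the sensor cursor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaterializationEvent:
    storage_id: int  # assumed to increase monotonically in the event log
    partition: str


def events_since(events, cursor):
    """Return every event newer than the cursor (oldest first) and the new cursor.

    Looking only at the latest event would drop the other partitions that
    materialized in the same window; filtering on the cursor keeps them all.
    """
    new = sorted(
        (e for e in events if e.storage_id > cursor),
        key=lambda e: e.storage_id,
    )
    new_cursor = new[-1].storage_id if new else cursor
    return new, new_cursor


log = [
    MaterializationEvent(1, "2024-01-01"),
    MaterializationEvent(2, "2024-01-02"),
    MaterializationEvent(3, "2024-01-03"),
]
pending, cursor = events_since(log, cursor=1)
```

Here `pending` holds the events for both partitions that arrived after the cursor, and the cursor advances only to the last event actually returned.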
[WIP] Implementing Merge Support for Polars & Delta Lake
- Use Case: dagster-deltalake I/O manager supports writing data but lacks merge operation support when using Polars.
- Implementation: Updating dagster_deltalake/handler.py to support merge mode. When write mode is set to merge, a DeltaTable object is created and the merge operation is executed instead of the standard write_deltalake() call.
- [Issue #32644]
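A rough sketch of the branching described above, using the deltalake Python package's DeltaTable.merge and write_deltalake calls. This is not the code in dagster_deltalake/handler.py. The function names, the `merge_keys` parameter, and the "s"/"t" aliases are assumptions for illustration, and the merge branch assumes the target table already exists.

```python
def build_merge_predicate(keys, source_alias="s", target_alias="t"):
    """Build a join predicate such as 't.id = s.id AND t.day = s.day'."""
    if not keys:
        raise ValueError("merge requires at least one key column")
    return " AND ".join(f"{target_alias}.{k} = {source_alias}.{k}" for k in keys)


def write_or_merge(table_uri, df, mode, merge_keys=()):
    """Write a Polars DataFrame to a Delta table, merging when mode == 'merge'."""
    # Imported lazily so build_merge_predicate works without deltalake installed.
    from deltalake import DeltaTable, write_deltalake

    data = df.to_arrow()  # polars.DataFrame -> pyarrow.Table
    if mode == "merge":
        (
            DeltaTable(table_uri)
            .merge(
                source=data,
                predicate=build_merge_predicate(merge_keys),
                source_alias="s",
                target_alias="t",
            )
            .when_matched_update_all()      # update rows whose keys match
            .when_not_matched_insert_all()  # insert rows with new keys
            .execute()
        )
    else:
        # Standard path: append, overwrite, etc.
        write_deltalake(table_uri, data, mode=mode)
```

An upsert is only one possible merge behavior. A real handler would likely expose the matched and not-matched actions as configuration instead of hard-coding them.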
CI Issues
Situation:
- Unit tests pass locally.
- After pushing to GitHub, the CI pipeline fails, preventing proper code review.
Possible causes include environment configuration mismatches, stricter linting rules in CI, or hidden dependency issues.
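One way to start narrowing down these causes is to record the same environment snapshot locally and in CI, then diff the two outputs. The following is a minimal sketch using only the standard library. The function name and the package list are assumptions; adjust the list to whatever the failing tests depend on.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot(packages):
    """Collect interpreter, platform, and package versions for comparison."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; run this both locally and in a CI step.
    snapshot = environment_snapshot(["dagster", "polars", "deltalake", "pytest", "ruff"])
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Sorting the keys keeps the JSON stable, so a plain text diff of the two outputs shows version mismatches directly.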
Next Steps
- Seek guidance from the Dagster team and community.
- Understand differences between the CI environment and a standard local setup to replicate and fix the failures.
- Continue reading code and fixing errors, as sometimes that is more efficient than exhaustive testing.