Dev List Digest for Apache Iceberg, Parquet, Polaris and Arrow: January 6–14, 2026
Source: Dev.to
📅 Weekly Community Update (Second week of January 2026)
The second week of January brought continued momentum across Apache Iceberg, Polaris, Arrow, and Parquet as the community transitioned from holiday mode into active development. Highlights include governance discussions, community organizing, and technical proposals that will shape the lakehouse ecosystem throughout 2026.
Apache Iceberg
- Iceberg‑Spark Community Sync Established
  Anurag Mantripragada proposed a dedicated monthly sync for Spark‑Iceberg integration, separate from the main community sync. The idea received immediate support from Anton Okolnychyi and Kevin Liu, and the first Iceberg‑Spark Community Sync was scheduled for January 20 (10‑11 am PT).
  - Agenda: sort‑order reporting, Spark 4.1 support, and the future of DataFusion‑Comet integration.
  - Details: mail archive link
- Project Blog Launch Vote Passes
  Kevin Liu called a formal vote to establish an official Apache Iceberg blog at iceberg.apache.org/blogs/. The vote passed with multiple binding and non‑binding +1s (e.g., Russell Spitzer, Steven Wu).
  - The first post will promote the Iceberg Summit 2026.
  - Details: mail archive link
- OAuth2 Manager v2 Proposal Discussion
  The OAuth2 Manager v2 design document is being refined. Christian Thiel questioned whether legacy token‑exchange behavior, deprecated for 1.5 years, needs migration.
  - Decision meeting: January 14 catalog sync.
  - Details: mail archive link
- Summit CFP Reminder
  The January 18 Call for Papers deadline is approaching.
  - Robin Moffatt asked about the selection‑committee composition.
  - Jean‑Baptiste Onofré confirmed Russell Spitzer as the main PMC contact and noted that committee affiliations will be listed in the final proposal.
Apache Polaris
- Graduation Momentum
  Regular community syncs and development sprints continued in early January. The expanding PPMC reflects healthy governance maturation.
  - The Generic Table capability (cataloging external formats like Apache Hudi and Delta Lake) is slated to graduate from beta in the upcoming release.
- Integration‑Testing Expansion
  With AWS credits now available, contributors discussed expanding integration testing against real cloud infrastructure, especially IAM AssumeRole flows and credential‑vending scenarios that are hard to simulate locally. This investment will improve production‑readiness validation.
Apache Arrow
- Leadership Continuity Confirmed
  Antoine Pitrou, Arrow’s co‑creator, was formally appointed PMC Chair, reinforcing governance stability and providing continued technical vision from the project’s founding leadership.
- Format Enhancements Continue
  Work progressed on:
  - Time‑zone support in temporal types.
  - Enhanced compute functions.
  These incremental updates maintain Arrow’s position as the universal columnar interchange layer for analytics workloads across engines and languages.
Apache Parquet
- Board Report Draft Circulated
  Julien Le Dem shared the draft January board report for community review ahead of the January 14 submission deadline and January 21 board meeting.
  - Fokko Driesprong reviewed and approved the report, which will cover recent release activity and community‑health metrics.
- 1.17.0 Release Finalized
  Following the January 2 vote passage, contributors verified signatures and performed final release validation.
  - The release drops Java 8 support in favor of Java 11 as the minimum runtime, a significant modernization milestone.
- FSST Encoding Progress
  Design discussions around FSST (Fast Static Symbol Table) compression for string and byte‑array encoding advanced. Contributors are exploring efficient sharing of compressed dictionaries across multiple column pages to reduce file size for string‑heavy workloads.
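To give a feel for the idea behind FSST, here is a minimal Python sketch of symbol-table string compression with one table shared across several pages. It is an illustration only, not the Parquet proposal: the symbol list is hand-picked (real FSST learns its table from a sample of the data), and the function names `build_table`, `encode`, and `decode` are hypothetical. The only FSST properties assumed are that symbols are 1 to 8 bytes long, each maps to a one-byte code, and code 255 escapes a literal byte.

```python
ESCAPE = 255  # reserved code: the next byte is a literal, not a symbol code


def build_table(symbols):
    """Map each symbol (1-8 bytes) to a one-byte code (its list index)."""
    assert len(symbols) <= 255 and all(1 <= len(s) <= 8 for s in symbols)
    return {s: code for code, s in enumerate(symbols)}


def encode(data: bytes, table) -> bytes:
    """Greedy longest-match encoding against the symbol table."""
    out = bytearray()
    max_len = max((len(s) for s in table), default=0)
    i = 0
    while i < len(data):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(data) - i), 0, -1):
            code = table.get(data[i:i + n])
            if code is not None:
                out.append(code)
                i += n
                break
        else:
            # No symbol matches here: emit an escaped literal byte.
            out += bytes([ESCAPE, data[i]])
            i += 1
    return bytes(out)


def decode(blob: bytes, symbols) -> bytes:
    """Expand codes back into symbols; ESCAPE copies one literal byte."""
    out = bytearray()
    i = 0
    while i < len(blob):
        if blob[i] == ESCAPE:
            out.append(blob[i + 1])
            i += 2
        else:
            out += symbols[blob[i]]
            i += 1
    return bytes(out)


# One hand-picked table reused for every "page" (an assumption for illustration).
symbols = [b"http://", b"www.", b".com", b"iceberg"]
table = build_table(symbols)
pages = [b"http://www.iceberg.com", b"http://iceberg.apache.org"]
encoded = [encode(p, table) for p in pages]
assert [decode(e, symbols) for e in encoded] == pages
```

Because the table is stored once and reused by every page, its cost is amortized across pages, which is the kind of saving the discussion is after. How that sharing would actually be laid out in a Parquet file is still open on the list.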
🔄 Cross‑Project Themes
Java Modernization Wave
Both Iceberg and Parquet are elevating their Java requirements (Parquet to Java 11; Iceberg is considering a similar move). This trend reflects a broader push toward modern runtimes, improved performance, and better alignment with the evolving Java ecosystem.
Prepared by the Data Lakehouse Community.
Modernization and Ecosystem Maturity
- Language & Build Updates – Projects are raising their minimum Java versions (Parquet 1.17.0 now requires Java 11), enabling modern language features and cleaner dependency management. This coordinated modernization reflects ecosystem maturity and a willingness to drop legacy runtime support.
- Community Infrastructure Investment – From Iceberg’s specialized Spark sync and project blog to Polaris’s expanded testing infrastructure, all projects are investing in community mechanisms that translate technical discussions into practical implementation guidance and improved engagement.
- Format Evolution Balancing Act – While Iceberg explores V4 features and Parquet scopes V3 possibilities, both projects demonstrate a careful balance between innovation and stability, ensuring production users have fully‑featured, stable platforms before introducing breaking changes.
Looking Ahead
- Iceberg Summit CFP closes January 18.
- Parquet Board Report submission due January 14.
- First Iceberg‑Spark community sync on January 20.
- Atlanta Iceberg meetup on January 21, continuing grassroots community‑building efforts that have grown throughout 2025.
As the lakehouse ecosystem matures, these governance, community, and technical foundations position Apache Iceberg, Polaris, Arrow, and Parquet for another year of production‑grade innovation and ecosystem growth.