Microsoft Deletes Blog Telling Users To Train AI on Pirated Harry Potter Books

Published: February 20, 2026 at 04:20 PM EST
Source: Slashdot

Incident Overview

Microsoft pulled a year‑old blog post this week after a Hacker News thread flagged that it had encouraged developers to download all seven Harry Potter books from a Kaggle dataset—incorrectly marked as public domain—and use them to train AI models on the company’s Azure platform.

The blog, written in November 2024 by senior product manager Pooja Kamath, walked users through building Q&A systems and generating fan fiction using the copyrighted texts, and even included a Microsoft‑branded AI image of Harry Potter. The Kaggle dataset’s uploader, data scientist Shubham Maindola, told Ars Technica that the public domain label was “a mistake” and deleted the dataset after the outlet reached out.
