Super Simple Web Scraping in Java (Jsoup)
Source: Dev.to
Add Jsoup
<dependency>
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.17.2</version>
</dependency>
Create a minimal scraper
In this example we will print all links (text and URL) from a page:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SimpleScraper {
    public static void main(String[] args) throws Exception {
        String url = "https://example.com"; // change this

        // Fetch the page over HTTP and parse it into a DOM
        Document doc = Jsoup.connect(url).get();

        // Select every <a> element that has an href attribute
        for (Element link : doc.select("a[href]")) {
            // absUrl resolves relative hrefs against the page URL
            System.out.println(link.text() + " -> " + link.absUrl("href"));
        }
    }
}
That’s it! No models, no JSON, and no extra libraries.
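To see why the scraper calls absUrl("href") instead of attr("href"), here is a small offline sketch. It parses an inline HTML string rather than fetching a page, so the markup, the base URL, and the names AbsUrlDemo and resolveLinks are made up for illustration and are not part of the original scraper.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class AbsUrlDemo {

    // Hypothetical helper: returns "text -> absolute URL" for every link in the markup.
    static List<String> resolveLinks(String html, String baseUri) {
        // The second argument is the base URI used to resolve relative links.
        Document doc = Jsoup.parse(html, baseUri);
        List<String> out = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            out.add(link.text() + " -> " + link.absUrl("href"));
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample markup: one relative link, one absolute link.
        String html = "<a href=\"/about\">About</a>"
                    + "<a href=\"https://example.org/docs\">Docs</a>";

        for (String line : resolveLinks(html, "https://example.com/")) {
            System.out.println(line);
        }
        // attr("href") on the first link would return just "/about".
    }
}
```

When you fetch a page with Jsoup.connect(url).get(), the base URI is set from the page's URL, which is why the scraper above can print full URLs without any extra work.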
Extra credit
If you want something specific, change the selector. Examples:
- Article titles: h1,h2,h3
- Product cards: .product
- Price: .price
- Any element by id: #price
Example: print all h2 titles:
// doc is the Document fetched in SimpleScraper above
for (Element h : doc.select("h2")) {
    System.out.println(h.text());
}
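If you want to try the selectors listed above without hitting a live site, a rough sketch like the following may help. The HTML is an invented product page, and SelectorDemo and texts are hypothetical names used only for this example; real pages will need their own selectors.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class SelectorDemo {

    // Invented sample markup, standing in for a real product page.
    static final String HTML =
          "<h1>Shop</h1>"
        + "<div class=\"product\"><h2>Mug</h2><span class=\"price\">$8</span></div>"
        + "<div class=\"product\"><h2>Cap</h2><span class=\"price\">$12</span></div>"
        + "<p id=\"price\">Prices include tax</p>";

    // Hypothetical helper: collects the text of every element matching a CSS query.
    static List<String> texts(Document doc, String cssQuery) {
        List<String> out = new ArrayList<>();
        for (Element e : doc.select(cssQuery)) {
            out.add(e.text());
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);

        System.out.println("Titles: " + texts(doc, "h1,h2,h3"));             // any of the three tags
        System.out.println("Product cards: " + doc.select(".product").size()); // by class
        System.out.println("Prices: " + texts(doc, ".price"));               // by class
        System.out.println("By id: " + texts(doc, "#price"));                // by id
    }
}
```

Note that .price and #price are different selectors: the first matches every element with class "price", the second matches the element whose id is "price".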
Happy coding!