Extract Structured Data from Car Listings Using AI in .NET 10
Source: Dev.to
Why This Matters
Car listings come in all shapes and sizes. Whether you’re building a price‑comparison site, marketplace aggregator, or inventory‑management system, you need to:
- Extract key details (make, model, year, mileage, price)
- Handle different formats (sale, lease, rent)
- Deal with missing information gracefully
- Process data at scale
Manually parsing this is tedious. Let AI do the heavy lifting! 💪
Tools We’ll Use
- GitHub Models – free access to powerful AI models (no credit card needed)
- Microsoft.Extensions.AI – unified AI abstraction for .NET
- .NET 10 – latest and greatest
Setup
Create a new console app
dotnet new console -n TextExtraction
cd TextExtraction
Add required packages
dotnet add package Microsoft.Extensions.AI.OpenAI
dotnet add package Microsoft.Extensions.Configuration.UserSecrets
Store your GitHub token securely
dotnet user-secrets init
dotnet user-secrets set "GitHubModels:Token" "your-github-token"
Define the Extraction Model
Create CarDetails.cs:
using System.Text.Json.Serialization;

[JsonConverter(typeof(JsonStringEnumConverter))]
public enum AvailabilityType
{
    Sale,
    Lease,
    Rent
}

public class CarDetails
{
    public string Make { get; set; } = string.Empty;
    public string Model { get; set; } = string.Empty;
    public int? Year { get; set; }
    public double? Mileage { get; set; }
    public double? Price { get; set; }
    public AvailabilityType? AvailabilityType { get; set; }
    public double? PricePerMonth { get; set; }
    public double? PricePerDay { get; set; }
    public string[]? Features { get; set; }
    public string? Location { get; set; }
    public string ShortSummary { get; set; } = string.Empty;
    public int? OwnerCount { get; set; }
}
Note: The nullable types let us handle missing data elegantly! ✨
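To illustrate why the nullable properties matter, here is a minimal sketch that deserializes a partial JSON payload, of the kind a model might return for a lease listing, and formats it without inventing values. The `ListingSketch` type, the `Availability` enum, the `ListingFormatter.Describe` helper, and the sample JSON are assumptions made for this example; they are a trimmed-down stand-in for `CarDetails`, not part of the original project.

```csharp
using System;
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical payload: a lease listing with no mileage and no sale price.
const string json = """
{ "Make": "Hyundai", "Model": "Creta SX", "Year": 2020,
  "AvailabilityType": "Lease", "PricePerMonth": 22000 }
""";

var car = JsonSerializer.Deserialize<ListingSketch>(json)
          ?? throw new InvalidOperationException("Empty payload");

// Prints: Hyundai Creta SX (2020), mileage not stated, 22000 per month
Console.WriteLine(ListingFormatter.Describe(car));

// Trimmed-down stand-in for CarDetails (illustrative only).
[JsonConverter(typeof(JsonStringEnumConverter))]
public enum Availability { Sale, Lease, Rent }

public class ListingSketch
{
    public string Make { get; set; } = string.Empty;
    public string Model { get; set; } = string.Empty;
    public int? Year { get; set; }
    public double? Mileage { get; set; }
    public double? Price { get; set; }
    public Availability? AvailabilityType { get; set; }
    public double? PricePerMonth { get; set; }
}

public static class ListingFormatter
{
    // Fields absent from the JSON stay null, so "not stated" can be
    // told apart from a real value of 0.
    public static string Describe(ListingSketch car)
    {
        var year = car.Year?.ToString(CultureInfo.InvariantCulture) ?? "year unknown";
        var mileage = car.Mileage is double km
            ? km.ToString("0", CultureInfo.InvariantCulture) + " km"
            : "mileage not stated";
        var price = car.PricePerMonth is double perMonth
            ? perMonth.ToString("0", CultureInfo.InvariantCulture) + " per month"
            : car.Price is double lakhs
                ? lakhs.ToString("0.##", CultureInfo.InvariantCulture) + " lakh"
                : "price not stated";
        return $"{car.Make} {car.Model} ({year}), {mileage}, {price}";
    }
}
```

With non-nullable `double` properties, the missing mileage would have come back as 0, which is indistinguishable from a car that has genuinely never been driven.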
Core Extraction Logic (Program.cs)
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Configuration;
using OpenAI;
using System.ClientModel;
using System.Text.Json;
// -------------------------------------------------
// 1️⃣ Configure the client
// -------------------------------------------------
// AddUserSecrets needs a type argument to locate the project's secrets ID.
var configuration = new ConfigurationBuilder()
    .AddUserSecrets<Program>()
    .Build();
var credential = new ApiKeyCredential(
    configuration["GitHubModels:Token"]
        ?? throw new InvalidOperationException("Token not found")
);
IChatClient chatClient = new OpenAIClient(credential, new OpenAIClientOptions
    {
        Endpoint = new Uri("https://models.inference.ai.azure.com")
    })
    .GetChatClient("gpt-4o-mini")
    .AsIChatClient();
// -------------------------------------------------
// 2️⃣ Prompt (schema)
// -------------------------------------------------
var prompt = @"Extract the following details from the car listing and return **ONLY** a valid JSON object:
{
""Make"": ""string - car manufacturer/brand"",
""Model"": ""string - car model name"",
""Year"": number - manufacturing year,
""Mileage"": number - kilometers driven,
""Price"": number - price in lakhs,
""AvailabilityType"": ""string - one of: Sale, Lease, Rent"",
""Features"": ""array of strings - notable features"",
""ShortSummary"": ""string - brief summary in 10‑15 words"",
""OwnerCount"": number - previous owners (null if not mentioned)
}
Return only the JSON object, no additional text.";
// -------------------------------------------------
// 3️⃣ Sample listings
// -------------------------------------------------
var carListings = new List<string>
{
"Honda City 2018 for sale, only 30,000 km! Single owner, showroom condition. ₹6.5 lakh.",
"Hyundai Creta SX 2020 — premium SUV with sunroof. Monthly lease at ₹22,000.",
"Toyota Innova Crysta 2019 — spacious 7‑seater, 40,000 km, rent at ₹2,500/day."
};
// -------------------------------------------------
// 4️⃣ Process each listing
// -------------------------------------------------
Console.WriteLine("Processing car listings...");
foreach (var listing in carListings)
{
    // The generic overload requests JSON matching CarDetails
    // and returns a typed response.
    var response = await chatClient.GetResponseAsync<CarDetails>(
        $"{prompt}\n\nCar Listing:\n{listing}"
    );
    // TryGetResult deserializes the reply into the CarDetails class
    // defined in CarDetails.cs.
    if (response.TryGetResult(out CarDetails? carDetails) && carDetails != null)
    {
        Console.WriteLine($"✅ Extracted: {carDetails.Make} {carDetails.Model}");
        Console.WriteLine(
            JsonSerializer.Serialize(
                carDetails,
                new JsonSerializerOptions { WriteIndented = true }
            )
        );
    }
}
Run the app
dotnet run
Sample Output
Processing car listings...
✅ Extracted: Honda City
{
  "Make": "Honda",
  "Model": "City",
  "Year": 2018,
  "Mileage": 30000,
  "Price": 6.5,
  "AvailabilityType": "Sale",
  "Features": [
    "Single owner",
    "Showroom condition"
  ],
  "OwnerCount": 1
}
✅ Extracted: Hyundai Creta
{
  "Make": "Hyundai",
  "Model": "Creta SX",
  "Year": 2020,
  "AvailabilityType": "Lease",
  "PricePerMonth": 22000,
  "Features": [
    "Premium SUV",
    "Sunroof"
  ]
}
Extending the Solution
Add More Fields – fuel type, transmission, color
public string? FuelType { get; set; }      // Petrol / Diesel / Electric
public string? Transmission { get; set; }  // Manual / Automatic
public string? Color { get; set; }
Process Real‑Time Data – pull listings from an API or RSS feed
var listings = await FetchListingsFromApi("https://api.carmarket.com/listings");
Validate Data
if (carDetails.Year > DateTime.Now.Year)
{
    Console.WriteLine("⚠️ Invalid year detected");
}
Persist to a Database
await dbContext.CarListings.AddAsync(carDetails);
await dbContext.SaveChangesAsync();
Swap to a More Capable Model
var client = chatService.GetChatClient("gpt-4o"); // Higher accuracy, slightly slower
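The single year check in the validation step can grow into a small rule set. The following is a minimal sketch of that idea, not the article's implementation: the `ListingCheck` record, the `ListingValidator` class, and the 1950 lower bound are all assumptions chosen for illustration, and the record covers only a subset of the extracted fields.

```csharp
using System;
using System.Collections.Generic;

// Usage: a listing with a future year and a negative price.
var problems = ListingValidator.Validate(
    new ListingCheck(Year: DateTime.Now.Year + 1, Mileage: 30_000, Price: -1));

foreach (var problem in problems)
    Console.WriteLine($"⚠️ {problem}");

// Illustrative subset of the extracted fields (not the article's CarDetails).
public record ListingCheck(int? Year, double? Mileage, double? Price);

public static class ListingValidator
{
    // Assumed lower bound for a plausible model year; tune it for your data.
    private const int EarliestYear = 1950;

    // Returns one message per failed rule. Null fields are skipped,
    // because a missing value is not the same as an invalid one.
    public static IReadOnlyList<string> Validate(ListingCheck car)
    {
        var problems = new List<string>();

        if (car.Year is int year && (year < EarliestYear || year > DateTime.Now.Year))
            problems.Add($"Implausible year: {year}");

        // Lifted comparisons on nullable values are false when the value is null.
        if (car.Mileage < 0)
            problems.Add("Mileage cannot be negative");

        if (car.Price <= 0)
            problems.Add("Price must be positive");

        return problems;
    }
}
```

Returning a list of messages rather than throwing lets a batch job log every problem in a listing and decide later whether to discard or repair the record.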
Best Practices
- Keep temperature low – lower values make the output more deterministic, which usually yields more consistent extraction.
- Be explicit in prompts – specify the exact JSON structure you need.
- Use nullable types – not every listing contains every field.
- Batch process – handle many listings efficiently.
- Monitor token usage – track costs via response.Usage.
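As a rough sketch of the "batch process" practice, the helper below runs an extraction function over many listings while capping the number of concurrent requests. It makes no model calls: `BatchExtractor`, `fakeExtract`, and the concurrency limit of 2 are hypothetical names and values for illustration, and in a real project the delegate would wrap the `chatClient` call shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Stub extractor so the sketch runs without a model: it "extracts"
// the first word of the listing after a short simulated delay.
Func<string, Task<string>> fakeExtract = async listing =>
{
    await Task.Delay(10);
    return listing.Split(' ')[0];
};

var results = await BatchExtractor.RunAsync(
    new[] { "Honda City 2018", "Hyundai Creta 2020", "Toyota Innova 2019" },
    fakeExtract,
    maxConcurrency: 2);

// Prints: Honda, Hyundai, Toyota
Console.WriteLine(string.Join(", ", results));

public static class BatchExtractor
{
    // Runs extractAsync for every listing with at most maxConcurrency
    // calls in flight; results come back in input order.
    public static async Task<TResult[]> RunAsync<TResult>(
        IReadOnlyList<string> listings,
        Func<string, Task<TResult>> extractAsync,
        int maxConcurrency)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = listings.Select(async listing =>
        {
            await gate.WaitAsync();
            try
            {
                return await extractAsync(listing);
            }
            finally
            {
                gate.Release();
            }
        });

        return await Task.WhenAll(tasks);
    }
}
```

A cap like this is one way to stay under a provider's rate limits; the right value depends on the limits of the model endpoint you use.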
🎯 Ready to turn chaotic car ads into clean, structured data?
Give it a try, tweak the schema to your needs, and let AI do the heavy lifting! 🚀
Real‑World Applications
🚀 Use‑Cases
- 🏪 Marketplace Aggregation – Consolidate listings from multiple sources
- 💰 Price Intelligence – Track pricing trends across markets
- 📊 Analytics Dashboards – Build insights from unstructured data
- 🤖 Chatbots – Power car‑recommendation bots
- 📱 Mobile Apps – Parse user‑submitted listings
Get the Complete Working Example
Grab it from GitHub:
genai-dotnet-basic_llm_tasks/TextExtraction
The repo includes
- ✅ Full source code with comments
- ✅ 9 example car listings
- ✅ Configuration‑setup guide
- ✅ Detailed README
What You’ll Learn
- Using GitHub Models API in .NET
- Strongly‑typed AI responses with GetResponseAsync<T>
- Schema‑based extraction with AI
- Handling unstructured data gracefully
- Building production‑ready text extraction
Try Extracting
- 📄 Resume data – name, skills, experience
- 🧾 Invoices – vendor, amounts, dates
- 📧 Emails – sender, subject, key points
- 🏠 Real‑estate listings
- 🍕 Restaurant menus – dishes, prices, ingredients
The same pattern works for any text‑extraction task!
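One way to reuse the pattern across document types is a parsing helper that is generic over the target type. The sketch below assumes you have the model's raw reply as text, possibly wrapped in a Markdown code fence, rather than a typed response; `ModelJson.TryParse<T>`, the `Invoice` record, and the simulated reply are hypothetical and not part of the original project.

```csharp
using System;
using System.Text.Json;

// Simulated model reply for an invoice, wrapped in a Markdown code fence.
var fence = new string('`', 3);
var reply = fence + "json\n{ \"vendor\": \"Acme\", \"total\": 1250.5 }\n" + fence;

if (ModelJson.TryParse<Invoice>(reply, out var invoice))
    Console.WriteLine($"{invoice!.Vendor}: {invoice.Total}");

// Hypothetical target type; nullable members cover fields the text omits.
public record Invoice(string Vendor, double? Total, string? DueDate);

public static class ModelJson
{
    private static readonly string Fence = new string('`', 3);

    private static readonly JsonSerializerOptions Options =
        new() { PropertyNameCaseInsensitive = true };

    // Strips an optional Markdown fence, then deserializes into T.
    // Returns false instead of throwing when the reply is not usable JSON.
    public static bool TryParse<T>(string reply, out T? value) where T : class
    {
        value = null;
        var text = reply.Trim();

        if (text.StartsWith(Fence, StringComparison.Ordinal))
        {
            var firstNewline = text.IndexOf('\n');
            var lastFence = text.LastIndexOf(Fence, StringComparison.Ordinal);
            if (firstNewline < 0 || lastFence <= firstNewline)
                return false;
            text = text[(firstNewline + 1)..lastFence];
        }

        try
        {
            value = JsonSerializer.Deserialize<T>(text, Options);
            return value is not null;
        }
        catch (JsonException)
        {
            return false;
        }
    }
}
```

Swapping `Invoice` for a resume, email, or menu type changes only the record definition and the prompt; the parsing code stays the same.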
What Will You Build?
Drop a comment below! 👇
GitHub Repo: https://github.com/your‑org/genai-dotnet-basic_llm_tasks/tree/main/TextExtraction