large language models — Page 4

2 weeks ago · ai · - · -

[Paper] AREG: Adversarial Resource Extraction Game for Evaluating Persuasion and Resistance in Large Language Models

Evaluating the social intelligence of Large Language Models (LLMs) increasingly requires moving beyond static text generation toward dynamic, adversarial intera...

#large-language-models #adversarial-benchmark #persuasion-resistance #LLM-evaluation #nlp
2 weeks ago · ai · - · -

Indian AI lab Sarvam’s new models are a major bet on the viability of open-source AI

Indian AI lab Sarvam (https://www.sarvam.ai/) unveiled a new generation of large language models, betting that smaller, efficient open‑source AI models can capture...

#open-source AI #large language models #India AI #multilingual models #AI startups
2 weeks ago · ai · - · -

Personalization features can make LLMs more agreeable

Overview: Many of the latest large language models (LLMs) are designed to remember details from past conversations or store user profiles, enabling these models t...

#large language models #personalization #sycophancy #model alignment #user profiling #MIT research
2 weeks ago · software · - · -

[Paper] Algorithm-Based Pipeline for Reliable and Intent-Preserving Code Translation with LLMs

Code translation, the automatic conversion of programs between languages, is a growing use case for Large Language Models (LLMs). However, direct one-shot trans...

#code translation #large language models #algorithmic specification #software engineering #LLM evaluation
2 weeks ago · ai · - · -

I Tried to Trick 7 AI Models with Fake Facts. They Didn't Fall for It. (That's a Problem.)

Overview: I spent a weekend testing whether large language models would confidently repeat misinformation back to me. I fed them 20 fake historical facts alongs...

#large language models #hallucination #fact-checking #benchmark #model evaluation
2 weeks ago · ai · - · -

[Paper] Decision Quality Evaluation Framework at Pinterest

Online platforms require robust systems to enforce content safety policies at scale. A critical component of these systems is the ability to evaluate the qualit...

#content moderation #large language models #evaluation framework #golden dataset
2 weeks ago · ai · - · -

How My Team Aligns on Prompting for Production

My team at Google is automating sample code generation and maintenance. Part of that is using Generative AI to produce and assess instructional code. This intro...

#prompt engineering #large language models #generative AI #software development #team collaboration
2 weeks ago · ai · - · -

Choosing the Right Model: GPT vs Claude vs Local (A Practical Decision Tree)

Introduction: Choosing a model is primarily an economics and risk decision. Defaulting to the “best” model for every task quickly burns money. Below is a practica...

#large-language-models #model-selection #cost-optimization #LLM-decision-tree
2 weeks ago · ai · - · -

As AI jitters rattle IT stocks, Infosys partners with Anthropic to build ‘enterprise-grade’ AI agents

Indian IT giant Infosys announced a partnership with Anthropic to develop enterprise‑grade AI agents. The collaboration will integrate Anthropic’s Claude models...

#Infosys #Anthropic #enterprise AI agents #large language models #AI services
2 weeks ago · ai · - · -

Why Most AI Agents Are Still Glorified Chatbots (And What Actually Works)

The AI agent hype is real. Everyone's building them, everyone's talking about them, and most of them are trash. I've been watching this space closely, and here'...

#AI agents #chatbots #large language models #function calling #agent architecture #AI hype
2 weeks ago · ai · - · -

You Are a (Mostly) Helpful Assistant

When helpfulness becomes a problem: Imagine having your prime directive, your entire purpose of being, your mission and lifelong goal to be as helpful as possib...

#large-language-models #LLM #helpfulness #model-confidence #AI-safety #prompt-engineering
2 weeks ago · ai · - · -

A Guide to Fine-Tuning FunctionGemma

FunctionGemma: Fine‑Tuning for Tool Selection Ambiguity. Date: January 16, 2026. In the world of Agentic AI, the ability to call tools is what translates...

#FunctionGemma #fine‑tuning #tool‑calling #large language models #Gemma 3 #AI agents #Hugging Face #Google AI #function calling models
