AI Browser Updates: How Far Have We Come?

Published: December 6, 2025 at 11:29 PM EST
3 min read
Source: Dev.to

📊 Current Status

  • ✅ Received valuable feedback and suggestions
  • ✅ Developers are starting to follow and explore the project
  • ✅ Cross‑platform support (Mac, Windows) running stably

🎉 What We’ve Built Recently

1. History Playback + Continue Conversations

Previous pain point: History was read‑only, so you couldn’t continue a past conversation.

Now:

  • ✅ Click any historical task to replay the full execution (with typewriter effects)
  • ✅ Play/pause/speed control
  • ✅ Continue the conversation from where you left off
  • ✅ Preview attached files directly

Technical Implementation

We built a PlaybackEngine that breaks message streams into atomic fragments (AtomicFragment), the smallest replayable units. This allows precise control over playback progress and speed. Task data is persisted in IndexedDB so history can be viewed offline. When a conversation is resumed, we restore the complete execution context (workflow, steps, attachments, etc.) so it continues seamlessly.
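The fragment-based replay described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the post names PlaybackEngine and AtomicFragment but does not show their fields, so every field and method here (messageId, delayMs, next, seek, rendered) is an assumption, and one character per fragment is just one possible granularity.

```typescript
// Hypothetical shapes: only the names PlaybackEngine and AtomicFragment come from the post.
interface AtomicFragment {
  messageId: string; // which recorded message this fragment belongs to
  text: string;      // one character of typewriter output
  delayMs: number;   // wait before showing this fragment at 1x speed
}

interface RecordedMessage {
  id: string;
  content: string;
}

// Split each recorded message into single-character fragments.
// (split("") works on UTF-16 code units, which is enough for a sketch.)
function toFragments(messages: RecordedMessage[], baseDelayMs: number = 20): AtomicFragment[] {
  const out: AtomicFragment[] = [];
  messages.forEach((m) => {
    m.content.split("").forEach((ch) => {
      out.push({ messageId: m.id, text: ch, delayMs: baseDelayMs });
    });
  });
  return out;
}

class PlaybackEngine {
  private cursor = 0; // index of the next fragment to emit
  private speed = 1;  // playback speed multiplier

  constructor(private readonly fragments: AtomicFragment[]) {}

  setSpeed(multiplier: number): void {
    if (multiplier <= 0) throw new RangeError("speed must be positive");
    this.speed = multiplier;
  }

  // Emit the next fragment plus how long the caller should wait before showing it.
  // Returns null once playback has reached the end.
  next(): { fragment: AtomicFragment; waitMs: number } | null {
    const fragment = this.fragments[this.cursor];
    if (!fragment) return null;
    this.cursor += 1;
    return { fragment, waitMs: fragment.delayMs / this.speed };
  }

  // Jump to an arbitrary fragment index, clamped to the valid range.
  seek(index: number): void {
    this.cursor = Math.min(Math.max(0, index), this.fragments.length);
  }

  // Text visible for each message at the current cursor position.
  rendered(): { [messageId: string]: string } {
    const out: { [messageId: string]: string } = {};
    this.fragments.slice(0, this.cursor).forEach((f) => {
      out[f.messageId] = (out[f.messageId] || "") + f.text;
    });
    return out;
  }

  get progress(): number {
    return this.fragments.length === 0 ? 1 : this.cursor / this.fragments.length;
  }
}
```

In this sketch a UI driver would call next() in a timer loop and wait waitMs between fragments; pausing just means not scheduling the next call, and a progress bar maps directly onto seek().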

2. Human Interaction Capability

Scenario: AI encounters situations requiring human decisions.

Solution:

  • ✅ AI can ask questions during execution
  • ✅ After you respond, AI continues
  • ✅ Useful for login confirmations, option selections, etc.

Example:

Task: Help me collect data from a login‑required website

AI: Login required. Are you logged in?
You: Yes, already logged in
AI: Got it, continuing data collection...

Technical Implementation

Based on the eko framework’s HumanInteraction message type, AI can initiate interaction requests during execution. We established a bidirectional communication channel between the main and renderer processes via Electron IPC. When the AI needs to ask a question, the workflow pauses and waits for the user’s response. Once the answer comes back over IPC, the agent resumes execution. The process includes full state management and error handling.
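The pause-and-resume flow can be illustrated with a small promise-based broker. This is a sketch under assumptions, not the eko or Electron API: HumanInteractionBroker, ask, answer, and cancelAll are hypothetical names. In an Electron app, the sendToRenderer callback would typically wrap webContents.send, and answer would be called from an ipcMain listener, with channel names chosen by the app.

```typescript
// Hypothetical helper: none of these names come from eko or Electron.
interface PendingEntry {
  resolve: (answer: string) => void;
  reject: (err: Error) => void;
}

class HumanInteractionBroker {
  private pending: { [requestId: string]: PendingEntry } = {};
  private nextId = 1;

  // sendToRenderer delivers the question to the UI (for example over IPC).
  constructor(
    private readonly sendToRenderer: (requestId: string, question: string) => void
  ) {}

  // Called by the agent. The returned promise stays pending, which pauses the
  // awaiting workflow, until answer() is called with the same request id.
  ask(question: string): Promise<string> {
    const requestId = String(this.nextId++);
    return new Promise<string>((resolve, reject) => {
      this.pending[requestId] = { resolve, reject };
      this.sendToRenderer(requestId, question);
    });
  }

  // Called when the user's reply arrives. Returns false for unknown or
  // already-answered requests so stale replies are ignored.
  answer(requestId: string, text: string): boolean {
    const entry = this.pending[requestId];
    if (!entry) return false;
    delete this.pending[requestId];
    entry.resolve(text);
    return true;
  }

  // Reject every outstanding question, e.g. when the task is cancelled.
  cancelAll(reason: string): void {
    Object.keys(this.pending).forEach((id) => {
      const entry = this.pending[id];
      delete this.pending[id];
      if (entry) entry.reject(new Error(reason));
    });
  }

  get pendingCount(): number {
    return Object.keys(this.pending).length;
  }
}
```

An agent step would then do something like `const reply = await broker.ask("Login required. Are you logged in?")` and continue once the promise resolves. Keying each question by a request id is one way to keep replies matched to the right question if several tasks run at once.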

3. Voice Input Support

Features:

  • ✅ Voice input for tasks (no typing needed)
  • ✅ Offline speech recognition with Vosk
  • ✅ Auto‑switch recognition models based on language

Technical Implementation

Vosk’s local offline engine is used by default, so no internet connection is required and audio stays on the user’s machine. The appropriate model (Chinese or English) loads automatically based on the selected language. Support for Microsoft Azure and iFlytek cloud services is planned.
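Automatic model selection can be as simple as mapping the active locale to a model directory. The sketch below is illustrative only: the directory paths, the pickModelDir helper, and the choice of English as fallback are assumptions, and loading the model with Vosk itself is left out.

```typescript
type SupportedLanguage = "zh" | "en";

// Hypothetical locations of the bundled Vosk models.
const MODEL_DIRS: Record<SupportedLanguage, string> = {
  zh: "models/vosk-zh",
  en: "models/vosk-en",
};

function isSupported(lang: string): lang is SupportedLanguage {
  return lang === "zh" || lang === "en";
}

// Map a locale such as "zh-CN" or "en_US" to a model directory,
// falling back to a default language when the locale is not covered.
function pickModelDir(locale: string, fallback: SupportedLanguage = "en"): string {
  const primary = locale.toLowerCase().split(/[-_]/)[0] || "";
  return MODEL_DIRS[isSupported(primary) ? primary : fallback];
}
```

For example, pickModelDir("zh-CN") resolves to the Chinese model directory, while an uncovered locale such as "fr-FR" falls back to English.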

4. Multi‑Language Internationalization

Support:

  • ✅ Chinese/English interface switching
  • ✅ Complete translation coverage
  • ✅ Date/time localization

Technical Implementation

Built on i18next + react-i18next. Translation resources are organized by module (main.json, history.json, agent-config.json, etc.) with namespace isolation. The current language is held in Zustand global state, so switching takes effect without a page refresh. Date/time formatting uses date-fns locales. Adding a new language only requires a new set of JSON translation files.
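To show what namespace isolation buys, here is a dependency-free sketch of per-module lookup. It is not i18next itself: the translate helper and the sample keys are hypothetical, and it only mimics the "namespace:key" convention that i18next uses by default. In the real app, each namespace object would come from a file such as main.json or history.json.

```typescript
type NamespaceTable = { [key: string]: string };
type Resources = { [lang: string]: { [namespace: string]: NamespaceTable } };

// Sample data; the keys and strings are made up for illustration.
const resources: Resources = {
  en: { main: { title: "AI Browser" }, history: { title: "History" } },
  zh: { main: { title: "AI 浏览器" }, history: { title: "历史记录" } },
};

function lookup(lang: string, ns: string, key: string): string | undefined {
  const bundle = resources[lang];
  if (!bundle) return undefined;
  const table = bundle[ns];
  return table ? table[key] : undefined;
}

// Resolve "namespace:key" (or a bare key in the default "main" namespace),
// trying the requested language first, then the fallback language.
// If nothing matches, the input is returned unchanged so gaps are visible.
function translate(lang: string, nsKey: string, fallbackLang: string = "en"): string {
  const sep = nsKey.indexOf(":");
  const ns = sep === -1 ? "main" : nsKey.slice(0, sep);
  const key = sep === -1 ? nsKey : nsKey.slice(sep + 1);
  const hit = lookup(lang, ns, key);
  if (hit !== undefined) return hit;
  const fallbackHit = lookup(fallbackLang, ns, key);
  return fallbackHit !== undefined ? fallbackHit : nsKey;
}
```

Because the same key ("title") can live in both main and history without colliding, each module can own its translation file independently.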

5. Agent Configuration System

Features:

  • ✅ Customize agent prompts
  • ✅ Manage MCP tools (CRUD)
  • ✅ Configure different agent capabilities

Together, these settings let users tailor each agent’s prompt, tools, and capabilities to their own workflows.

6. Toolbox Page

Improvements:

  • ✅ Centralized access to all system features
  • ✅ Clearer navigation
  • ✅ One‑click jump to config, scheduled tasks, history, etc.

🗺️ What’s Next

Phase 1 (Near‑term, 1‑2 weeks)

  • Task Working Directory Isolation – each task gets an independent working directory to avoid file interference.
  • Windows Background Running Optimization – reduce resource usage and improve stability.
  • Generated File Download Support – direct and batch download of AI‑generated files.
  • Playback Speed Control – fast‑forward/slow‑motion for history playback.

Phase 2 (Mid‑term, 2‑4 weeks)

  • Performance Optimization – virtual scrolling for long conversations, memory improvements, faster startup.
  • Multi‑Language Enhancement – auto‑detect system language, dynamic download of offline language packages, configurable online speech recognition (Microsoft, iFlytek).
  • Theme Customization – dark mode, multiple color schemes, user‑defined colors.

Phase 3 (Long‑term, 1‑2 months)

  • Visual Workflow Editor – adjust workflow steps, save/import specific workflows for scheduled tasks.
  • Plugin Marketplace – official MCP tool library (HTTP, stdio, SSE), community plugin sharing, one‑click install/update.
  • More Agent Support – ShellAgent (command execution), EmailAgent (email send/receive), NotionAgent (Notion operations).

🤔 What We Need

  1. ⭐️ Stars – help the project gain visibility, attract contributors, and motivate continued development.
  2. 💬 Feedback and Suggestions – share your use cases, problems, and feature ideas via GitHub Issues or comments.
  3. 🤝 Code Contributions – submit PRs for bug fixes, new features, or documentation improvements.
  • GitHub:
  • Download:
  • Configuration Guide:
  • Issue Tracker: