[Paper] AirSim360: A Panoramic Simulation Platform within Drone View
The field of 360-degree omnidirectional understanding has been receiving increasing attention for advancing spatial intelligence. However, the lack of large-sca...
3997 posts from this source
The field of 360-degree omnidirectional understanding has been receiving increasing attention for advancing spatial intelligence. However, the lack of large-sca...
Test-time scaling (TTS) -- the dynamic allocation of compute during inference -- is a promising direction for improving reasoning in large language models (LLMs...
Multi-view camera systems enable rich observations of complex real-world scenes, and understanding dynamic objects in multi-view settings has become central to ...
We introduce Audio-Visual Affordance Grounding (AV-AG), a new task that segments object interaction regions from action sounds. Unlike existing approaches that ...
Large Language Models (LLMs) encode factual knowledge within hidden parametric spaces that are difficult to inspect or control. While Sparse Autoencoders (SAEs)...
Massively parallel simulation has reduced reinforcement learning (RL) training time for robots from days to minutes. However, achieving fast and reliable sim-to...
Autonomous driving policies are typically trained via open-loop behavior cloning of human demonstrations. However, such policies suffer from covariate shift whe...
We introduce LLM CHESS, an evaluation framework designed to probe the generalization of reasoning and instruction-following abilities in large language models (...
Offline Reinforcement Learning (RL) provides a promising avenue for training policies from pre-collected datasets when gathering additional interaction data is ...
Study Objectives: Wrist accelerometry is widely used for inferring sleep-wake state. Previous works demonstrated poor wake detection, without cross-device gener...
Federated Learning (FL) on resource-constrained edge devices faces a critical challenge: The computational energy required for training Deep Neural Networks (DN...
GUI grounding aims to align natural language instructions with precise regions in complex user interfaces. Advanced multimodal large language models show strong...
The global capacity for mineral processing must expand rapidly to meet the demand for critical minerals, which are essential for building the clean energy techn...
The mechanism by which RL contributes to reasoning capabilities-whether it incentivizes the synthesis of new skills or merely amplifies existing behaviors-remai...
Deep Research Agents (DRAs) aim to automatically produce analyst-level reports through iterative information retrieval and synthesis. However, most existing DRA...
Reinforcement Learning with Verifiable Rewards (RLVR) has advanced the reasoning capability of large language models (LLMs), enabling autonomous agents that can...
The rise of large language models (LLMs) has sparked a surge of interest in agents, leading to the rapid growth of agent frameworks. Agent frameworks are softwa...
Recent advancements in large language models (LLMs) have been driven by their emergent reasoning capabilities, particularly through long chain-of-thought (CoT) ...
Understanding the internal thinking process of Large Language Models (LLMs) and the cause of hallucinations remains a key challenge. To this end, we introduce l...
The growth of the Internet of Things has enabled a new generation of applications, pushing computation and intelligence toward the network edge. This trend, how...
Detailed trace analysis of MPI applications is essential for performance engineering, but growing trace sizes and complex communication behaviour often render c...
This paper analyzes the integration of artificial intelligence (AI) with mixed integer linear programming (MILP) to address complex optimization challenges in a...
Automated test generation has become a key technique for ensuring software quality, particularly in modern API-based architectures. However, automatically gener...
Handling static images that lack inherent temporal dynamics remains a fundamental challenge for spiking neural networks (SNNs). In directly trained SNNs, static...
Symbolic Regression (SR) is a regression method that aims to discover mathematical expressions that describe the relationship between variables, and it is often...
Graph Neural Networks (GNNs) present a fundamental hardware challenge by fusing irregular, memory-bound graph traversals with regular, compute-intensive dense m...
Digital Twins (DTs) are increasingly used as autonomous decision-makers in complex socio-technical systems. Their mathematically optimal decisions often diverge...
Software plays an ever increasing role in complex system development and prototyping, and in recent years, MIT Lincoln Laboratory has sought to improve both the...
Relational data, occurring in the real world, are often structured as graphs, which provide the logical abstraction required to make analytical derivations simp...
Software supply chain attacks have revealed blind spots in existing SCA tools, which are often limited to a single ecosystem and assess either software artifact...
Advanced deep learning architectures, particularly recurrent neural networks (RNNs), have been widely applied in audio, bioacoustic, and biomedical signal analy...
This paper explores the integration of MPI-based synchronization techniques into distributed fuzzing frameworks, highlighting possible substantial performance i...
Fuzzing is a highly effective method for uncovering software vulnerabilities, but analyzing the resulting data typically requires substantial manual effort. Thi...
In many academic disciplines, software is created during the research process or for a research purpose. The crucial role of software for research is increasing...
Federated Learning is a popular approach for distributed learning due to its security and computational benefits. With the advent of powerful devices in the net...
Covid has made online teaching and learning acceptable and students, faculty, and industry professionals are all comfortable with this mode. This comfort can be...
We present Conformer-based decoders for the LibriBrain 2025 PNPL competition, targeting two foundational MEG tasks: Speech Detection and Phoneme Classification....
Many modern software projects evolve rapidly to incorporate new features and security patches. It is important for users to update their dependencies to safer v...
Serverless Large Language Models (LLMs) have emerged as a cost-effective solution for deploying AI services by enabling a 'pay-as-you-go' pricing model through ...
This paper introduces a new family of multi-parent recombination operators for Genetic Algorithms (GAs), based on normalized Pascal (binomial) coefficients. Unl...
In this paper we investigate a neural network model in which weights between computational nodes are modified according to a local learning rule. To determine w...
The Machine Consciousness Hypothesis states that consciousness is a substrate-free functional property of computational systems capable of second-order percepti...
Inference over large-scale foundation models within heterogeneous edge environments necessitates a fundamentally reconfigurable orchestration substrate. Static ...
Federated fine-tuning offers a promising solution for adapting Large Language Models (LLMs) to downstream tasks while safeguarding data privacy. However, its hi...
Microservices have transformed software architecture through the creation of modular and independent services. However, they introduce operational complexities ...
Quality-Diversity (QD) algorithms constitute a branch of optimization that is concerned with discovering a diverse and high-quality set of solutions to an optim...
As large language models (LLMs) scale out with tensor parallelism (TP) and pipeline parallelism (PP) and production stacks have aggressively optimized the data ...
Reasoning over dynamic visual content remains a central challenge for multimodal large language models. Recent thinking models generate explicit reasoning trace...