[Paper] Gated KalmaNet: A Fading Memory Layer Through Test-Time Ridge Regression
As efficient alternatives to softmax Attention, linear state-space models (SSMs) achieve constant memory and linear compute, but maintain only a lossy, fading s...
4050 posts from this source
As efficient alternatives to softmax Attention, linear state-space models (SSMs) achieve constant memory and linear compute, but maintain only a lossy, fading s...
Large Language Models (LLMs) have proven efficient in giving definition-type answers to user input queries. While for humans giving various types of answers, su...
Evolutionary program synthesis systems such as AlphaEvolve, OpenEvolve, and ShinkaEvolve offer a new approach to AI-assisted mathematical discovery. These syste...
To meet strict Service-Level Objectives (SLOs),contemporary Large Language Models (LLMs) decouple the prefill and decoding stages and place them on separate GPU...
Agentic workflows have emerged as a powerful paradigm for solving complex, multi-stage tasks, but serving them at scale is computationally expensive given the m...
The scarcity of parallel speech corpora critically hampers speech-to-speech translation (S2ST), often forcing reliance on complex, multi-stage pipelines. This p...
Large Audio Language Models (LALMs) demonstrate impressive performance across diverse tasks, ranging from speech recognition to general audio understanding. How...
Traffic cameras are essential in urban areas, playing a crucial role in intelligent transportation systems. Multiple cameras at intersections enhance law enforc...
This empirical investigation elucidates the limitations of deterministic, unidimensional productivity heuristics by operationalizing the SPACE framework through...
Large language models (LLMs) are being increasingly adopted in the software engineering domain, yet the robustness of their grasp on core software design concep...
Quantum machine learning (QML) promises compact and expressive representations, but suffers from the measurement bottleneck - a narrow quantum-to-classical read...
The purpose of this article is to describe an adaptive decision-making support model aimed at improving the efficiency of engineering infrastructure reconstruct...
Machine learning models trained on real-world data may inadvertently make biased predictions that negatively impact marginalized communities. Reweighting is a m...
Training deep networks with noisy labels leads to poor generalization and degraded accuracy due to overfitting to label noise. Existing approaches for learning ...
Sparse Convolution (SpC) powers 3D point cloud networks widely used in autonomous driving and AR/VR. SpC builds a kernel map that stores mappings between input ...
'Train While You Fight' (TWYF) advocates for continuous learning that occurs during operations, not just before or after. This paper examines the technical requ...
Cloud-based storage platforms are becoming more common in both academic and business settings due to their flexible access to data and support for collaborative...
Existing C to Rust translation techniques fail to balance quality and scalability: transpilation-based approaches scale to large projects but produce code with ...
Microgrids are deployed to reduce purchased grid energy, limit exposure to volatile tariffs, and ensure service continuity during disturbances. This requires co...
The assignment of the pilot sequence is a critical challenge in massive MIMO systems, as sharing the same pilot sequence among multiple users causes interferenc...
Advanced Persistent Threats (APT) pose a major cybersecurity challenge due to their stealth, persistence, and adaptability. Traditional machine learning detecto...
Advanced Persistent Threats (APTs) pose a significant challenge in cybersecurity due to their stealthy and long-term nature. Modern supervised learning methods ...
Unit testing is an essential but resource-intensive step in software development, ensuring individual code units function correctly. This paper introduces Agone...
We describe a prototype of a fully capable Ethereum Proof-of-Work (PoW) blockchain network running on multiple Raspberry Pi (RPi) computers. The prototype is ea...
Building self-improving AI systems remains a fundamental challenge in the AI domain. We present NNGPT, an open-source framework that turns a large language mode...
The increasing availability of data and advancements in computational intelligence have accelerated the adoption of data-driven methods (DDMs) in product develo...
The rapid increase in LLM model sizes and the growing demand for long-context inference have made memory a critical bottleneck in GPU-accelerated serving system...
Parallel implementation of numerical adaptive mesh refinement (AMR)strategies for solving 3D elastostatic contact mechanics problems is an essential step toward...
Developing high-performance GPU kernels is critical for AI and scientific computing, but remains challenging due to its reliance on expert crafting and poor por...
Foundation models pre-trained with self-supervised learning (SSL) on large-scale datasets have become powerful general-purpose feature extractors. However, thei...
The Cognitive Buffer Hypothesis (CBH) posits that larger brains evolved to enhance survival in changing conditions. However, larger brains also carry higher ene...
Distributed storage systems typically maintain strong consistency between data nodes and metadata nodes by adopting ordered writes: 1) first installing data; 2)...
Asynchronous federated learning (FL) has recently gained attention for its enhanced efficiency and scalability, enabling local clients to send model updates to ...
Federated learning (FL) has been extensively studied as a privacy-preserving training paradigm. Recently, federated block coordinate descent scheme has become a...
In recent years, resource elasticity and cost optimization have become essential for RDBMSs. While cloud-native RDBMSs provide elastic computing resources via d...
Mobile agents have emerged as a powerful framework for solving fundamental graph problems in distributed settings in recent times. These agents, modelled as aut...
Version control relies on commit messages to convey the rationale for code changes, but these messages are often low quality and, more critically, inconsistent ...
Federated learning (FL) and split learning (SL) are two effective distributed learning paradigms in wireless networks, enabling collaborative model training acr...
Artificial intelligence-generated content (AIGC) service provisioning in wireless edge networks involves two phases: content generation on edge servers and cont...
Data-intensive scientific workflows increasingly rely on high-performance computing (HPC) systems, complementing traditional Grid and Cloud platforms. However, ...
Accelerator design languages (ADLs), high-level languages that compile to hardware units, help domain experts quickly design efficient application-specific hard...
Large language models (LLMs) and autonomous coding agents are increasingly used to generate software across a wide range of domains. Yet a core requirement rema...
LLM-based coding agents are increasingly common but still face challenges in context management, latency, reliability, reproducibility, and scalability. We pres...
AI-Integrated programming is emerging as a foundational paradigm for building intelligent systems with large language models (LLMs). Recent approaches such as M...
Recent advancements in large language models (LLMs) have shown very impressive capabilities in code generation across many programming languages. However, even ...
In complex systems with many compute nodes containing multiple CPUs that are coherent within each node, a key challenge is maintaining efficient and correct coh...
In recent years, machine learning and deep learning have driven advances in domains such as image classification, speech recognition, and anomaly detection by l...
In distributed computing a certification scheme consists of a set of states and conditions over those states that enable each node of a graph to efficiently ver...