[Paper] CollabSim: Investigating the Collaboration Capabilities of LLM Agents with CSCW-Based, Controlled Multi-Agent Experiments
Multi-agent systems (MAS) built on large language models have shown growing promise, with their effectiveness resting on agents' ability to coordinate through t...
Current evaluation practices in relational learning rely heavily on flat leaderboards that average performance across heterogeneous datasets, implicitly assumin...
Proper scoring rules provide a rigorous theoretical basis for the training and evaluation of probabilistic forecasts. However, in the presence of right censorin...
Indoor scene generation is crucial for robot simulation and modern interior design. However, complex layouts together with scarce 3D scene data make learning-ba...
Recent advances in LLM agents have enabled complex cognitive capabilities such as multi-step reasoning, planning, and tool use, which are increasingly ...
This paper presents the issues arising in implementing a fast integer division algorithm on general purpose GPUs. The algorithm uses a Newton iteration based on...
We analyze the discrete incremental voting process (DIV) introduced by Cooper, Radzik, and Shiraga [OPODIS '23]. In this process, we consider a set V of n nodes...
The question of whether artificial systems can be conscious remains open, in part because existing approaches either evaluate systems against theory-derived che...
Medical vision-language models (VLMs) have shown increasing potential for clinical image interpretation, including lesion detection and report generation. Howev...
Learning-based Scene Graph Generation (SGG) models excel on frequent relation types but degrade sharply under annotation sparsity, and reliable ...
Urban green-space extraction from ultra-high-resolution (UHR) imagery is commonly performed patch by patch, which limits semantic reuse among spatially separate...
Image-to-Video diffusion models use an input image to generate visually stunning content, but they often produce motion that violates the laws of physics. We …
In this study, UAV multispectral imagery is used to segment the severity of bacterial leaf blight (BLB) in rice using convolutional neural networks (CNNs) and t...
Reliable rubric grading requires more than accurate score prediction. Each judgement must be grounded in the mark scheme and evidence from the student answer. E...
Several of the world's languages are still under-resourced in terms of Natural Language Processing (NLP) tools. This is mostly due to the lack of high-quality d...
Video question answering (VideoQA) aims to answer questions about a given video. Existing approaches excel at factual VideoQA, but in-depth video…
Estimating local mean curvature at each point of a high-dimensional dataset is a key ingredient of geometry-aware machine learning algorithms, such as the Mean ...
LLM-based agents increasingly rely on harnesses that provide execution environments, tool interfaces, context, lifecycle orchestration, observability, verificat...
Machine unlearning aims to remove targeted knowledge from a trained model while preserving its general capabilities. For autoregressive language models, not all...
Video generation models based on Diffusion Transformers (DiTs) have achieved remarkable performance in video synthesis but suffer from high inference latency.
Factual sycophancy occurs when a language model abandons a correct, verifiable answer under social pressure. Because a flip occurs only when pressure toward a f...
Multi-turn Large Language Model (LLM) serving is critical for consistent user experiences, yet the linear growth of the Key-Value (KV) cache imposes significant...
Agentic AI is increasingly being integrated into software engineering workflows. In crowdsourced testing, however, the large volume and uneven quality of submit...
Temporal Grounding (TG) aims to localize the video segments that correspond to a text query. Prior work has focused mainly on single-segment retrieval. In practice…
Robotic manipulation of textiles remains a challenging task because continuous deformation and self-occlusions hinder the robust visual perception needed for estimation.
Blind image restoration requires recovering clean images from observations corrupted by unknown and potentially mixed degradations. While recent deterministic f...
Student engagement is critical for effective learning in software modelling, yet fostering motivation and inclusivity remains a challenge. While existing resear...
Point clouds are a fundamental sensory representation for robot perception, underpinning LiDAR-based autonomous driving, simultaneous localization and mapping (SLAM), and more.
Transformer-based multimodal models rely on attention mechanisms to integrate information across heterogeneous modalities. Despite their success, existing multi...
Institutional documents contain substantial amounts of operational and analytical information embedded within figures and tables. Current approaches for extract...
Correctness and readability are key measures of code quality, respectively ensuring functional fidelity and ease of comprehension. While most existing research ...
Finding manifold structures in noisy and high-dimensional point clouds is a challenging but important problem. In astronomical observation survey and simulation...
Spiking neural networks (SNNs) have the potential to emerge as the third generation of neural networks and have attracted increasing attention across a wide ran...
TLA+ is a formal specification language for verifying distributed systems and safety-critical protocols. Large language models (LLMs) frequently produce TLA+ sp...
When porting high-performance computing (HPC) code from CPU to GPU, CPU-oriented optimizations may obstruct LLM-based CUDA translation. We design and evaluate a...
Multiple machine learning models can achieve near-equivalent predictive performance on the same task, yet provide divergent feature-based explanations. This is ...
As robotic systems become more sophisticated, the growing complexity of their motion planning models and the longer training times pose substantial challenges. ...
Noisy evolution strategies under fixed evaluation budgets face a depth-fidelity trade-off: spending evaluations to denoise intra-generation rankings reduces the...
Uncertainty quantification in neural network prediction is a major issue for common applications. Our approach seeks to reduce computation costs by directly ev...
NVSHMEM is NVIDIA's OpenSHMEM-based PGAS communication library for GPU clusters, enabling GPU-initiated, one-sided communication through symmetric memory. Despi...
With the rapid growth of interactive applications in large language model (LLM) online services, maintaining high system throughput while ensuring user-perceive...
Existing code-generation benchmarks score a single mapping from a complete prompt to a one-shot output. However, real web development is different. Users seldom...
This paper provides and analyzes a dataset detailing the characteristics and execution data of all jobs submitted to the IN2P3 Computing Center (Villeurbanne, F...
TLA+ has supported industrial verification at companies such as Amazon and Microsoft, yet writing correct TLA+ specifications from natural language still requir...
AI is changing how software engineers work, but it often comes with hidden burdens and costs. In this paper, we characterize two such often-overlooked burdens: ...
Large language models and AI coding agents have reshaped software development, but the path to fully AI-native systems faces structural challenges. Chief among ...
Turning a promising economic idea into a credible empirical finding is, in practice, an expensive undertaking: it demands a great deal of specialised computatio...
Forward-Forward (FF) learning [Hinton, 2022] replaces backpropagation with strictly layer-local goodness updates. Recent FF-CNN work has narrowed the gap to BP ...