[Paper] MetFuse: Figurative Fusion between Metonymy and Metaphor

Published: April 14, 2026 at 12:02 PM EDT
4 min read
Source: arXiv - 2604.12919v1

Overview

The paper “MetFuse: Figurative Fusion between Metonymy and Metaphor” tackles a surprisingly common linguistic phenomenon: sentences that blend two types of figurative language, metonymy and metaphor. Most NLP research treats the two separately. The authors instead build a unified framework that turns a literal sentence into three figurative versions (metonymic, metaphoric, and a hybrid of both) and release MetFuse, a human‑verified dataset of 1,000 meaning‑aligned quadruplets (4,000 sentences total). Their experiments show that augmenting training data with MetFuse improves metonymy and metaphor classifiers across eight existing benchmarks.

Key Contributions

  • Unified transformation framework that generates metonymic, metaphoric, and hybrid variants from a literal sentence.
  • MetFuse dataset: 1,000 human‑verified quadruplets (literal + metonymic + metaphoric + hybrid), the first resource dedicated to studying figurative fusion.
  • Empirical validation: Augmenting the training data of eight existing metonymy/metaphor benchmarks with MetFuse improves classification performance, especially for metonymy when hybrid examples are added.
  • Cross‑figurative analysis: Demonstrates that the presence of a metaphor makes a metonymic noun easier for both humans and large language models (LLMs) to detect.
  • Open‑source release: Dataset and code are publicly available, encouraging further research on multi‑figurative language understanding.
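To make the quadruplet structure concrete, here is a minimal Python sketch of how one meaning‑aligned quadruplet could be represented and flattened into labeled sentences. The class name, field names, and label strings are assumptions made for illustration and may not match the schema of the released dataset; the example sentences are invented here, not taken from MetFuse.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical label names; the released dataset may use a different schema.
VARIANTS = ("literal", "metonymic", "metaphoric", "hybrid")


@dataclass(frozen=True)
class Quadruplet:
    """One meaning-aligned group: a literal sentence plus three figurative variants."""
    literal: str
    metonymic: str
    metaphoric: str
    hybrid: str

    def labeled_sentences(self) -> List[Tuple[str, str]]:
        # Pair each sentence with the name of the variant it belongs to.
        return [(getattr(self, variant), variant) for variant in VARIANTS]


def flatten(quadruplets: List[Quadruplet]) -> List[Tuple[str, str]]:
    """Turn a list of quadruplets into a flat list of (sentence, label) pairs."""
    return [pair for quad in quadruplets for pair in quad.labeled_sentences()]


# Invented example sentences, for illustration only.
example = Quadruplet(
    literal="The monarchy announced new tax reforms.",
    metonymic="The crown announced new tax reforms.",
    metaphoric="The monarchy unleashed new tax reforms.",
    hybrid="The crown unleashed new tax reforms.",
)

flat = flatten([example] * 1000)
print(len(flat))  # 1,000 quadruplets yield 4,000 labeled sentences
```

Keeping the four sentences in one record preserves their meaning alignment, which a flat list of sentences would lose.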

Methodology

  1. Sentence Construction

    • Start with a literal sentence (e.g., “The monarchy announced new tax reforms”).
    • Apply a set of linguistic rules and crowd‑sourced rewrites to produce:
      • a metonymic version (where an associated entity stands for the intended referent, e.g., “The monarchy announced…” → “The crown announced…”),
      • a metaphoric version (where one concept is described in terms of another, e.g., “The monarchy announced…” → “The kingdom’s head announced…”), and
      • a hybrid version that combines both transformations.
  2. Human Verification

    • Each quadruplet is reviewed by multiple annotators to ensure the intended figurative meaning is preserved and aligned across the four sentences.
  3. Dataset Integration & Evaluation

    • The MetFuse quadruplets are mixed into the training sets of eight public metonymy/metaphor classification benchmarks.
    • Standard classifiers (BERT, RoBERTa, etc.) are fine‑tuned on the augmented data.
    • Performance is measured with accuracy/F1 and compared against baselines trained without MetFuse.
  4. Analysis of Figurative Interaction

    • Conduct probing experiments where models (and human annotators) are asked to label the figurative type of sentences that are either pure metonymy, pure metaphor, or hybrid.
    • Compare detection rates to quantify the “boost” effect of one figurative type on the other.
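Steps 3 and 4 can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the authors' code: the function names, the `(sentence, label)` tuple format, the condition names `"metonymy_only"` and `"hybrid"`, and the toy records are all hypothetical. Fine‑tuning itself is omitted.

```python
import random
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Example = Tuple[str, int]  # (sentence, binary label); an assumed format


def augment(benchmark_train: List[Example], extra: List[Example], seed: int = 0) -> List[Example]:
    """Step 3 (sketch): mix extra examples into a benchmark training set and shuffle."""
    mixed = list(benchmark_train) + list(extra)
    random.Random(seed).shuffle(mixed)  # seeded so the mix is reproducible
    return mixed


def detection_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Step 4 (sketch): fraction of correctly detected items per condition."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for condition, detected in records:
        totals[condition] += 1
        hits[condition] += int(detected)
    return {condition: hits[condition] / totals[condition] for condition in totals}


def boost(rates: Dict[str, float], base: str = "metonymy_only", other: str = "hybrid") -> float:
    """Difference in detection rate between the hybrid and single-figure conditions."""
    return rates[other] - rates[base]


# Toy records (condition, was the metonymy detected?), invented for illustration.
records = (
    [("hybrid", True)] * 3 + [("hybrid", False)]
    + [("metonymy_only", True)] * 2 + [("metonymy_only", False)] * 2
)
rates = detection_rates(records)
print(rates, boost(rates))
```

The same `detection_rates` function works for human annotations and model predictions, so the two can be compared on equal terms.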

Results & Findings

Task                                      Baseline (no MetFuse)   + MetFuse (Hybrid)   Gain (F1 points)
Metonymy classification (4 benchmarks)    78.2 % F1               82.7 % F1            +4.5
Metaphor classification (4 benchmarks)    81.5 % F1               84.1 % F1            +2.6

  • Hybrid examples deliver the biggest lift for metonymy tasks, which suggests that the metaphorical context clarifies the metonymic cue.
  • Human annotators identified metonymy correctly in 71 % of hybrid sentences vs. 58 % in metonymy‑only sentences.
  • LLMs (GPT‑4, Llama‑2) showed the same trend, with a 6‑point F1 improvement on hybrid inputs.
  • Error analysis revealed that most remaining mistakes involve rare proper nouns or domain‑specific jargon, suggesting that further lexical coverage could help.
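The gains reported above are differences in F1 points, not relative percentages. The short Python check below recomputes both from the reported scores; the helper name `gains` and the dictionary layout are assumptions for illustration.

```python
from typing import Dict, Tuple

# Reported average F1 scores: (baseline, with MetFuse hybrid examples).
reported: Dict[str, Tuple[float, float]] = {
    "metonymy": (78.2, 82.7),
    "metaphor": (81.5, 84.1),
}


def gains(baseline: float, augmented: float) -> Tuple[float, float]:
    """Return (absolute gain in F1 points, relative gain in percent), rounded to 0.1."""
    absolute = round(augmented - baseline, 1)
    relative = round(100 * (augmented - baseline) / baseline, 1)
    return absolute, relative


for task, (baseline, augmented) in reported.items():
    absolute, relative = gains(baseline, augmented)
    print(f"{task}: +{absolute} F1 points ({relative} % relative)")

# Human detection of metonymy: hybrid vs. metonymy-only sentences.
human_gap = 71 - 58  # percentage points
print(f"human detection gap: {human_gap} points")
```

The absolute gains match the reported +4.5 and +2.6. The relative improvements over the baselines come out slightly larger, at about 5.8 % and 3.2 %.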

Practical Implications

  • Better figurative language handling in downstream apps – chatbots, voice assistants, and content moderation tools can more reliably interpret statements like “The White House announced…” when a metaphor is also present.
  • Improved data augmentation pipelines – developers can automatically generate hybrid figurative variants to enrich training data for any task that benefits from nuanced meaning (e.g., sentiment analysis, intent detection).
  • Enhanced LLM prompting – prompting strategies that explicitly ask the model to consider both metonymic and metaphorical cues can yield more accurate explanations or paraphrases.
  • Cross‑domain transfer – the framework can be adapted to domain‑specific corpora (legal, medical) where metonymic shorthand (“the bench” for judges) often co‑occurs with metaphorical language, leading to more robust domain‑adapted models.
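As one way to picture the prompting idea, the sketch below builds a prompt that asks a model to check for metonymy and metaphor separately before classifying a sentence. The wording, the function name `build_figurative_prompt`, and the four answer labels are hypothetical and are not taken from the paper. The result is a plain string that could be passed to any LLM client.

```python
def build_figurative_prompt(sentence: str) -> str:
    """Build a prompt that asks for metonymic and metaphorical cues explicitly.

    The instructions and answer labels are illustrative assumptions.
    """
    return (
        "Analyze the sentence below for figurative language.\n"
        "1. Metonymy: does any noun stand in for a related entity "
        "(for example, a place standing for an institution)?\n"
        "2. Metaphor: is any concept described in terms of another?\n"
        "3. Classify the sentence as one of: literal, metonymic, metaphoric, hybrid.\n"
        "Explain each step briefly before giving the final label.\n\n"
        f"Sentence: {sentence}"
    )


prompt = build_figurative_prompt("The White House announced new sanctions.")
print(prompt)
```

Asking for the two cues in separate steps mirrors the finding that one figure of speech can make the other easier to spot. Whether this helps a given model would need to be measured.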

Limitations & Future Work

  • Scope of lexical items – MetFuse focuses mainly on nouns that are classic metonymic targets; extending to verbs and adjectives remains an open challenge.
  • Cultural and language diversity – The dataset is English‑centric; figurative fusion behaves differently in other languages and cultures, so multilingual extensions are needed.
  • Model size dependence – Gains were more pronounced for mid‑size transformers; very large LLMs already capture some figurative cues, reducing the marginal benefit.
  • Future directions proposed by the authors include:
    1. Scaling the framework to automatically generate larger corpora.
    2. Exploring joint multi‑task learning for metonymy, metaphor, and other figurative devices (irony, sarcasm).
    3. Integrating the dataset into evaluation suites for LLMs’ figurative reasoning abilities.

Authors

  • Saptarshi Ghosh
  • Tianyu Jiang

Paper Information

  • arXiv ID: 2604.12919v1
  • Categories: cs.CL
  • Published: April 14, 2026