Qlik Regex를 사용하여 지저분한 코멘트 필드 정리

발행: 2개월 전 (2026년 2월 17일 오전 02:06 GMT+9)

3 분 소요

원문: Dev.to

Source: Dev.to

Free‑text fields are both a blessing and a curse.
They give users flexibility, but from a reporting point of view they’re often messy, inconsistent, and hard to work with. In this case, a customer had a Comments field where users manually typed information, but inside that text was a specific code they needed for reporting.

The requirement was simple on paper:

Extract a code from a free‑text comments field.

In reality… not so simple.

The Problem

The source field looked something like this:

"Breakdown reported – REF12345 vehicle unavailable"
"Job completed ABC 67890 no further action"
"Awaiting parts – no reference provided"

The rules for the code were consistent:

3 uppercase letters
Optional space
5 digits

Examples of valid codes:

ABC12345
ABC 12345

The challenge:

The code could appear anywhere in the text.
Some rows didn’t contain a code at all.
We needed a clean output suitable for filtering and analysis.

Why Regular Expressions?

You could try to solve this with string functions like Mid(), Left(), or Index(), but that approach quickly falls apart when:

The position of the code varies.
The spacing isn’t consistent.
The text contains lots of noise.

Regular expressions (regex) are designed for exactly this kind of pattern matching.

The Solution

This is the expression I used:

Coalesce(
    Replace(
        ExtractRegEx(
            [Comments],
            '[A-Z]{3}\s?\d{5}',
            1
        ),
        ' ',
        ''
    ),
    'NOT FOUND'
)

Step 1: Extract the Code with `ExtractRegEx`

ExtractRegEx(
    [Comments],
    '[A-Z]{3}\s?\d{5}',
    1
)

Regex pattern

[A-Z]{3} → exactly 3 uppercase letters
\s? → optional space
\d{5} → exactly 5 digits

This tells Qlik: “Find me the first occurrence of three letters followed by five numbers, with or without a space.”
If no match is found, ExtractRegEx() returns NULL.

Step 2: Remove the Space (If It Exists)

Replace(…, ' ', '')

If the original text contained ABC 12345, this step converts it to ABC12345.
The output is always consistent, which is crucial for joins, filters, and comparisons later on.

Step 3: Handle Missing Codes with `Coalesce`

Coalesce(…, 'NOT FOUND')

When no matching code exists, ExtractRegEx() returns NULL. Wrapping everything in Coalesce() lets us replace those NULLs with something meaningful: NOT FOUND. This makes the result:

Easier to understand for users
Easier to filter in the app
Safer for downstream logic

Final Result

Comments Text	Extracted Code
Breakdown reported – REF12345 vehicle unavailable	REF12345
Job completed ABC 67890 no further action	ABC67890
Awaiting parts – no reference provided	NOT FOUND

Clean, consistent, and analysis‑ready.

Why This Matters

Using regex in Qlik allows you to:

Extract structured data from unstructured text
Reduce manual data cleansing
Build more reliable KPIs and dimensions
Handle real‑world, messy user input gracefully

If you work with comments, notes, descriptions, or reference fields, regex is one of the most powerful tools you can add to your Qlik toolbox.

Qlik Regex를 사용하여 지저분한 코멘트 필드 정리

The Problem

Why Regular Expressions?

The Solution

Step 1: Extract the Code with `ExtractRegEx`

Step 2: Remove the Space (If It Exists)

Step 3: Handle Missing Codes with `Coalesce`

Final Result

Why This Matters

관련 글

Show HN: Journey – 맞춤형 2D ECS 게임 엔진, Rust와 WGPU로 작성

파트 3: Testing, Documentation & Deployment 🚀

보이지 않는 레이어: HTTP 캐싱 마스터링 (Part 2)

파트 2: dbt 프로젝트 구조 및 모델 구축 📁

The Problem

Why Regular Expressions?

The Solution

Step 1: Extract the Code with ExtractRegEx

Step 2: Remove the Space (If It Exists)

Step 3: Handle Missing Codes with Coalesce

Final Result

Why This Matters

관련 글

Show HN: Journey – 맞춤형 2D ECS 게임 엔진, Rust와 WGPU로 작성

파트 3: Testing, Documentation & Deployment 🚀

보이지 않는 레이어: HTTP 캐싱 마스터링 (Part 2)

파트 2: dbt 프로젝트 구조 및 모델 구축 📁

Step 1: Extract the Code with `ExtractRegEx`

Step 2: Remove the Space (If It Exists)

Step 3: Handle Missing Codes with `Coalesce`