Statistics - Measures of Position In Data Science

Published: (December 24, 2025 at 12:44 AM EST)
3 min read
Source: Dev.to

What Are Measures of Position?

Measures of position describe where a particular data value stands relative to the rest of the dataset.

  • Is this value high, low, or typical?
  • What proportion of data lies below (or above) a given value?
  • How extreme is a data point?

Unlike measures of central tendency (mean, median) or dispersion (variance, standard deviation), measures of position focus on relative standing.

Why Measures of Position Matter in Data Science

In data science, measures of position are crucial for:

  • Outlier detection (e.g., IQR method)
  • Feature scaling and normalization
  • Risk assessment (finance, insurance)
  • Model evaluation (percentile‑based metrics)
  • Fair comparisons across populations

Example: A test score of 85 means very different things depending on whether it is in the 60th percentile or the 95th percentile.
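The point about the same score landing at different percentiles can be sketched in a few lines. This is a minimal illustration, not a standard library routine: `percentile_rank` is a hypothetical helper that uses the "strictly below" convention, and both score lists are invented for the example.

```python
def percentile_rank(value, data):
    """Percentage of observations strictly below `value`."""
    if not data:
        raise ValueError("data must not be empty")
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


# Hypothetical score lists, made up for illustration only.
class_a = [60, 65, 70, 75, 80, 82, 85, 90, 95, 98]
class_b = [50, 55, 58, 60, 62, 65, 70, 72, 85, 88]

rank_a = percentile_rank(85, class_a)  # 6 of 10 scores are below 85
rank_b = percentile_rank(85, class_b)  # 8 of 10 scores are below 85
print(rank_a, rank_b)
```

The same score of 85 sits higher in the second group because more of that group's scores fall below it. Other conventions (for example counting ties as half) give slightly different ranks.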

Types of Measures of Position

  • Percentiles
  • Quartiles
  • Deciles
  • Z‑scores (Standard Scores)
  • Ranks

Percentiles

Definition
The p‑th percentile is the value below which p % of the data falls.

Properties

  • The percentile index p ranges from 0 to 100
  • Percentile values are not evenly spaced; the spacing depends on the data distribution

How to Compute Percentiles

Given ordered data of size n:

\[ \text{Position of } P_p = \frac{p}{100}\,(n+1) \]

If the position is not an integer, interpolate between the surrounding values.

Example

Data (ordered): 10, 20, 30, 40, 50

Find the 60th percentile:

\[ \text{Position of } P_{60} = \frac{60}{100}\,(5+1) = 3.6 \]

Position 3.6 lies between the 3rd value (30) and the 4th value (40), so interpolate 0.6 of the way between them:

\[ P_{60} = 30 + 0.6\,(40-30) = 36 \]

So \(P_{60} = 36\).

Interpretation

  • 60 % of the data ≤ 36
  • 40 % of the data ≥ 36
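The position-and-interpolate procedure above can be sketched as a small function. This is one possible implementation under the (n+1) position rule used in this section; the function name `percentile` and the choice to clamp out-of-range positions to the smallest or largest value are assumptions, and other libraries use different default rules.

```python
def percentile(sorted_data, p):
    """p-th percentile via position = p/100 * (n + 1), with linear interpolation.

    `sorted_data` must already be in ascending order.
    """
    n = len(sorted_data)
    if n == 0:
        raise ValueError("data must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")

    pos = p * (n + 1) / 100  # 1-based position in the ordered data

    # Assumption: positions outside 1..n are clamped to the end values.
    if pos <= 1:
        return sorted_data[0]
    if pos >= n:
        return sorted_data[-1]

    lower = int(pos)        # index (1-based) of the value just below
    fraction = pos - lower  # how far to move toward the next value
    lo_val = sorted_data[lower - 1]
    hi_val = sorted_data[lower]
    return lo_val + fraction * (hi_val - lo_val)


data = [10, 20, 30, 40, 50]
p60 = percentile(data, 60)  # approximately 36, matching the worked example
print(p60)
```

Because of floating-point rounding, compare the result against 36 with a small tolerance rather than exact equality.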

Quartiles

Quartiles divide data into four equal parts.

| Quartile | Meaning |
|----------|---------|
| Q1 | 25th percentile |
| Q2 | Median (50th percentile) |
| Q3 | 75th percentile |

Interquartile Range (IQR)

\[ \text{IQR} = Q_3 - Q_1 \]

  • Measures the spread of the middle 50 % of the data
  • Robust to outliers; heavily used in box plots and anomaly detection

Outlier Detection (IQR Rule)

Values below \(Q_1 - 1.5\cdot\text{IQR}\) or above \(Q_3 + 1.5\cdot\text{IQR}\) are considered outliers.
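As a rough sketch, the IQR rule can be applied with Python's standard `statistics.quantiles`, whose `"exclusive"` method uses the same (n+1) position rule as the percentile formula earlier. The helper name `iqr_outliers` and the sample readings are made up for illustration.

```python
from statistics import quantiles


def iqr_outliers(data, k=1.5):
    """Return (outliers, (lower_fence, upper_fence)) using the IQR rule."""
    # quantiles() sorts internally and returns the three quartile cut points.
    q1, _median, q3 = quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return outliers, (lower_fence, upper_fence)


# Hypothetical readings with one obviously extreme value.
readings = [10, 12, 13, 14, 15, 16, 18, 100]
outliers, fences = iqr_outliers(readings)
print(outliers, fences)
```

The multiplier 1.5 is the conventional choice from the rule above; a larger `k` flags only more extreme points.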

Deciles

Deciles split data into ten equal parts.

| Decile | Corresponding Percentile |
|--------|--------------------------|
| D1 | 10 % |
| D2 | 20 % |
| D9 | 90 % |

Applications

  • Income distribution analysis
  • Population studies
  • Risk stratification

Example: Top 10 % income earners are those above the 9th decile.
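A short sketch of the decile idea, again assuming `statistics.quantiles` with the `"exclusive"` method. The income values are placeholders (simply 1 through 19) chosen so the cut points are easy to check by hand.

```python
from statistics import quantiles

# Placeholder "incomes": the integers 1..19, for illustration only.
incomes = list(range(1, 20))

# n=10 returns the nine cut points D1..D9.
deciles = quantiles(incomes, n=10, method="exclusive")
d9 = deciles[8]  # 9th decile = 90th percentile

# Observations above D9 form the top group.
top_group = [x for x in incomes if x > d9]
print(deciles, top_group)
```

With only 19 observations the top group holds a single value, so "top 10 %" is only approximate on small samples.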

Z‑Scores (Standard Scores)

Definition

A Z‑score measures how many standard deviations a value is from the mean:

\[ Z = \frac{x - \mu}{\sigma} \]

where

  • \(x\) = observation
  • \(\mu\) = mean of the distribution
  • \(\sigma\) = standard deviation

Interpretation

  • Standardizes different scales, enabling comparison across datasets
  • Fundamental in machine‑learning preprocessing
  • Basis for normal‑distribution probabilities

Example

Score \(x = 85\), mean \(\mu = 70\), \(\sigma = 10\)

\[ Z = \frac{85 - 70}{10} = 1.5 \]

The score is 1.5 standard deviations above the mean.
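The calculation above translates directly into code. The sketch below assumes the population standard deviation (`statistics.pstdev`) when standardizing a whole list; the helper names `z_score` and `standardize` are illustrative.

```python
from statistics import mean, pstdev


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def standardize(values):
    """Z-score every value using the list's own mean and population std dev."""
    mu = mean(values)
    sigma = pstdev(values)
    return [z_score(v, mu, sigma) for v in values]


example = z_score(85, 70, 10)       # the worked example: 1.5
scaled = standardize([60, 70, 80])  # centred on 0, symmetric around it
print(example, scaled)
```

Use `statistics.stdev` instead of `pstdev` if the data is a sample and the sample standard deviation is wanted.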

Relationship Between Z‑Scores and Percentiles (normal distribution)

| Z | Approximate Percentile |
|----|------------------------|
| -2 | 2.5 % |
| -1 | 16 % |
| 0 | 50 % |
| 1 | 84 % |
| 2 | 97.5 % |

This connection is vital for hypothesis testing, probability estimation, and statistical modelling.
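For a normal distribution, the mapping from Z-score to percentile is the cumulative distribution function. A minimal sketch using the standard library's `statistics.NormalDist` (Python 3.8+); the helper name `z_to_percentile` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def z_to_percentile(z):
    """Percentile of z under a standard normal distribution."""
    return 100 * standard_normal.cdf(z)


# Rounded to one decimal place for display.
table = {z: round(z_to_percentile(z), 1) for z in (-2, -1, 0, 1, 2)}
print(table)
```

The computed values at Z = ±2 are closer to 2.3 % and 97.7 %; the rounded figures 2.5 % and 97.5 % correspond more precisely to Z ≈ ±1.96. The mapping only holds when the data is approximately normal.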

Ranks

Definition
A rank assigns an ordinal position to each observation.

Example

  • Highest score → Rank 1
  • Next highest → Rank 2

Types of Ranking

  • Dense ranking (1, 2, 2, 3)
  • Competition ranking (1, 2, 2, 4)
  • Fractional ranking (average rank for ties, e.g., 2.5)
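The three tie-handling schemes can be sketched in plain Python as follows. The function names are illustrative, higher scores are assumed to rank first, and the simple counting approach is quadratic in the number of scores, which is fine for small lists.

```python
def dense_rank(scores):
    """Tied scores share a rank; the next rank is not skipped."""
    distinct = sorted(set(scores), reverse=True)
    position = {value: i + 1 for i, value in enumerate(distinct)}
    return [position[s] for s in scores]


def competition_rank(scores):
    """Tied scores share a rank; following ranks are skipped."""
    # Rank = 1 + number of strictly higher scores.
    return [1 + sum(other > s for other in scores) for s in scores]


def fractional_rank(scores):
    """Tied scores get the average of the ranks they would occupy."""
    ranks = []
    for s in scores:
        higher = sum(other > s for other in scores)
        ties = sum(other == s for other in scores)  # includes s itself
        ranks.append(higher + (ties + 1) / 2)
    return ranks


scores = [90, 85, 85, 70]  # made-up scores with one tie
print(dense_rank(scores))
print(competition_rank(scores))
print(fractional_rank(scores))
```

For these scores the two tied values occupy positions 2 and 3, so fractional ranking assigns each of them 2.5.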

Limitations

  • Ignores magnitude differences
  • Not suitable for distance‑based models

Measures of Position vs. Measures of Central Tendency

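| Aspect | Central Tendency (Typical Value) | Position (Relative Standing) |
|--------|----------------------------------|------------------------------|
| Focus | Typical value | Relative standing |
| Examples | Mean, Median | Percentiles, Z‑scores |
| Outliers | Sensitive (mean) | Often robust |
| Use in ML | Baseline | Feature scaling, anomaly detection |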

Real‑World Data Science Applications

  • Machine Learning

    • Feature normalization using Z‑scores
    • Quantile transformation
  • Finance

    • Value‑at‑Risk (VaR) – percentile‑based
    • Risk classification using deciles
  • Healthcare

    • Growth percentiles (BMI‑for‑age)
    • Lab result interpretation
  • Education

    • Standardized test scores
    • Admission cut‑offs

Summary Table

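| Measure | Purpose | Robust to Outliers |
|---------|---------|--------------------|
| Percentile | Relative position | Yes |
| Quartile | Spread & position | Yes |
| Decile | Distribution segmentation | Yes |
| Z‑score | Standardized distance | No |
| Rank | Order comparison | Yes |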
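The "Robust to Outliers" column can be checked with a small experiment: replace one value with an extreme one and see which measures move. The two lists are invented for illustration, and the Z-score here uses the population standard deviation.

```python
from statistics import mean, median, pstdev

clean = [10, 20, 30, 40, 50]
contaminated = [10, 20, 30, 40, 5000]  # largest value replaced by an outlier


def z_score(x, values):
    """Z-score of x relative to the mean and population std dev of values."""
    return (x - mean(values)) / pstdev(values)


# The median (50th percentile) depends only on order, so it does not move.
median_clean = median(clean)
median_contaminated = median(contaminated)

# The Z-score of 40 depends on the mean and std dev, so it shifts a lot.
z_clean = z_score(40, clean)
z_contaminated = z_score(40, contaminated)

print(median_clean, median_contaminated)
print(z_clean, z_contaminated)
```

In this example the value 40 goes from above the mean to below it once the outlier is introduced, while its rank and the median stay the same.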

Key Takeaways

  • Measures of position explain where a value lies, not just what it is.
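  • Percentiles and quartiles are distribution‑free. Z‑scores can be computed for any distribution, but converting them to percentiles assumes normality; in return they allow comparisons across different scales.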
  • In data science, they are foundational for scaling, anomaly detection, and interpretation.