STATISTICS - Uni-variate Non-Graphical Exploratory Data Analysis (EDA)

Published: (December 25, 2025 at 11:56 PM EST)
Source: Dev.to

Meaning

  • Uni‑variate – only one variable is analyzed
  • Non‑Graphical – uses numbers and statistics, not plots
  • Exploratory – no assumptions; aims to discover patterns, anomalies, and summaries

📌 Example variables: exam marks, age, income, daily sales, temperature.

Objectives

  • Summarize the data numerically
  • Identify central tendency
  • Measure variability (dispersion)
  • Understand relative position of values
  • Detect outliers
  • Assess distribution shape
  • Check data quality

Techniques Used in Uni‑variate Non‑Graphical EDA

Measures of Central Tendency

Describe the typical or centre value.

  • Mean

    [ \bar{x}= \frac{\sum x}{n} ]

    Most common average; highly affected by outliers.

  • Median – middle value of ordered data; resistant to extreme values.

  • Mode – most frequent value; useful for discrete or categorical data.
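The three measures above can be computed with Python's standard `statistics` module. This is a minimal sketch, not a prescribed method: the numeric values are the ones used in the Example section of this article, while the `grades` list is a made-up illustration of a categorical variable.

```python
import statistics

# Values taken from the article's Example section
marks = [10, 12, 15, 18, 20, 25, 40]

mean_value = statistics.mean(marks)      # sum of values divided by count
median_value = statistics.median(marks)  # middle value of the sorted data

# multimode returns every value tied for the highest frequency.
# Each mark occurs once here, so the mode is not informative.
modes = statistics.multimode(marks)

# Hypothetical categorical data, where the mode is the natural summary
grades = ["A", "B", "B", "C", "B", "A"]
most_common_grade = statistics.mode(grades)

print(mean_value, median_value, modes, most_common_grade)
```

The mean (20) sits above the median (18) because the single large value 40 pulls it upward, which is the outlier sensitivity noted above.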

Measures of Dispersion

Describe how spread out the data is.

  • Range

    [ \text{Range}= \text{Max} - \text{Min} ]

  • Variance

    [ \sigma^{2}= \frac{\sum (x-\bar{x})^{2}}{n} ]

    Population form; the sample variance divides by n − 1 instead of n.

  • Standard Deviation

    [ \sigma = \sqrt{\sigma^{2}} ]

    Most widely used spread measure.

  • Inter‑quartile Range (IQR)

    [ \text{IQR}= Q_{3} - Q_{1} ]

    Spread of the middle 50 %; less affected by outliers.
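As a rough sketch, the four spread measures can be computed with the standard library (Python 3.8+ for `statistics.quantiles`). The data are the values from this article's Example section. Quartile conventions differ between tools; the code relies on the default `"exclusive"` method of `statistics.quantiles`, which is an assumption of this sketch rather than something the article specifies.

```python
import statistics

# Values taken from the article's Example section
data = [10, 12, 15, 18, 20, 25, 40]

value_range = max(data) - min(data)

# Population forms: divide by n, matching the variance formula above
pop_variance = statistics.pvariance(data)
pop_std = statistics.pstdev(data)

# Sample form: divides by n - 1 (shown only for comparison)
sample_variance = statistics.variance(data)

# Quartiles with the default "exclusive" method (an assumption of this sketch)
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(value_range, pop_variance, pop_std, sample_variance, iqr)
```

Other quartile conventions (for example linear interpolation between neighbouring values) can give a different IQR on a sample this small, so it helps to state which one was used.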

Measures of Position

Describe relative standing of values.

  • Percentiles (e.g., P10, P50, P90)
  • Quartiles (Q1, Q2, Q3)
  • Deciles (D1 to D9)

📌 Example: the 75th percentile is the value below which about 75 % of the data lie.
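To make the position measures concrete, here is a small sketch using `statistics.quantiles` (Python 3.8+). The `scores` list (the integers 1 to 100) is purely hypothetical data chosen so the percentages come out cleanly, and the helper `fraction_below` is an illustrative name, not a library function.

```python
import statistics

# Hypothetical data: 100 exam scores, 1 through 100
scores = list(range(1, 101))

quartiles = statistics.quantiles(scores, n=4)      # [Q1, Q2, Q3]
deciles = statistics.quantiles(scores, n=10)       # [D1, ..., D9]
percentiles = statistics.quantiles(scores, n=100)  # [P1, ..., P99]

def fraction_below(values, cutoff):
    """Share of observations strictly below the cutoff."""
    return sum(1 for v in values if v < cutoff) / len(values)

q3 = quartiles[2]
print(quartiles)                   # cut points for Q1, Q2, Q3
print(fraction_below(scores, q3))  # share of scores below Q3
```

With this data, exactly 75 of the 100 scores fall below Q3, which matches the reading of the 75th percentile given above. On small samples the share is only approximate.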

Measures of Distribution Shape

  • Skewness

    • Positive skew → right tail longer
    • Negative skew → left tail longer
    • Zero skew → symmetrical distribution
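  • Kurtosis – measures tail thickness relative to a normal distribution (often loosely described as peakedness)

    • Leptokurtic → heavy tails, often with a sharp peak (excess kurtosis > 0)
    • Mesokurtic → normal (Gaussian) tails (excess kurtosis = 0)
    • Platykurtic → light tails, often with a flat top (excess kurtosis < 0)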
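Skewness and kurtosis are not in Python's `statistics` module, so the sketch below computes them from central moments. It uses the simple moment-based (population) definitions, which is one common convention among several; statistical packages may apply small-sample corrections and report slightly different numbers. The function names are illustrative.

```python
def central_moment(values, k):
    """k-th central moment: average of (x - mean)^k, dividing by n."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Moment-based skewness: m3 / m2^1.5 (undefined for constant data)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Values taken from the article's Example section
data = [10, 12, 15, 18, 20, 25, 40]
print(skewness(data))         # positive: the right tail is longer
print(excess_kurtosis(data))

# A symmetric, evenly spread sample for contrast
symmetric = [1, 2, 3, 4, 5]
print(skewness(symmetric))         # zero skew
print(excess_kurtosis(symmetric))  # negative: lighter tails than normal
```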

Outlier Detection (Non‑Graphical)

IQR Method

[ \text{Lower limit}= Q_{1} - 1.5 \times \text{IQR} ]

[ \text{Upper limit}= Q_{3} + 1.5 \times \text{IQR} ]

Values outside these limits are considered outliers.
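The two limits translate directly into code. The sketch below is one possible implementation: `iqr_outliers` is an illustrative name, the `daily_sales` figures are hypothetical, and the quartiles come from the default `"exclusive"` method of `statistics.quantiles` (Python 3.8+). A different quartile convention can move the limits, especially on small samples.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return (lower_limit, upper_limit, outliers) using the k * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # default "exclusive" method
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers

# Hypothetical daily sales figures with one suspiciously large value
daily_sales = [10, 12, 15, 18, 20, 25, 40, 100]

lower, upper, outliers = iqr_outliers(daily_sales)
print(lower, upper, outliers)
```

Here Q1 = 12.75 and Q3 = 36.25, so the limits are −22.5 and 71.5 and only the value 100 is flagged.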

Z‑Score Method

[ z = \frac{x - \mu}{\sigma} ]

[ |z| > 3 \;\; \rightarrow \;\; \text{Potential outlier} ]
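A minimal sketch of the z-score rule follows. The `temperatures` readings are hypothetical, and the function name `z_score_outliers` is illustrative. It uses the population standard deviation to match the μ and σ in the formula above. Note that in very small samples no value can reach |z| > 3, so this rule needs a reasonable number of observations to be useful.

```python
import statistics

def z_score_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical: nothing can be an outlier
    return [x for x in values if abs((x - mu) / sigma) > threshold]

# Hypothetical temperature readings with one suspicious entry (60)
temperatures = [20, 21, 19, 22, 20, 21, 20, 19, 22, 21,
                20, 21, 19, 20, 22, 21, 20, 19, 21, 60]

print(z_score_outliers(temperatures))
```

The reading 60 has a z-score of roughly 4.3 and is the only value flagged. Because the outlier itself inflates both the mean and the standard deviation, the IQR method is often the more robust of the two.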

Data Quality Checks

Uni‑variate Non‑Graphical EDA helps detect:

  • Missing values
  • Invalid values (e.g., negative age)
  • Extreme or impossible values
  • Data entry errors
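These checks are mostly simple counts and filters. The sketch below uses a hypothetical list of ages in which `None` marks a missing entry; the plausible range of 0 to 120 years is an assumption chosen for illustration, and real limits depend on the variable being checked.

```python
# Hypothetical age data; None represents a missing value
ages = [25, 31, None, 42, -3, 250, 38, None, 29]

# Assumed plausible range for a human age (illustrative, not a standard)
MIN_AGE, MAX_AGE = 0, 120

missing_count = sum(1 for a in ages if a is None)
present = [a for a in ages if a is not None]

invalid = [a for a in present if a < MIN_AGE]      # e.g. negative age
implausible = [a for a in present if a > MAX_AGE]  # extreme or impossible
clean = [a for a in present if MIN_AGE <= a <= MAX_AGE]

print(f"missing: {missing_count}, invalid: {invalid}, "
      f"implausible: {implausible}, usable: {len(clean)}")
```

Flagged values are worth investigating rather than deleting automatically, since some turn out to be data entry errors that can be corrected.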

Advantages

  • Simple and fast
  • No visualization required
  • Works well for summaries
  • Ideal for exam and theory questions

Limitations

  • No visual insight
  • Cannot show trends
  • Less intuitive for large datasets

Example

Data: 10, 12, 15, 18, 20, 25, 40

  • Mean = 20
  • Median = 18
  • Range = 30
  • IQR = 13 (Q1 = 12, Q3 = 25)
  • Skewness = positive (long right tail)
  • Outlier = none by the IQR rule (upper limit = 25 + 1.5 × 13 = 44.5, so 40 is the most extreme value but still inside the limit)

Conclusion

Uni‑variate Non‑Graphical Exploratory Data Analysis is a numerical approach to understand a single variable by analyzing its centre, spread, position, shape, and quality—without using graphs. It serves as a foundational step before more advanced statistical analysis.
