[Paper] Exploring Hidden Geographic Disparities in Android Apps
Source: arXiv - 2511.21151v1
Overview
The paper uncovers a hidden layer of geographic bias in Android apps: the same app can behave differently depending on where it’s downloaded. By systematically crawling Google Play from many countries, the authors identify “GeoTwins” – visually identical apps that differ in permissions, third‑party SDKs, and privacy notices – and show that even the core base.apk of an Android App Bundle can vary by region. These findings shake the assumption that an app’s code is uniform worldwide and raise security, fairness, and reproducibility concerns for developers, researchers, and regulators.
Key Contributions
- Definition and large‑scale detection of “GeoTwins”: 81,963 pairs of apps that share branding and functionality but are published under different package names across countries.
- Empirical evidence that base.apk files differ by region within the Android App Bundle model, contrary to the belief that the base module is identical worldwide.
- Quantitative analysis of permission, library, and privacy‑policy divergence among GeoTwins, highlighting systematic regional disparities.
- Demonstration of real‑world impact: the same app can be classified as benign in one country’s malware dataset and malicious in another’s, undermining reproducibility of security studies.
- Release of a curated dataset (GeoTwins list, region‑specific APK hashes, metadata) to enable follow‑up research.
Methodology
- Distributed Collection Pipeline – Deployed virtual devices in 12‑plus geographic locations (via VPNs and cloud providers) to query Google Play as a local user would.
- App Matching & Twin Detection – Grouped apps by visual similarity (icon, name, description) and then filtered for distinct package names and differing region‑specific download URLs.
- Static Analysis – Extracted permissions, embedded third‑party libraries, and privacy‑policy URLs from each APK. For App Bundles, the base.apk was unpacked and hashed per region.
- Cross‑Region Comparison – Computed Jaccard similarity scores for permission sets and library lists, and compared base.apk hashes across regions to spot divergent base modules.
- Validation – Randomly sampled twins for manual inspection to confirm functional equivalence and to rule out false positives (e.g., localized language packs).
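The Jaccard comparison in the cross‑region step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the region labels, and the sample permission sets are all assumptions made for the example.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A & B| / |A | B|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Hypothetical permission sets for two regional variants of one app.
perms_region_a = {
    "android.permission.INTERNET",
    "android.permission.ACCESS_NETWORK_STATE",
}
perms_region_b = {
    "android.permission.INTERNET",
    "android.permission.ACCESS_NETWORK_STATE",
    "android.permission.READ_SMS",
    "android.permission.ACCESS_FINE_LOCATION",
}

# 2 shared permissions out of 4 distinct ones gives 0.5.
score = jaccard(perms_region_a, perms_region_b)
print(f"permission similarity: {score:.2f}")
```

A score of 1.0 means the two variants request exactly the same permissions; lower scores indicate more divergence. The same function applies unchanged to sets of library names.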
Results & Findings
- GeoTwins are common: ~7 % of the 1.2 M apps collected formed GeoTwin pairs, with a median of 3 regional variants per app.
- Permission drift: 42 % of twins request at least one permission that the counterpart does not, often adding location or SMS permissions in specific markets.
- Library variance: Region‑specific SDKs (e.g., ad networks, analytics) appear in 35 % of twins, with some markets embedding more aggressive tracking libraries.
- Base.apk divergence: 18 % of App Bundles show different base.apk hashes across regions, indicating hidden code paths or feature toggles.
- Security classification flip: In a standard malware scanner benchmark, 12 % of twins switched from “clean” to “suspicious” when evaluated with the region‑specific variant.
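One way to detect base.apk divergence of this kind is to hash the file fetched from each region and group regions by digest. The sketch below assumes SHA‑256 and uses in‑memory byte strings as stand‑ins for the downloaded files; the helper names and region codes are hypothetical, and the paper's actual hashing setup may differ.

```python
import hashlib
from collections import defaultdict


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def group_by_hash(region_to_apk: dict[str, bytes]) -> dict[str, list[str]]:
    """Map each distinct digest to the regions that served that exact file."""
    groups: dict[str, list[str]] = defaultdict(list)
    for region in sorted(region_to_apk):
        groups[sha256_of(region_to_apk[region])].append(region)
    return dict(groups)


def is_divergent(region_to_apk: dict[str, bytes]) -> bool:
    """True when at least two regions received different bytes."""
    return len(group_by_hash(region_to_apk)) > 1


# Stand-in contents; in practice these would be read from the downloaded files.
samples = {"US": b"base-v1", "DE": b"base-v1", "IN": b"base-v1-regional"}

for digest, regions in group_by_hash(samples).items():
    print(digest[:12], regions)
```

A digest mismatch only shows that the bytes differ. Determining what changed (code, resources, or manifest) still requires unpacking and diffing the variants.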
Practical Implications
- Developers: Must audit regional builds for unintended permission creep or third‑party SDK inclusion. CI pipelines should incorporate multi‑region builds and automated diff checks.
- Security Researchers: Need to fetch apps from multiple locales to avoid biased datasets; reproducibility studies should disclose the region of collection.
- App Store Operators: Should surface region‑specific change logs and permission differences to users, improving transparency and consent.
- Enterprise Mobile Management (EMM) & MDM tools: Should treat each regional variant as a distinct asset when enforcing policies or scanning for vulnerabilities.
- Policy Makers: The findings provide evidence for regulations that require clear disclosure of region‑specific data collection practices.
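For the automated diff checks suggested for developers, a CI step could compare each regional build's permissions against a baseline build and flag sensitive additions. The following is a sketch under stated assumptions: the list of sensitive permissions, the choice of baseline region, and the function name are illustrative and do not come from the paper.

```python
# Assumed watch list; a real project would choose its own.
SENSITIVE = {
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.ACCESS_COARSE_LOCATION",
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.RECEIVE_SMS",
}


def check_variants(baseline_region: str, builds: dict[str, set[str]]) -> list[str]:
    """List sensitive permissions that a regional build adds over the baseline."""
    baseline = builds[baseline_region]
    findings = []
    for region in sorted(builds):
        if region == baseline_region:
            continue
        for perm in sorted((builds[region] - baseline) & SENSITIVE):
            findings.append(f"{region}: adds {perm}")
    return findings


# Hypothetical permissions extracted from three regional builds.
builds = {
    "US": {"android.permission.INTERNET"},
    "DE": {"android.permission.INTERNET"},
    "IN": {"android.permission.INTERNET", "android.permission.READ_SMS"},
}

findings = check_variants("US", builds)
for line in findings:
    print(line)
# A CI job could fail the build whenever `findings` is non-empty.
```

Comparing against a single baseline keeps the report short; a pairwise comparison across all regions would catch more cases at the cost of noisier output.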
Limitations & Future Work
- Geographic coverage: The study focused on a subset of countries (primarily North America, Europe, and a few Asian markets); additional regions may reveal further disparities.
- Dynamic behavior: Static analysis cannot capture runtime feature toggles that activate only under certain network or locale conditions.
- Causality: The paper does not determine whether differences stem from intentional market tailoring, legal compliance, or inadvertent developer error.
Future research could extend the pipeline to cover more locales, incorporate dynamic instrumentation to observe runtime differences, and explore the business motivations behind regional customizations.
Authors
- M. Alecci
- P. Jiménez
- J. Samhi
- T. Bissyandé
- J. Klein
Paper Information
- arXiv ID: 2511.21151v1
- Categories: cs.SE
- Published: November 26, 2025