Why Sustainability Metrics Can Hide Real Risks Instead of Revealing Them
The Illusion of Progress in Sustainability-Related Metrics
Sustainability metrics are constantly being recalculated, audited, and reassessed. Countries move up in SDG rankings, companies achieve stronger ESG ratings, and dashboards are increasingly filled with green indicators. On the surface, this suggests steady, measurable progress.
However, this is where our interpretations become skewed.
A higher sustainability score does not necessarily mean lower risk. In many cases, it simply means that weaknesses have been averaged out.
Composite sustainability metrics aim to simplify complex data into a single number.
That simplicity offers a more streamlined approach for communication, benchmarking, and comparison. However, it also creates a structural blind spot. When diverse indicators are compressed into a single score, trade-offs disappear, and critical risks can be diluted or even offset entirely.
Performance appears coherent, yet the underlying reality remains fragmented.
This article examines that distortion from two perspectives: first, national-level sustainability measurement using SDG data, and second, corporate-level ESG ratings and their limitations. Across both, the same pattern emerges. Aggregated scores do not just summarize performance. They change how we see risk.

The Measurement Gap
The SDGs are designed to track progress toward broad societal goals such as climate action, education, and inequality. ESG ratings, by contrast, assess how companies manage environmental, social, and governance risks within their operations. One operates at the level of public systems, the other at the level of corporate decision-making.
Despite these differences, both rely heavily on aggregation. Complex, multidimensional realities are translated into structured indicators, which are then combined into composite scores.
At this point, the measurement gap begins to emerge.
What is measured becomes clean, comparable, and communicable. Yet the reality behind those measurements can remain layered, uneven, and sometimes contradictory. The more the system simplifies, the more it risks hiding the very tensions it is supposed to reveal.
When SDG Progress Meets Climate Reality
Hungary provides a clear example of this tension. Its SDG score has gradually increased, reaching around 69 and placing the country in the middle of the European ranking. Additionally, on the dashboard, more goals appear “green,” suggesting that the country is moving in the right direction.

But this interpretation depends entirely on how the score is constructed.
The SDG Index is built through a two-stage averaging process. First, individual indicators are normalized and averaged within each goal. Then, all 17 goals are averaged again, each with equal importance, to produce a single final score. This structure creates a compensation logic. Strong performance in some areas can offset weak performance in others.
In practice, this means that progress in areas including economic development and social infrastructure can partially offset weaknesses in climate policy or environmental protection. The final score smooths these differences into a single, stable number.
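The compensation logic described above can be illustrated with a minimal sketch. The goal names and indicator values below are hypothetical and already normalized to a 0 to 100 scale; they are not taken from the SDG Index or from Hungary's data, and the real index covers 17 goals rather than three.

```python
from statistics import mean


def two_stage_score(goals):
    """Average indicators within each goal, then average the goals equally."""
    goal_scores = {name: mean(values) for name, values in goals.items()}
    overall = mean(list(goal_scores.values()))
    return overall, goal_scores


# Hypothetical, already-normalized indicator values (0 = worst, 100 = best).
example_goals = {
    "economic_growth": [85.0, 90.0],
    "infrastructure": [80.0, 84.0],
    "climate_action": [35.0, 25.0],
}

overall, per_goal = two_stage_score(example_goals)
print(f"overall: {overall:.1f}")
for name, score in per_goal.items():
    print(f"  {name}: {score:.1f}")
```

In this illustration the headline number sits in the mid-60s even though the climate goal scores 30, which is the kind of gap a single final score smooths over.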
The issue does not lie in the accuracy of the number, but rather in what it conceals.
When we look specifically at climate performance, a different picture emerges.
Hungary’s position in the Climate Change Performance Index (CCPI) places it among lower-performing countries, with particular criticism aimed at weak renewable energy targets, insufficient policy implementation, and continued support for environmentally harmful subsidies.
This creates a clear contradiction. The SDG framework signals moderate progress, while climate-focused analysis shows structural flaws. Both assessments are methodologically sound, yet they lead to very different interpretations.
What this reveals is not an error, but a limitation. The SDG score reflects distance to a defined set of targets across multiple domains. It does not capture the depth, urgency, or credibility of policy implementation in any single domain. As a result, climate risk can remain underrepresented within an otherwise stable overall score.
The Same Pattern in ESG Ratings
A similar dynamic can be observed at the corporate level through ESG ratings.
These ratings are designed to evaluate how companies manage environmental, social, and governance risks, often using formal frameworks and publicly available data.
In theory, this approach delivers a comprehensive view of corporate sustainability.
In practice, it frequently results in a systematic yet incomplete representation.
The collapse of Silicon Valley Bank (SVB) is a striking example of how ESG ratings can diverge from reality. Prior to its abrupt collapse in March 2023, SVB held an MSCI ESG Rating of "A". This rating positioned it as slightly above average in managing ESG risks compared to peers. On paper, its governance structures appeared sound in formal assessments, yet critical weaknesses remained, including limited risk management expertise at the board level.
Yet the company collapsed almost overnight.
This is where the distinction between “what” and “how” becomes critical. As highlighted by experts in the field, performance must be understood on two levels: what results a company delivers, and how those results are achieved. ESG ratings focus heavily on the “how.” But if the “what” collapses entirely, the relevance of that assessment comes into question.
Why did a company with a strong ESG profile fail so fundamentally?
The answer lies in the detail beneath the score. While the overall rating appeared stable, certain subcategories already signalled weaknesses. For example, gaps in risk expertise at the board level pointed to deeper governance issues. These signals existed, but they were diluted within the aggregate score because other areas performed strongly.
In other words, the rating did not ignore the risk. It averaged it out of the picture.
At the same time, ESG ratings are not designed to assess core financial risks such as liquidity mismatches or asset concentration. These fall outside their scope.
This highlights a central limitation. ESG ratings are built largely on disclosures, policies, and observable structures. They can assess whether a company has systems in place, but they cannot completely capture how decisions are actually made within those systems. They evaluate frameworks, not outcomes.
This reinforces a crucial distinction. ESG analysis is complementary to financial analysis, not a substitute for it. When the two are conflated, or when ESG ratings are interpreted as standalone indicators of overall performance, the risk of misjudgment increases significantly.
The Structural Problem: False Coherence
Across sustainability metrics, both the SDG and ESG frameworks exhibit the same structural blind spots.
Composite scores create the impression of coherence. A single number suggests alignment, stability, and overall performance. However, beneath that number, the reality is uneven.
Strengths and weaknesses coexist, but they are no longer equally visible.
Strong indicators compensate for weak ones, and the recognition of critical risks can be delayed simply because they are averaged into a larger system.
This creates what can be described as false coherence. The system appears consistent, while the underlying signals remain contradictory.
The danger is not in the existence of these metrics. It resides in how they are interpreted. When a score becomes the conclusion rather than the starting point, it encourages analytical complacency.
Why Methodology and Materiality Matter
These distortions are not accidental. They are the direct result of methodological design choices.
Averaging methods, weighting systems, and the balance between quantitative and qualitative inputs all shape what a score represents. Arithmetic averaging, for example, allows indicators to compensate for one another, making it easier for strong performance to offset weaknesses.
This is precisely why concepts such as materiality and industry-specific weighting have become increasingly important.
Materiality frameworks aim to ensure that the most relevant risks carry the greatest weight. Instead of treating all indicators equally, they prioritize what truly matters within a given industry or context. For example, climate risk will carry significantly more weight in energy-intensive industries than in service-based sectors.
This does not eliminate the problem entirely. But it reduces the likelihood that critical risks will be overshadowed by unrelated strengths. In this sense, materiality is not just a reporting requirement. It can be viewed as a corrective mechanism.
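The effect of materiality weighting can be sketched as a weighted average. The pillar names, scores, and weights below are hypothetical and chosen only for illustration; they do not reproduce any rating provider's methodology.

```python
def weighted_score(scores, weights):
    """Weighted average of pillar scores; weights need not sum to 1."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical pillar scores for an energy-intensive company (0 to 100).
pillar_scores = {"climate": 30.0, "labor": 85.0, "governance": 82.0}

# Equal weighting treats every pillar as equally relevant.
equal_weights = {"climate": 1.0, "labor": 1.0, "governance": 1.0}

# Assumed materiality weights that emphasize climate risk for this sector.
material_weights = {"climate": 0.6, "labor": 0.2, "governance": 0.2}

equal = weighted_score(pillar_scores, equal_weights)
material = weighted_score(pillar_scores, material_weights)
print(f"equal weights: {equal:.1f}")
print(f"materiality weights: {material:.1f}")
```

With these assumed weights the weak climate pillar pulls the score down much further than under equal weighting, so unrelated strengths hide less of the material risk.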
Additionally, alternative aggregation methods, such as geometric averaging, reduce the compensation effect of arithmetic averaging but are less intuitive and harder to communicate.
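A short comparison shows how the choice of average changes the result. The three scores are hypothetical, and the sketch relies on Python's standard-library `statistics.geometric_mean`, which requires positive inputs.

```python
from statistics import geometric_mean, mean

# Hypothetical indicator scores: two strong areas and one weak area.
scores = [90.0, 85.0, 20.0]

arithmetic = mean(scores)
geometric = geometric_mean(scores)

print(f"arithmetic mean: {arithmetic:.1f}")
print(f"geometric mean:  {geometric:.1f}")
```

The geometric mean is never higher than the arithmetic mean, and the gap grows as the scores become more uneven, so a single weak indicator is penalized more heavily.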
Similarly, the breadth of a framework affects its depth. SDGs prioritize comparability across a wide range of social goals, while ESG ratings prioritize scalability within companies and industries. Each choice introduces compromises between simplicity, accuracy, and usability.
Recognizing these trade-offs is essential. Otherwise, there is a risk that the score will be treated as a complete representation rather than a partial one.
Toward a More Credible Interpretation of Sustainability Metrics
The solution is not to reject SDG or ESG frameworks. Both offer valuable insights. The challenge is to use them with a more critical eye.
A high sustainability score should not be interpreted as confirmation of strong performance. Instead, it should trigger further examination. Which areas contribute most to the score? Which weaknesses are being offset? What risks are left unaddressed?
A more robust approach focuses less on the headline number and more on the structure behind it. It examines performance at the level of individual goals or pillars, identifies areas where compensation may be occurring, and integrates multiple perspectives rather than relying on a single framework. This shift in perspective alters the understanding of sustainability. Rather than focusing solely on whether a score is high or low, it is more important to consider whether it accurately reflects the major risks.
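One way to look past the headline number is a simple screening step, sketched below. The function name, the pillar scores, and the floor of 50 are all assumptions made for illustration, not part of any SDG or ESG methodology.

```python
from statistics import mean


def find_offset_weaknesses(pillar_scores, floor=50.0):
    """Return the headline average and the pillars scoring below the floor."""
    headline = mean(list(pillar_scores.values()))
    weak = sorted(name for name, score in pillar_scores.items() if score < floor)
    return headline, weak


# Hypothetical goal-level scores on a 0 to 100 scale.
pillars = {
    "economic_growth": 87.5,
    "infrastructure": 82.0,
    "climate_action": 30.0,
}

headline, weak_pillars = find_offset_weaknesses(pillars)
print(f"headline: {headline:.1f}")
print(f"below floor: {weak_pillars}")
```

A check like this does not replace judgment about which risks are material, but it makes areas where compensation may be occurring visible before the headline score is read as a conclusion.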
Conclusion
Sustainability metrics are powerful tools. They provide valuable insights, but only when their limitations are clearly understood.
They simplify, aggregate, and communicate. In doing so, they also shape perception.
A strong score can signal progress while masking vulnerability. A stable trend can hide underlying instability.
The central conclusion is clear, yet easily overlooked: a high sustainability score does not necessarily indicate strong sustainability performance across all aspects. Recognizing this distinction is not simply a technical detail. It is essential for informed decision-making in an environment where risk is increasingly complex, interconnected, and difficult to evaluate.

