
Data Quality Assessment

Evaluate how well your data sources fit their purpose and meet compliance standards.

The core objective of Data Quality Assessment (DQA) is to determine if a selected data source is "fit for purpose" based on its temporal, geographical, and technological relevance.

Evaluating these three pillars is a fundamental requirement under ISO 14040 and ISO 14044. It ensures that your background data accurately aligns with your specific modeling choices, which is essential for generating compliant Life Cycle Assessments (LCAs) and Environmental Product Declarations (EPDs).

Earthster provides a built-in DQA framework for every process within your cycles. To enable faster and more scalable model development, Earthster automatically calculates DQA scores based on your data selections, while giving you full control to manually override these scores at any time using standard compliance rubrics.

The DQA Scoring System

Earthster evaluates three key indices for each process flow. Each index is scored on a scale from 1 to 5, following the Product Environmental Footprint Category Rules (PEFCR):

  • 1 - Very Good

  • 2 - Good

  • 3 - Fair

  • 4 - Poor

  • 5 - Very Poor

1. Temporal Correlation

This index measures how representative the added data is relative to the time period you are modeling. Earthster compares the date of your current cycle with the date of the consumed data source (the background process).

Note: By default, the date of your cycle is assumed to be its creation date. You can change this at any time in Cycle Settings -> Date.

| Score | Rating | Time Difference | Earthster Automatic Default |
|---|---|---|---|
| 1 | Very Good | 0 – 1 years | Automatically applied if the time difference is less than 1 year. |
| 2 | Good | 1 – 3 years | Automatically applied if the time difference is between 1 and 3 years. |
| 3 | Fair | 3 – 6 years | Automatically applied if the time difference is between 3 and 6 years. |
| 4 | Poor | 6 – 10 years | Automatically applied if the time difference is between 6 and 10 years. |
| 5 | Very Poor | > 10 years | Automatically applied if the data vintage exceeds 10 years. |
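The temporal bands can be sketched as a small lookup. This is an illustration only, not Earthster's implementation: the function name `temporal_score` is hypothetical, years are treated as whole numbers, the absolute difference is used, and a gap that falls exactly on a shared boundary (3 or 6 years) is assumed to land in the better band, since the rubric does not say which side those values belong to.

```python
def temporal_score(cycle_year: int, dataset_year: int) -> int:
    """Map the gap between cycle year and dataset year to a 1-5 score.

    Assumptions (not confirmed by the rubric): whole-year inputs, absolute
    difference, and shared boundaries (3, 6) resolved to the better score.
    """
    diff = abs(cycle_year - dataset_year)
    if diff < 1:      # "less than 1 year"
        return 1
    if diff <= 3:     # "between 1 and 3 years"
        return 2
    if diff <= 6:     # "between 3 and 6 years"
        return 3
    if diff <= 10:    # "between 6 and 10 years"
        return 4
    return 5          # "exceeds 10 years"


if __name__ == "__main__":
    for dataset in (2024, 2022, 2019, 2016, 2010):
        print(dataset, temporal_score(2024, dataset))
```

Under these assumptions, a 2024 cycle that consumes a dataset dated 2019 has a 5-year gap and falls in the "Fair" band.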

For background data, the dataset year is determined as follows:

  • ecoinvent
    The upper bound of the data range is used as the dataset year.
    Although this does not strictly represent the original reference year of the inventory, it best reflects the effective data vintage, especially since key system datasets (e.g., electricity mixes) are updated regularly in each release.

  • ILCD-based databases (e.g. PEF)
    The reference year provided in the dataset is used (typically corresponding to the lower bound of the time range).

  • JSON-LD databases (e.g. USLCI, CORRIM, BAFU)
    The valid_until field is used as the reference year (same rationale as for ecoinvent).

  • EXIOBASE and USEEIO
    The year of the latest available data represented in the database is used (currently 2022 for EXIOBASE and 2016 for USEEIO).
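The rules above amount to a per-database dispatch. The sketch below is a hypothetical illustration: the function `dataset_year`, the source labels, and the metadata keys `range_end` and `reference_year` are invented for the example. Only `valid_until` and the two fixed years come from the text, and `valid_until` is assumed to be either a year or a date string that starts with the year.

```python
# Fixed years stated in the text above ("currently" 2022 and 2016).
FIXED_YEARS = {"exiobase": 2022, "useeio": 2016}


def dataset_year(source: str, meta: dict) -> int:
    """Pick the year used for the temporal comparison (illustrative only)."""
    kind = source.lower()
    if kind in FIXED_YEARS:
        return FIXED_YEARS[kind]
    if kind == "ecoinvent":
        # Upper bound of the data range (hypothetical key name).
        return int(meta["range_end"])
    if kind == "ilcd":
        # Reference year given in the dataset (hypothetical key name).
        return int(meta["reference_year"])
    if kind == "json-ld":
        # Assumed to be a year or an ISO-style date such as "2020-12-31".
        return int(str(meta["valid_until"])[:4])
    raise ValueError(f"unknown source type: {source}")


if __name__ == "__main__":
    print(dataset_year("ecoinvent", {"range_start": 2015, "range_end": 2023}))
    print(dataset_year("json-ld", {"valid_until": "2020-12-31"}))
```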

2. Geographical Correlation

This index assesses how well the geography of the data source matches the geography of the parent process (bundle, custom process, or stage) it was added to.

| Score | Rating | Rubric Definition (When to choose) | Earthster Automatic Default |
|---|---|---|---|
| 1 | Very Good | Data from area under study | Both geographies are an exact match (e.g., both are "Portugal"). |
| 2 | Good | Average data from larger area in which the area under study is included | The process geography is nested inside the data geography (e.g., Process is "Portugal", Data is "Europe"). |
| 3 | Fair | Data from area with similar production conditions | Manual override only (e.g., Germany used as a proxy for France). |
| 4 | Poor | Data from area with slightly similar production conditions | For any other mismatched situation, or if the data source is "Rest of World" (RoW). |
| 5 | Very Poor | Data from an unknown or distinctly different area (e.g., North America instead of Middle East, OECD-Europe instead of Russia) | Manual override only. |
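The automatic defaults in this rubric only ever produce 1, 2, or 4. Scores 3 and 5 are reserved for manual overrides. A minimal sketch of that decision follows, assuming a toy containment map: `CONTAINED_IN` and `geographical_score` are hypothetical names, and checking "RoW" before the exact-match test is an assumption about precedence that the rubric does not spell out.

```python
# Toy containment map (hypothetical): region -> larger regions that include it.
CONTAINED_IN = {
    "Portugal": {"Europe"},
    "France": {"Europe"},
    "Germany": {"Europe"},
}


def geographical_score(process_geo: str, data_geo: str) -> int:
    """Return the automatic default (1, 2, or 4) for a geography pair."""
    if data_geo == "RoW":
        return 4  # "Rest of World" data always defaults to Poor (assumed precedence)
    if process_geo == data_geo:
        return 1  # exact match
    if data_geo in CONTAINED_IN.get(process_geo, set()):
        return 2  # process geography nested inside the data geography
    return 4      # any other mismatch


if __name__ == "__main__":
    print(geographical_score("Portugal", "Portugal"))
    print(geographical_score("Portugal", "Europe"))
    print(geographical_score("France", "Germany"))
```

In this sketch, Germany used as a proxy for France defaults to 4. Raising it to 3 is the kind of judgment the manual override exists for.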

3. Technological Correlation

This index evaluates how well the dataset approximates the actual technology or activity you intend to model.

Note: Background databases are high-quality, but your own cycles are considered a closer technological match. By default, data from your own cycles gets a score of 1 and everything else gets a score of 2.

| Score | Rating | Rubric Definition (When to choose) | Earthster Automatic Default |
|---|---|---|---|
| 1 | Very Good | Data from enterprises, processes and materials under study | Automated: If the data comes from your own custom cycles. |
| 2 | Good | Data from processes and materials under study (i.e. identical technology) but different enterprises | Automated: If the data comes from standard background databases (e.g., ecoinvent). |
| 3 | Fair | Data from processes and materials under study but from different technology | Manual override only. |
| 4 | Poor | Data on related processes or materials | Manual override only. |
| 5 | Very Poor | Data on related processes on laboratory scale or from different technology | Manual override only. |
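The automatic default for this index reduces to a single check on where the data comes from. The sketch below uses a hypothetical function name and flag, and is only meant to show that scores 3 to 5 never come from the automatic path.

```python
def technological_score(from_own_cycle: bool) -> int:
    """Automatic default: 1 for your own cycles, 2 for background databases.

    Scores 3-5 are never produced here; per the rubric they are set manually.
    """
    return 1 if from_own_cycle else 2


if __name__ == "__main__":
    print(technological_score(True))   # data from one of your own cycles
    print(technological_score(False))  # data from a background database
```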

Overriding Automatic Scores

Earthster provides these scores automatically to save you time, but you still need to check them and specify the correct scores for your specific case.

ISO 14040 defines data quality as fitness for purpose, and the assessment methodology is addressed in ISO 14040 and ISO 14044. A data quality assessment is required for EPDs, for example.

Specifying your own scores (overriding automatic scores)

  1. Open the cycle menu.

  2. Click Data quality.

  3. Click any of the scores to override it.

  4. Select the score you want to use.

  5. Click the Save changes button to apply the score.

  6. Overridden values are highlighted.

Checking and restoring the automatic scores

You can always compare the automatically calculated value with the score you have selected by clicking the current score.

Restoring the automatic scores

  1. In the Data quality section, click the score you want to restore.

  2. Click the Do not override button.

  3. Click the Save changes button to apply the change.
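The override and restore behavior described above can be modeled as an automatic value plus an optional manual value. This is a conceptual sketch, not Earthster's data model or API: the class `IndexScore` and its methods are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexScore:
    """One DQA index: the calculated default plus an optional manual override."""
    automatic: int
    override: Optional[int] = None

    def set_override(self, value: int) -> None:
        # Scores use the 1 (Very Good) to 5 (Very Poor) scale.
        if value not in range(1, 6):
            raise ValueError("DQA scores must be between 1 and 5")
        self.override = value

    def restore_automatic(self) -> None:
        # Equivalent in spirit to choosing "Do not override".
        self.override = None

    @property
    def effective(self) -> int:
        # The override wins whenever one is set.
        return self.automatic if self.override is None else self.override

    @property
    def is_overridden(self) -> bool:
        return self.override is not None


if __name__ == "__main__":
    geo = IndexScore(automatic=4)
    geo.set_override(3)          # e.g. Germany judged a fair proxy for France
    print(geo.effective, geo.is_overridden)
    geo.restore_automatic()
    print(geo.effective, geo.is_overridden)
```

Keeping the automatic value alongside the override is what makes it possible to compare the two and to restore the default later.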

Exporting your data quality assessment matrix

You can download your data quality assessment matrix by exporting the "LCI data" report.
