Data type, Data quality & Granularity
Data type, data quality and granularity are concepts that help categorize the accuracy, reliability and specificity of data. Starting with data type, there are three main data types used to calculate emissions.
1. Default Data
- What it is: Pre-set, average values based on industry standards or regional estimates. This data doesn't account for specific variables but provides a baseline estimate for emissions.
- Example: Using an average emission factor, for instance from the GLEC Framework (e.g., 0.15 kg CO₂ per km for a diesel truck), regardless of the actual truck being used. This type of data is useful when specific information isn't available.
2. Modeled Data
- What it is: Data derived by applying formulas or models that estimate emissions based on general operational parameters. It’s more specific than default data but still relies on estimates.
- Example: Estimating the emissions of a delivery truck by considering its fuel consumption, route distance, and load factor (how full the truck is). For instance, you could model emissions based on a truck using 10 liters of fuel over 100 km at 50% load capacity.
3. Primary Data
- What it is: Actual, measured data from specific operations or vehicles. This is the most accurate data type because it reflects real-world usage.
- Example: Tracking the actual fuel consumption of a specific truck from a fuel meter during a particular trip (e.g., 9.8 liters of diesel consumed on a trip from London to Manchester). This data reflects real emissions and usage conditions.
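The three data types differ mainly in which inputs feed the calculation. The following is a minimal sketch, not the platform's actual implementation: the function names are hypothetical, 0.15 kg CO₂ per km is only the illustrative figure from the example above, and the per-litre emission factor is deliberately left for the caller to supply because this article does not state one.

```python
def default_emissions_kg(distance_km, factor_kg_per_km=0.15):
    """Default data: distance times an average per-km factor.

    0.15 is the illustrative value from the text, not an official factor.
    """
    return distance_km * factor_kg_per_km


def modeled_fuel_litres(distance_km, litres_per_100km):
    """Modeled data: estimate fuel use from distance and average consumption."""
    return distance_km * litres_per_100km / 100


def fuel_emissions_kg(litres, factor_kg_per_litre):
    """Convert fuel to emissions; the factor must come from your own source.

    Pass modeled litres for modeled data, or metered litres for primary data.
    """
    return litres * factor_kg_per_litre


trip_km = 100
print(default_emissions_kg(trip_km))      # default data: distance only
print(modeled_fuel_litres(trip_km, 10))   # modeled fuel use in litres
# Primary data would pass the metered litres (e.g. 9.8) to fuel_emissions_kg.
```

The only difference between the modeled and primary paths in this sketch is where the litres come from: an estimate based on average consumption, or a reading from a fuel meter.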
Automatic classification of data types
By default, we assign a data type based on the input provided. Each calculation type is assigned a data type as follows:
Calculation Type | Data type |
---|---|
Fuel quantity | Primary |
Distance & average consumption | Modelled |
CO2e per shipment | Modelled |
CPI | Modelled |
Intensity | Default |
However, you may have received data from a subcontractor that was calculated using primary data, while the automatic classification marks it as modelled. You can override this by entering "primary" in the column data_type within the 'emission asset' sheet.
Note: Distance & average consumption can only be classified as "primary" when both the actual distance and measured fuel consumption are used (e.g. the average measured fuel consumption of the fleet over a year).
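The default assignment and the manual override can be pictured as a lookup with an optional replacement value. This is a hedged sketch only: `resolve_data_type` and `DEFAULT_DATA_TYPE` are hypothetical names, and the way the override is normalized (trimmed and capitalized) is an assumption rather than documented behavior.

```python
# Default data type per calculation type, copied from the table above.
DEFAULT_DATA_TYPE = {
    "Fuel quantity": "Primary",
    "Distance & average consumption": "Modelled",
    "CO2e per shipment": "Modelled",
    "CPI": "Modelled",
    "Intensity": "Default",
}


def resolve_data_type(calculation_type, data_type_override=None):
    """Return the override from the data_type column if given, else the default."""
    if data_type_override and data_type_override.strip():
        # Assumption: the override is case-insensitive, e.g. "primary".
        return data_type_override.strip().capitalize()
    return DEFAULT_DATA_TYPE[calculation_type]


print(resolve_data_type("CO2e per shipment"))
print(resolve_data_type("CO2e per shipment", "primary"))
```

The sketch does not enforce the rule in the note above; whether an override to "primary" is justified remains a judgment about how the underlying numbers were obtained.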
Granularity and Data quality
Based on the data type and the granularity level of your data, we are able to assign a data quality level. The granularity level is categorized based on the time period over which emissions are allocated, with daily being the most granular level and yearly the least granular. If fuel quantity is used, we can determine this granularity level automatically. It is based on the date difference across all consignments that are linked to one emission asset.
For other calculation types, this cannot be determined automatically and should be entered within the 'emission asset' sheet using the column data_granularity. Accepted values are Daily, Weekly, Monthly, Quarterly and Yearly.
Knowing which of these values to enter requires some knowledge of the numbers used. Let's take distance & average consumption as an example. If the KM/L number is based on the average consumption of a fleet over the course of a month, the granularity would be "Monthly". If it is read out of a board computer, it would be "Daily". If it is the average of a fleet over a year's time, it would be "Yearly".
Note: A default Intensity calculation is always Bronze.
Timeframe | Granularity | Data quality |
---|---|---|
1 day | Daily | Gold+ |
2-7 days | Weekly | Gold |
8 days - 31 days | Monthly | Gold |
32 - 93 days | Quarterly | Gold |
> 93 days | Yearly | Silver |
N/a | Default | Bronze |
So if one emission asset (BigTruck) is allocated over these 2 consignments, transported on 2 different dates:
Consignment | Date | asset_id_1 |
---|---|---|
1 | 1-April-2024 | BigTruck |
2 | 1-June-2024 | BigTruck |
The date range from 1-April-2024 to 1-June-2024 is 61 days. This means the emission is assigned a Quarterly granularity and a Gold data quality.
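The BigTruck example can be reproduced with a short sketch that takes the consignment dates linked to one emission asset, computes the date difference, and looks up the matching row in the table above. The function name `classify_granularity` is hypothetical, and treating a span of 0 days (all consignments on the same date) as Daily is an assumption, since the table starts at "1 day".

```python
from datetime import date

# (upper bound in days, granularity, data quality), taken from the table above.
BANDS = [
    (1, "Daily", "Gold+"),
    (7, "Weekly", "Gold"),
    (31, "Monthly", "Gold"),
    (93, "Quarterly", "Gold"),
]


def classify_granularity(consignment_dates):
    """Return (span in days, granularity, data quality) for one emission asset."""
    span_days = (max(consignment_dates) - min(consignment_dates)).days
    for upper_bound, granularity, quality in BANDS:
        if span_days <= upper_bound:
            return span_days, granularity, quality
    # Anything longer than 93 days falls in the Yearly band.
    return span_days, "Yearly", "Silver"


big_truck_dates = [date(2024, 4, 1), date(2024, 6, 1)]
print(classify_granularity(big_truck_dates))
```

This only covers the automatic case for fuel quantity. For other calculation types the granularity comes from the data_granularity column instead of the consignment dates.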
Use of data quality in practice
- The more granular a calculation, the more can be said about the operational performance of consignments and trips.
- Lean & Green requires reporting of data quality, and in some cases a minimum data quality level is required to meet the requirements for achieving a “star”.