Land Cover Cloud Analytics: from Global Services to Regional Insights

Strobl, J.1,2* and Nazarkulova, A.1

1 Department of Geoinformatics, University of Salzburg, Schillerstraße 30, 5020 Salzburg, Austria


2 Austrian Academy of Sciences, Commission for Geographic Information Science, Austria

*Corresponding Author



Land cover (LC) analyses and quantitative, multi-thematic balances of land use and land cover are well-established steps when identifying biogeoclimatic zones, estimating the potential for human uses or habitat suitability, exploring climate change impacts over time, or examining the extent and co-location of specific categories like desert, forest or glaciers with determining factors in topography, climate or human impacts. While there are innumerable examples of land cover analysis at local scales covering catchments, smaller administrative districts or planning regions, a ‘big picture’ approach exploring national to global scales was typically constrained by the lack of easily accessible global data sets at high spatial resolution, and by a computational load hardly manageable on personal workstations. The recent availability of a variety of land cover services based on full and regular remote sensing coverage with automatic extraction of LC through deep learning approaches, in combination with geospatial cloud computing facilities, enables researchers to work at native (sensor) resolution without the hassle of data download, preparation and local computational loads, as first implemented in Google Earth Engine. This paper supports this point by demonstrating LC analysis against topographic variables for the entire country of Kyrgyzstan. Insights of this kind will lead to a better understanding of spatiotemporal LC dynamics and inform policy decisions from national to global levels.

Keywords: Land Cover, Global Data Sets, Spatial Analysis, Monitoring Climate Change, Land Cover Change

1. Introduction

Land cover serves as an important constraint for all kinds of land use and human activities, and as an indicator for natural as well as societal dynamics on Earth’s surface. Mapping and monitoring land cover (LC) is a key objective of space-based earth observation (EO). Since the beginning of regular global satellite imagery coverage with the Landsat platforms, huge repositories of EO data have been built up, recently complemented by a range of other platforms, e.g. ESA’s Sentinel family.

Many of these EO archives now are available under open licenses, facilitating the development of derivative data products like global land cover as presented by Tsendbazar et al., in [1] and Mora et al., in [2]. Original imagery as well as data products have enormous storage requirements and are subject to continuous expansion, enhancement and updating. Cloud environments therefore are the only feasible architectures where land cover archives and data products can be maintained and made readily accessible for users worldwide.

Land cover, as just one example of EO based products, is being widely used for mapping and establishing context for spatial planning [3], environmental monitoring, and land use decisions. In addition, it serves as an integrated indicator for various spatial processes, and as such is of interest whenever a deeper assessment and understanding of regional geographies is required. Spatial analysis aims at generating information in support of decision making, and this information created from a ‘densification’ of data is the main intended output from geospatial methods and tools. This study aims at demonstrating the power of cloud based spatial analysis, leveraging the online availability of EO based land cover data products. It uses the ‘World Land Cover 30m BaseVue’ data set originally created and published by MDA, and available inter alia through Esri’s Living Atlas cloud infrastructure.

Land cover (LC) not only is an integrative indicator for factors like climate, substrate and the interaction of ecological processes, but also is well suited to monitoring change induced by global (climate) processes [4] as well as regional human activities. Land cover also is a factor determining the local potential for human habitation and economic activities, including its role in complex biogeoclimatic feedback cycles. While in extreme cases LC is essentially ‘nil’ (considering barren rock or soil, or the special case of water, snow and ice), in most cases it requires a scale-dependent and systems-oriented definition. This is particularly true for multi-level vegetation covers as well as seasonal agricultural patterns. As in most mapping tasks, the schema or classification employed to identify area categories depends on the purpose of the resulting representation. This paper is entirely focused on remote sensing based and thus phenological acquisition [5] [6], which somewhat restricts the types of LC units that can be distinguished. On the other hand, this approach facilitates work with globally homogeneous seamless data sets easily available through online services. The work performed towards this paper aims at determining the analytical potential and constraints for regionalized insights into LC distribution and dynamics. These again are considered highly relevant and indeed indispensable for the monitoring of impacts from direct human action as well as climate change. This literally ‘top-down’ approach of global land cover monitoring is demonstrated and discussed using a Central Asia case study within the borders of the Kyrgyz Republic.

2. Materials and Methods

2.1 Study Area

This exploratory study covers the territory of the Kyrgyz Republic in Central Asia (Figure 1). As a country of approx. 200,000 km² with a rich topography spanning different climates from sub-mountain through mountain and high-mountain to nival zones, it is home to a correspondingly diverse set of land cover categories well suited for exploring the performance and quality of global LULC data sets. From an agricultural perspective Park et al., [7] explore the cropland suitability of Kyrgyz lands, while other studies like Liu et al., [8] include climate change scenarios. The general methodology of correlating land cover with other spatial variables, however, is considered transferable to any other region.

2.2 Objectives

Analyzing the distribution of land cover and its correlation with topographic parameters at a ground resolution of 25m pixels corresponds to the original spatial resolution of imagery as well as digital elevation models (DEM), and at the same time demonstrates the power and potential of extending this analysis over huge swaths of land and potentially the entire globe, instead of limited local study areas. Within this meta-objective of assessing and validating the global scope and analytical efficiency of creating geospatial insights from a combination of cloud-based data sets with fully scalable methods and tools, this case study explores the topographic patterns of land cover across the Kyrgyz Republic.

Figure 1: Topographic context of Kyrgyzstan as a study area

For the first time not only the balance of land cover categories for this country is established, but also LC distribution over topographic parameters. These examples are intended to showcase the potentials arising from combining, aggregating, and analyzing easily accessible global data sets with cloud-based geospatial methods.

2.3 Data Bases

This study is focused on (nearly) globally available data sets accessible as online services and available for analysis. These products became feasible with the advent of sensors like MODIS, AVHRR and ETM, which for the first time enabled the building of a ‘global view’ of the state and condition of Earth’s surface. The increase in open access to remote sensing data at resolutions down to the decameter range created a decisive impulse for the move from project-based and regional needs-driven LULC classification towards global supply-oriented coverages. The ‘Global Land Survey’ [9] using Landsat ETM+ (after the full opening of Landsat archives, and based on earlier precursors) set the stage for the emergence of freely accessible worldwide LC data. It is important to mention, however, that automatic classification of remote sensing data is not the only path to land cover data. The European CORINE program initiated in 1985 started out with manual classifications of Landsat imagery into 44 categories with an original minimum mapping unit of 25ha and a focus on environmental monitoring [9]. Subsequently this was transitioned into the Sentinel-focused Copernicus Land Monitoring Service.

The geospatial cloud analytics case study presented in this paper is predominantly based on two global data sets; the general approach is therefore applicable and reproducible essentially anywhere. These data sets represent land cover and elevation, respectively, and due to their worldwide availability (except for polar regions) provide seamless and complete coverage of Kyrgyzstan. The ‘World Land Cover 30m BaseVue’ data set [10], see Table 1, is a commercial product using a multitemporal, semi-automated supervised classifier. The original capture dates were April 2014 to June 2014 with Landsat 8, with continuous updates until August 2020 for the data set used by the authors. It is available as a premium service on the Living Atlas cloud platform and, like the other products discussed in this section, has a proven record of supporting analyses anywhere on the globe in the 25/30m resolution dimension. These LC data are accessible as an image service with query, identify, export and raster function capabilities and contain cell values according to the following class definitions.

Table 1: Land cover class definitions according to [10]


Class Name | Description (abbreviated)
Deciduous Forest | Trees > 3 meters in height, canopy closure > 35% (< 25% intermixture with evergreen species) that seasonally lose their leaves, except larch
Evergreen Forest | Trees > 3 meters in height, canopy closure > 35% (< 25% intermixture with deciduous species), of species that do not lose leaves (includes coniferous larch)
Shrub / Scrub | Woody vegetation < 3 meters in height, > 10% ground cover. Only collect > 30% ground cover.
Grassland | Herbaceous grasses, > 10% cover, including pastureland. Only collect > 30% cover.
Barren or Minimal Vegetation | Land with minimal vegetation (< 10%) including rock, sand, clay, beaches, quarries, strip mines, and gravel pits.
Agriculture, General | Cultivated cropland
Agriculture, Paddy | Cropland characterized by inundation for a substantial portion of the growing season
Wetlands | Areas where the water table is at or near the surface for a substantial portion of the growing season, including herbaceous and woody species (except mangrove species)
Mangrove | Coastal (tropical wetlands) dominated by mangrove species
Water | All water bodies greater than 0.08 hectares (1 Landsat pixel) including oceans, lakes, ponds, rivers, and streams
Ice / Snow | Land areas covered permanently or nearly permanently with ice or snow
Clouds | Areas where no land cover interpretation is possible due to obstruction from clouds, cloud shadows, smoke, haze, or satellite malfunction
Woody Wetlands | Areas where forest or shrubland vegetation accounts for greater than 20% of vegetative cover and the soil or substrate periodically is saturated with or covered by water.
Mixed Forest | Areas dominated by trees generally greater than 5 meters tall and greater than 20% of total vegetation cover. Neither deciduous nor evergreen species are greater than 75% of total tree cover.
High Density Urban | Areas with over 70% of constructed materials that are a minimum of 60 meters wide (asphalt, concrete, buildings, etc.).
Medium-Low Density Urban | Areas with 30% to 70% of constructed materials that are a minimum of 60 meters wide (asphalt, concrete, buildings, etc.).

Figure 2: BaseVue land cover service with enlarged sample detail

Figure 3: Land Cover categories within Kyrgyzstan - dominated by barren land, shrubs and grassland, with minor proportions of forests, agricultural land and water/snow/ice

For the presentation of analyses highlighted in this paper the classes 7 and 8 have been consolidated under ‘Agriculture’, the minimal amount of ‘Wetlands’ has been added to ‘Water’ and 20 and 21 were combined into ‘Urban’. Classes 10, 13, 14, and 15 were not present within the study area. Figure 2 shows the original view and standard symbology zoomed into the study area with a local more detailed sample around the Kyrgyz capital city Bishkek.

Statistically aggregating the LC classes on the BaseVue service within Kyrgyzstan’s boundaries shows more than 80% of the land mass covered by only three approximately equal categories: barren land, shrub / scrub and grassland. Only 6% is available for agriculture, an area exceeded by water / ice / snow. Forest cover is minimal, and built-up land is concentrated in a few major settlements.
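The class consolidation and the share statistics described above can be sketched in a few lines. This is a minimal illustration only, not the authors' workflow, which ran on the BaseVue image service. The text names only the class codes 7, 8, 20 and 21; the codes used here for wetlands (9) and water (11) are assumptions, and the small `demo` array is hypothetical. Cell counts translate into area shares only if all cells cover equal ground area, which does not hold exactly in Web Mercator.

```python
import numpy as np

# Merge table: source class -> target class.
# 8 -> 7 (Agriculture) and 21 -> 20 (Urban) follow the text;
# 9 (wetlands) -> 11 (water) uses ASSUMED class codes.
MERGE = {8: 7, 9: 11, 21: 20}

def class_shares(lc, merge=MERGE):
    """Return {class_code: percent of cells} after merging classes."""
    lc = np.asarray(lc)
    merged = lc.copy()
    for src, dst in merge.items():
        merged[lc == src] = dst
    codes, counts = np.unique(merged, return_counts=True)
    return {int(c): 100.0 * n / merged.size for c, n in zip(codes, counts)}

# Hypothetical 3 x 4 land cover raster, for illustration only.
demo = np.array([[5, 5, 3, 4],
                 [7, 8, 9, 11],
                 [20, 21, 4, 3]])
shares = class_shares(demo)
```

For a real analysis the percentages would be weighted by true cell area, or computed on an equal-area reprojection of the service.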

While this kind of summary statistics is easily tabulated, for further analysis and the development of change scenarios it is important to allocate LC categories to specific topographic, climatic and human access contexts; this question will be addressed further below. The map in Figure 2 shows a lot of detail and serves as a starting point for high resolution multi-thematic analyses, but does not really provide a quick and crisp overview of spatial LC patterns across the country. For more effective communication of the ‘big picture’ a generalized view was created with a hexagonal standard mapping unit of 10km² (Figure 4). Within each hexagon, the dominant ‘majority’ class was selected by zonal statistics and assigned a unique value according to the legend provided. While BaseVue 2013 has been used exclusively in the analysis presented below, multiple alternative data sets now are available with somewhat different characteristics and are referenced here for context: most recently, a ten class global land use/land cover (LULC) data set based on Sentinel-2 for the year 2020 has been generated at 10 meter resolution [11]. It distinguishes the classes water, trees, grass, flooded vegetation, crops, scrub/shrub, built area, bare ground, snow/ice and clouds, and claims an overall accuracy of 86% against a validation set [12]. It was produced by a deep learning model trained using over 5 billion hand-labeled Sentinel-2 pixels, sampled from over 20,000 sites distributed across all major biomes of the world. The underlying deep learning model uses 6 bands of Sentinel-2 surface reflectance data: visible blue, green, red, near infrared, and two shortwave infrared bands.
To create the final map, the model is run on multiple dates of imagery throughout the year, and the outputs are composited into a final representative map of 2020. In addition, the processing workflow has been established in a way to allow future global as well as regional replication based on updated sensor data, thus facilitating updating and change detection. Acknowledging that the quest for global LC data sets already dates back at least three decades, the original GLC can be considered a baseline data set which only much more recently has been made accessible as an online service. The table below provides a quick look at the evolution of monitoring global land cover since then (Table 2).

Figure 4: 10km² hexagons assigned the majority class within each hexagon
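The generalization shown in Figure 4 rests on two simple operations: sizing a regular hexagon for a given area, and picking the most frequent class within each zone. The sketch below illustrates both under stated assumptions. The zone raster, the tie-breaking rule and the function names are hypothetical and do not reproduce the zonal statistics tool used by the authors.

```python
import math
import numpy as np

def hexagon_side_km(area_km2):
    """Side length of a regular hexagon, from A = (3 * sqrt(3) / 2) * s**2."""
    return math.sqrt(2.0 * area_km2 / (3.0 * math.sqrt(3.0)))

def zonal_majority(zones, lc):
    """Map each zone id to its most frequent (non-negative integer) class code."""
    zones = np.asarray(zones).ravel()
    lc = np.asarray(lc).ravel()
    result = {}
    for z in np.unique(zones):
        counts = np.bincount(lc[zones == z])
        # ASSUMPTION: ties resolve to the lowest class code.
        result[int(z)] = int(counts.argmax())
    return result

side = hexagon_side_km(10.0)  # side length of a 10 km² hexagon, in km

# Hypothetical rasters: zone ids (hexagon membership) and class codes.
zones = np.array([[1, 1, 2],
                  [1, 2, 2]])
lc = np.array([[4, 4, 5],
               [3, 5, 3]])
majority = zonal_majority(zones, lc)
```

A 10km² hexagon has a side length of just under 2 km, so each mapping unit aggregates on the order of 16,000 cells at 25m resolution.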

Table 2: Overview of select global land cover data sets

Data set | Global Land Cover 1992-2019 | LC BaseVue 2013 | Global LULC 2020
Accessed at | | |
 | | Landsat 8 |
 | GCS WGS84 | Web Mercator | Web Mercator
 | 1992 ff | |
 | 36 (hierarchical) | 14/16 (US only) |



The exemplary analyses presented below, however, are not predominantly focused on change monitoring and detection, even though this has been the main objective for the creation of the coverages introduced above. Rather, the authors want to emphasize the potential for analysis across thematic domains, like:

- LC distribution and dynamics across different morphometric features,

- LC relative to population densities and human activities,

- LC differences within and between national jurisdictions.

These kinds of cross-domain analyses are facilitated by global data sets that are accessible in a similar way to the LC services introduced above, in particular the Airbus WorldDEM used in subsequent analyses. To work with topography across the study area in the following analyses, the global multi-resolution terrain elevation service from the Living Atlas was accessed at the (25m) resolution of the WorldDEM data.

2.4 Data Analysis

Multi-thematic analysis long has been a mainstay set of methods within the field of spatial analysis. Exploring correlations and systematic interdependencies between coverages of spatial data is an important starting point for understanding spatial distributions of observations – like land cover. Having a clear picture of the impact of independent spatial variables like terrain, zonal climate factors and human action on land use and land cover ultimately helps with modeling distributions [13] as well as developing scenarios for anticipating change. Applicable overlay methods depend on the type of data involved. Metric data sets, like elevation vs precipitation lend themselves to simple correlation quantifying a degree of interdependence – although due to the typical presence of a high degree of spatial autocorrelation inferential statistics have limited value. At the core of exploratory analyses in this paper is the traditional map algebra approach of zonal analysis, generating descriptive statistics of a metric, continuous spatial variable like terrain elevation within the spatially discrete zonal categories of e.g. land cover.
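As an illustration of the zonal analysis described above, the following sketch computes descriptive statistics of a continuous variable (elevation) within discrete zones (land cover classes). It uses plain NumPy on small in-memory arrays, whereas the study itself relied on server-side processing of image services. The function name and the sample values are hypothetical.

```python
import numpy as np

def zonal_stats(zones, values):
    """Descriptive statistics of `values` within each zone of `zones`."""
    zones = np.asarray(zones).ravel()
    values = np.asarray(values, dtype=float).ravel()
    stats = {}
    for z in np.unique(zones):
        v = values[zones == z]
        stats[int(z)] = {
            "count": int(v.size),
            "min": float(v.min()),
            "max": float(v.max()),
            "mean": float(v.mean()),
        }
    return stats

# Hypothetical co-registered rasters: class codes and elevation in metres.
lc = np.array([[4, 4, 5],
               [7, 5, 5]])
dem = np.array([[2100, 2300, 3600],
                [800, 3900, 4200]])
stats = zonal_stats(lc, dem)
```

Both rasters must share the same grid (extent, cell size and alignment) for the cell-by-cell pairing to be meaningful.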

Outlined already in the foundational book by Tomlin [14], zonal analysis per se obviously does not qualify as a novel approach in geospatial analysis. Within the context of this paper, however, it is applied to demonstrate the immense added value derived from two recent developments: the services-based open access to global data sets with high spatial and temporal resolution, and the emancipation of processing frameworks from personal workstations towards cloud computing. The combination of these developments significantly lowers the hurdles for exploratory analysis of land cover distributions and dynamics on e.g. national scales without having to go through the previously required enormous efforts of data preparation and staging of analyses. This evolution of analytical frameworks is tightly connected with the establishment of Spatial Data Infrastructures [15] [16] and Digital Earth twins [17], as demonstrated e.g. through the European INSPIRE Spatial Data Infrastructure presented by Minghini et al. [2].

A spatial resolution of 25m was explicitly defined for all subsequent analysis steps, fully leveraging the nominal 25m resolution for terrain elevation and reasonably close to the 30m original Landsat ETM resolution underlying the BaseVue land cover service. Values of derivatives like slope therefore must be considered within the constraints of this resolution.


3. Results

To demonstrate the power and potential of cloud-based multi-thematic analysis over large regions covering entire countries, the BaseVue 2013 land cover data set zones were analyzed against topographic variables at a 25m spatial resolution. To fully appreciate the data volume involved and the scale of these cross-tabulations of categorized data through zonal operations, it should be kept in mind that each layer (land cover, elevation and slope) includes approx. 320 million data points. The entire workflow has been implemented in the Esri ‘ecosystem’, leveraging data from the Living Atlas infrastructure, using ArcGIS Pro for analytical steps and ArcGIS Online for presentation including storymapping.
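The figure of approx. 320 million data points per layer follows directly from the country's area of approx. 200,000 km² and the 25m cell size. The short calculation below reproduces it. The storage estimate assumes 2 bytes per cell (16-bit integers), which is an illustrative assumption and not a documented property of the services used.

```python
def cell_count(area_km2, cell_size_m):
    """Number of square cells needed to cover an area."""
    return round(area_km2 * 1_000_000 / cell_size_m ** 2)

cells = cell_count(200_000, 25)

# ASSUMPTION: 2 bytes per cell; actual service encodings may differ.
megabytes_per_layer = cells * 2 / 1_000_000
```

Under that assumption each uncompressed layer would occupy roughly 640 MB, before any bounding-box padding or pyramid overviews.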

3.1 Hypsometric Distribution of Land Cover Categories

The summary table presented as Figure 5 summarizes the zonal analysis of BaseVue LC classes against WorldDEM elevation, aggregated into 100m elevation steps and cut off above 5000m. The column labeled ‘HYPSOMET’ shows the rather unusual hypsometric curve of this country. A peak around 1600m elevation highlights the huge lake area of ‘Issyk Kul’ in the northeastern region, and the dominant elevations between 3000m and 3500m are characteristic for the large tracts of land with only very limited economic potential as seasonal pastures.
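A cross-tabulation of this kind can be sketched as a two-dimensional histogram of land cover class against 100m elevation band, discarding cells at or above the 5000m cut-off. The code below is a simplified stand-in for the zonal tools actually used. It assumes non-negative elevations, and the sample arrays, including the class code 12, are hypothetical.

```python
import numpy as np

def hypsometric_crosstab(lc, dem, step=100, cutoff=5000):
    """Count cells per (elevation band, land cover class)."""
    lc = np.asarray(lc).ravel()
    dem = np.asarray(dem, dtype=float).ravel()
    # ASSUMPTION: elevations are non-negative; cells >= cutoff are dropped.
    keep = (dem >= 0) & (dem < cutoff)
    bands = (dem[keep] // step).astype(int)
    classes = np.unique(lc)
    cols = np.searchsorted(classes, lc[keep])
    table = np.zeros((cutoff // step, classes.size), dtype=np.int64)
    np.add.at(table, (bands, cols), 1)  # unbuffered, so repeats accumulate
    return classes, table

# Hypothetical flattened samples of class code and elevation in metres.
lc = np.array([4, 4, 5, 5, 12])
dem = np.array([1650, 1690, 3400, 3450, 5200])
classes, table = hypsometric_crosstab(lc, dem)
hypsometry = table.sum(axis=1)  # overall elevation distribution
```

Summing the table across classes gives the overall hypsometric distribution, while normalizing each column by its total gives the relative per-class elevation profile.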

Figure 5: Hypsometric distribution of land cover categories across Kyrgyzstan

The coloured graphic bars within the chart represent the relative per-LC-class elevation distribution and have to be interpreted with reference to the total share of these classes as indicated in Figure 3. Agricultural uses and human settlements dominate only below 1500m elevation; from there a few forests and a lot of grassland and shrub in drier areas take over, with barren land dominating above 3500m, followed by nival high mountain environments. This overview could of course be translated back into a multivariate selection map (not shown here for space reasons) highlighting e.g. low-lying semi-arid and currently unused areas which potentially could be made available for agriculture through irrigation. Likewise, shrublands offering opportunities for afforestation, or for conversion into productive grasslands, can be identified. Clearly these would be naïve approaches requiring fine-tuning with additional factors and constraints to deliver any kind of useful policy recommendations. Climate variables, soils, feasibility of irrigation, access to markets and other criteria would have to be included. Only slope inclination is addressed as an additional factor below, as the main thrust of this paper is demonstrating the application of regional analysis through cloud computing and leveraging of online LC services, not pursuing specific development and policy questions.

3.2 Slope Patterns of Land Cover Categories

Topographic gradients, commonly referred to as ‘slope’, are another important constraint for the development of land uses and the assessment of regional potentials. In Figure 6 the entire national land mass was again classified into 5° slope categories using the 25m WorldDEM with direct use of server-side processing. The frequency distribution includes sizable proportions of rather steep slopes: nearly half the country is in the steeper than 20° bracket. Again the slope frequencies in the respective columns have to be interpreted relative to the shares of those LC categories as quantified in Figure 3. Besides the obvious flat area and gentle slope preferences of settlements and agriculture (and of course water bodies), grassland dominates hilly slopes due to its use as pasture. Shrub and barren land depend less on slope than on substrate and humidity and therefore do not exhibit a clear slope frequency profile, with forests remaining on steeper slopes, while ice and snow of course are determined mostly by elevation. Again, slope serves as but one factor in a multi-thematic land cover analysis and demonstrates the enormous benefits derived from openly and readily accessible global (digital terrain) data sets for a broad range of analyses.
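To make the slope classification concrete, the sketch below derives slope in degrees from an elevation grid and assigns 5° classes. It uses simple finite differences via `numpy.gradient` rather than the server-side slope function employed in the study, so values near ridges and edges may differ from those of a 3 x 3 window method. The cell size must be given in ground metres, and the sample surface is hypothetical.

```python
import numpy as np

def slope_degrees(dem, cell_size=25.0):
    """Slope angle in degrees from finite differences of elevation."""
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def slope_class(slope, width=5.0):
    """Index of the slope class: 0 for [0, 5) degrees, 1 for [5, 10), etc."""
    return (np.asarray(slope) // width).astype(int)

# Hypothetical inclined plane rising 10 m per 25 m cell in x direction.
dem = np.tile(np.arange(5) * 10.0, (5, 1))
slope = slope_degrees(dem)
classes = slope_class(slope)
```

The resulting class raster can then serve as the second variable in a cross-tabulation against land cover, in the same way as the elevation bands.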

Figure 6: Distribution of land cover categories across 5° slope classes. Second column from left represents overall slope frequency within the country

4. Discussion and Conclusions

The demonstration of integrated cloud-based analysis of openly available global information layers not only showcases the benefits of global monitoring from satellite platforms and highly automated semantic information extraction through AI, ML and DL, but also highlights the potentials of directly working with geospatial information through live web services instead of offline data sets decoupled from their sources. Spatial analysis is entering a new era through these dual developments allowing near real time insights largely independent from scale and spatial as well as temporal extents of research domains. Furthermore, even though only LC and DEM services have been used as demonstrators in this paper the practical impact of the cloud services and computation paradigm is of course not limited to these types of spatial information. To name just one other example of openly accessible data services with obvious interdependencies with the above we would like to emphasize the disaggregated and gridded world population layers [18] [19] [20] providing valuable insights into the densities and patterns of human habitation and impact.

Overall, we understand the above examples as a call for action to move from the currently still prevalent desktop-centric paradigm of analyzing static collections of spatial data towards dynamic analytical insights facilitated by online services and scalable cloud computing.


Acknowledgements

This research was supported by the Eurasia Pacific Uninet and the Austrian Academy of Sciences – Commission for GIScience, both working through the Austria – Central Asia Centre for GIScience (ACA*GIScience) in Bishkek.


References

[1] Tomlin, C. D., (1990). Geographic Information Systems and Cartographic Modeling. Englewood Cliffs, NJ, Prentice-Hall.

[2] Minghini, M., Cetl, V., Kotsev, A., Tomas, R. and Lutz, M., (2021). INSPIRE: The Entry Point to Europe’s Big Geospatial Data Infrastructure. In Handbook of Big Geospatial Data; Springer: Berlin, Germany, 619-641.

[3] Foley, J. A., DeFries, R., Asner, G. P., Barford, C., Bonan, G., Carpenter, S. R., Chapin, F. S., Coe, M. T., Daily, G. C., Gibbs, H. K., Helkowski, J. H., Holloway, T., Howard, E. A., Kucharik, C. J., Monfreda, C., Patz, J. A., Prentice, I. C., Ramankutty, N. and Snyder, P. K., (2005). Global Consequences of Land Use. Science, Vol. 309, 570–574.

[4] Lamarche, C., Santoro, M., Bontemps, S., d’Andrimont, R., Radoux, J., Giustarini, L., Brockmann, C., Wevers, J., Defourny, P. and Arino, O., (2017). Compilation and Validation of SAR and Optical Data Products for a Complete and Global Map of Inland/Ocean Water Tailored to the Climate Modeling Community. Remote Sensing, Vol. 9(1),

[5] Anderson, J. R., Hardy, E. E., Roach, J. T. and Witmer, R. E., (1976). A Land Use and Land Cover Classification System for Use with Remote Sensor Data. U.S. Geological Survey Professional Paper 964.

[6] Price, J., (2003). Comparing MODIS and ETM+ Data for Regional and Global Land Classification. Remote Sensing of Environment, Vol. 86, 491-499.

[7] Park, S., Lim, C. H., Kim, S. J., Isaev, E., Choi, S. E., Lee, S. D. and Lee, W. K., (2021). Assessing Climate Change Impact on Cropland Suitability in Kyrgyzstan: Where are Potential High-Quality Cropland and the Way to the Future. Agronomy, Vol. 11(8),

[8] Liu, W., Liu, L. and Gao, J., (2020). Adapting to Climate Change: Gaps and Strategies for Central Asia. Mitig. Adapt. Strat. Glob. Chang., Vol. 25, 1439–1459.

[9] Land Use Classification System as Presented in U.S. Geological Survey Circular 67. United States Government Printing Office, Washington, DC.

[10] Mora, B., Tsendbazar, N. E., Herold, M. and Arino, O., (2014). Global Land Cover Mapping: Current Status and Future Trends. Land Use and Land Cover Mapping in Europe, Springer Netherlands, 11-30.

[11] Hartley, A. J., MacBean, N., Georgievski, G. and Bontemps, S., (2017). Uncertainty in Plant Functional Type Distributions and its Impact on Land Surface Models. Remote Sensing of Environment, Vol. 203, 71-89.

[12] Pal, M. and Mather, P. M., (2005). Support Vector Machines for Classification in Remote Sensing. Int. J. Remote Sens., Vol. 26, 1007–1011.

[13] Craglia, M., (2017). Spatial Data Infrastructures. International Encyclopedia of Geography: People, the Earth, Environment and Technology. Wiley.

[14] Tobler, W., Deichmann, U., Gottsegen, J. and Maloy, K., (1997). World Population in a Grid of Spherical Quadrilaterals. International Journal of Population Geography, Vol. 3, 203-225.

[15] Wang, C., (2017). Digital Earth. International Encyclopedia of Geography: People, the Earth, Environment and Technology. Wiley.

[16] Copernicus Land Monitoring Service. CORINE Land Cover.

[17] Tsendbazar, N., de Bruin, S. and Herold, M., (2014). Assessing Global Land Cover Reference Datasets for Different User Communities. ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 103, 93-114.

[18] Smith, F. G. F., Bolton, C. and Jengo, C., (2004). The Classification of Hyperspectral Data using the CART Classification Approach. Proceedings of the ASPRS 2004 Annual Meeting. Denver, CO, USA.

[19] Tatem, A., (2017). WorldPop, Open Data for Spatial Demography. Sci Data. Vol. 4,

[20] Fritz, S., See, L., Perger, C., McCallum, I., Schill, C., Schepaschenko, D., Duerauer, M., Karner, M., Dresel, C., Laso-Bayas, J. C., Lesiv, M., Moorthy, I., Salk, C. F., Danylo, O., Sturn, T., Albrecht, F., You, L., Kraxner, F. and Obersteiner, M., (2017). A Global Dataset of Crowdsourced Land Cover and Land Use Reference Data. Scientific Data, Vol. 4(1),