Spatio-temporal Analysis of COVID-19 Hotspots in India Using Geographic Information Systems

Kanav, A.,1 Yadav, B.,1 Sharma, R.2 and Kumar, J.1*

1Department of Geography, School of Basic Sciences, Central University of Haryana, India

2Department of Geography, KLP College, Rewari, India

*Corresponding Author

Abstract

The aim of this study is to identify hotspot regions of COVID-19 in India from March 2020 to August 2023. Identifying hotspots is essential for effective pandemic management, as it helps policymakers understand the dynamics of virus spread and allows for more precisely targeted public health campaigns. The present study is a district-level analysis of India at five different points in time, for which we calculate the cumulative incidence rate (CIR), cumulative fatality rate (CFR) and recovery rate (RR) for COVID-19. Further, we apply Global Moran's I, Getis-Ord Gi* and Anselin local Moran's I to identify COVID-19 hotspots using Geographic Information Systems (GIS) technology. The results show that the spatial and temporal variation of the CIR is very high across India. The CIR was lower in May 2020, as the movement of affected people was restricted by the lockdown. However, the CFR was high and the RR was low due to inadequate medical facilities and treatment. The findings reveal that two main hotspot regions existed in India until May 2021: the National Capital Region, Haryana, Punjab, Rajasthan and Uttar Pradesh in the north, and Maharashtra in the south. However, this scenario changed entirely from January 2022, when northern India turned into a cold spot and the southern coastal states became the pandemic hotspot region. Combining hot-spot analysis with Global and Anselin local Moran's I offers a precise method for locating statistically significant COVID-19 case clusters and identifying high-risk areas.

Keywords: Anselin Moran's I, COVID-19, Getis-Ord Gi*, India, Spatial Autocorrelation

1. Introduction

COVID-19 has a vast geographic reach and has affected millions of people globally, leading to numerous fatalities [1]. Designated a pandemic by the WHO on March 11, 2020 [2], it prompted a global call for unity to safeguard vulnerable populations. India, ranking second in total infected persons after the United States, faced challenges due to factors such as social distancing, demographics, and health infrastructure [3]. Additionally, the government's approach to mitigation strategies played a crucial role in influencing the spread of the virus [4]. Even amidst lockdowns and closures, millions of people traveled back to their hometowns, navigating the challenges posed by the pandemic.

In August 2023, the global tally of COVID-19 infections stood at over 769 million, with more than 6.95 million fatalities. Throughout the course of the epidemic, media outlets and online platforms have broadcast various narratives that depict the crisis in diverse ways [5]. In this discourse, the discipline of geography emerges as a vital player in understanding the spread of the pandemic. According to Bissell [6], geographical concepts regarding location and movement empower us politically to identify new forms of injustice and suffering, prompting reflections on how we collectively shape the future of a place. Furthermore, Schueller et al. [7] established connections between place, mobility, and the geopolitical dimensions of the COVID-19 pandemic. The application of GIS and spatial statistical techniques has proven instrumental in comprehending the spatial distribution of the disease and pinpointing its risk zones. Concurrently, the pandemic has prompted geographers to reassess their relationship with place and space, giving rise to the creation and propagation of knowledge [1].

Different methods, including autocorrelation, hotspot analysis, space-time scan and clustering analysis have been employed to illustrate the areal distribution of the COVID-19 pandemic.

Through these spatial statistical techniques, researchers gain insights into spatial patterns, enabling planners to devise effective policies for minimizing the risk of the COVID-19 pandemic. The widely used Getis-Ord method serves as a valuable tool for portraying the spatial distribution of infectious diseases like hand-foot-and-mouth disease and dengue [8] and [9]. It proves effective in identifying clusters of high and low concentrations in geographical areas and has been utilized in simulating outbreaks of various diseases [10] and [11]. Numerous studies have delved into identifying areas affected by COVID-19 at global, regional, and local levels. The Getis-Ord method has recently been widely used for COVID-19 hot spot detection around the world, including in the USA [12], China [13] and [14], Kazakhstan [15], Brazil [16], Iran [17], Bangladesh [18], Malatya Province, Turkey [19], Indonesia [20] and Finland [21].

In India, a number of studies spanning the disciplines of epidemiology, public health and the social sciences have been conducted to understand the spread of the COVID-19 virus and its impact on people. A comprehensive study in the states of Tamil Nadu and Arunachal Pradesh [22] found that children and adults in resource-limited settings may transmit COVID-19 more readily to same-age individuals. Kumar [23] proposed a network-based model to predict the spread of COVID-19, considering human mobility patterns through migration and air travel data. The model could be used to estimate resource needs and identify high-risk routes. Some studies in India concerning the epidemic trend of COVID-19 transmission have addressed the prediction and analysis of COVID-19 positive cases using deep learning [24] and data-driven models [25]. Acharya and Porwal [26] prepared a district-level vulnerability index for prioritizing resource allocation and increasing management efficiency across the country.

Other studies pertaining to the pandemic examined the effect of lockdown on air quality [27] and assessed the short-term decline in anthropogenic CO2 emissions in India [28]. The country-wide lockdown resulting from COVID-19 was also deemed a necessary evil by Shehzad et al. [29], as it proved beneficial in mitigating air pollution for India and surrounding countries. One study also found that people residing in areas with heavy air pollution were more susceptible to respiratory infections [30]. Scholarly interest was also shown in studies of migrants and livelihoods. As the lockdowns in India caused widespread economic hardship, 80% of households experienced reduced food intake, with 60% lacking essential funds and 30% resorting to loans [31]. Stranded migrant laborers suffered a loss of livelihood, resultant unsafe living conditions and an increased risk of COVID-19 infection [32]. The study also found that most migrant laborers failed to benefit from government assistance. Arora et al. also found a reduced willingness to resume work and a prevalence of fear among male migrants [33].

Since gaining independence, India has grappled with various health challenges, including cholera, dengue, measles, smallpox, and recently, COVID-19. In response to this historical backdrop, the Government of India crafted national health policies in 1983, 2002, and 2017 [34]. The most recent policy, established in 2017, aimed at achieving universal healthcare and "health for all" [35]. The current COVID-19 pandemic has posed a formidable threat in terms of infections and fatalities in India. The existing healthcare policies, shaped by past experiences, have proven to be insufficient in addressing the unique challenges posed by COVID-19. Recognizing the need for additional measures, the Government of India implemented a sweeping lockdown to curb community transmission. The first lockdown was initiated on 24th March 2020, for 21 days, including restrictions on travel and social gatherings in both public and private spaces. Given the gravity of the situation, the lockdown was extended until 31st May 2020, across three subsequent phases. The affected areas were categorically classified into green, orange, and red zones based on the severity of the outbreak.

The swift implementation of a stringent lockdown had an instant impact, effectively slowing the spread of the virus in India. This proactive measure provided valuable time to prepare critical medical infrastructure, strengthen human resources and leverage technological advances. Despite the relatively low severity of the COVID-19 pandemic in India compared to many other developing countries, the crisis has exacerbated pre-existing economic risks [36]. Researchers expect that lockdown will have an adverse and lasting effect on the country's economy [37]. The underfunded public healthcare system has posed challenges to the country's pandemic management strategy, highlighting the need for comprehensive measures in the face of such global health emergencies. The Government of India established a National Task Force (NTF) for COVID-19, with a primary focus on initiating prompt pandemic research and investigations.

Recognizing the significance of technological innovations in policymaking, the government engaged researchers at all levels to mitigate the impact of the COVID-19 pandemic by launching funding through the Department of Science and Technology [38]. The Prime Minister's Office (PMO) took the lead in establishing the 'Prime Minister's Citizen Assistance and Relief in Emergency Situations Fund' (PM CARES Fund) to address the challenges posed by the COVID-19 pandemic. The PM CARES Fund was specifically designed to encourage small donations from people, showcasing the strength of collective participation in alleviating emergency situations. The Prime Minister also encouraged businesses and higher-income communities to shoulder the economic needs of individuals from lower-income groups. In doing so, he urged them to refrain from reducing the salaries of those unable to render services because they could not come to the workplace on certain days. This call emphasized a sense of shared responsibility and solidarity during these challenging times.

2. Objective

The present study analyses the COVID-19 trends, cumulative incidence rate (CIR), cumulative fatality rate (CFR) and recovery rate (RR) of the states and union territories of India (Figure 1). The study also evaluates the spatial distribution and spatial clustering patterns of confirmed and deceased COVID-19 cases at five different periods at the district level, using the Global Moran's I, Getis-Ord Gi* and Anselin local Moran's I methods to identify high-risk and low-risk clusters.

3. Materials and Methods

3.1 Data Extraction

The study covers the period from May 31, 2020 to August 02, 2023 and is based on the Government of India data source available at www.incovid19.org, which is updated daily with the confirmed, deceased and recovered cases of COVID-19 for each state and union territory at the district level, as depicted in Figure 2. Where necessary, the data were cross-checked against data published by the respective state governments.

3.2 Data Analysis/Methods

Four peaks of COVID-19 have been identified in Figure 2 based on daily confirmed cases. In addition to the peaks, we also analyze the latest situation, thus covering five different time periods (Table 1). The cumulative incidence rate (CIR), cumulative fatality rate (CFR) and recovery rate (RR) are defined by equations 1 to 3:

Equation 1

Equation 2

Equation 3
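
Equations 1 to 3 are not reproduced in this text. As a rough illustration only, the sketch below assumes the conventional definitions: CIR as confirmed cases per lakh (100,000) population, CFR as deaths per 100 confirmed cases, and RR as recoveries per 100 confirmed cases. These definitions, the function name covid_rates and the sample figures are assumptions made for illustration and are not taken from the study.

```python
def covid_rates(confirmed, deceased, recovered, population):
    """Return (CIR, CFR, RR) under the assumed conventional definitions."""
    # Assumed: CIR = confirmed cases per lakh (100,000) population
    cir = confirmed / population * 100_000
    # Assumed: CFR and RR are percentages of confirmed cases;
    # both are set to 0.0 when no cases have been confirmed.
    cfr = deceased / confirmed * 100 if confirmed else 0.0
    rr = recovered / confirmed * 100 if confirmed else 0.0
    return cir, cfr, rr


# Hypothetical district: 2,500 confirmed, 50 deaths, 2,300 recoveries, 1,000,000 residents
cir, cfr, rr = covid_rates(2_500, 50, 2_300, 1_000_000)
print(f"CIR={cir:.2f} per lakh, CFR={cfr:.2f}%, RR={rr:.2f}%")
```

A district with no confirmed cases returns zero for all three rates in this sketch; how the study handled such districts is not stated.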

3.3 Spatial Analysis

3.3.1 Global Moran's I (Spatial Autocorrelation)

Spatial autocorrelation is a statistical technique that can be used to unveil the spatial patterns of phenomena, such as the occurrence of COVID-19. Global Moran's I provides a single value for the entire dataset, measuring the overall degree of spatial clustering [39]. The analysis offers both graphical and numerical outputs. The graphical output visually represents the distribution pattern of the data, showcasing whether it's scattered, clustered, or follows a random pattern. On the numerical side, the analysis provides five key values: Moran's I Index, Expected Index, P-value, Z-score, and Variance. The I Index is calculated using equations 4 to 6:

Equation 4

Equation 5

Table 1: Hotspot identification and mapping justification

| S. No. | Date | Highest Cases in a Day (peak-wise) | Justification |
|---|---|---|---|
| 1 | 31-05-2020 | 8,789 | Highest cases in lockdown period |
| 2 | 16-09-2020 | 97,860 | Highest cases in the first peak |
| 3 | 06-05-2021 | 414,280 | Highest cases in the second peak |
| 4 | 20-01-2022 | 347,254 | Highest cases in the third peak |
| 5 | 02-08-2023 | 60 | Current situation |

Figure 1: Map of India

Figure 2: Daily total number of confirmed cases of COVID-19 in India from January 30, 2020 to July 31, 2023

Equation 6

Where:

I is the Global Moran's I index.

S0 is a normalization factor (the sum of all spatial weights).

wi,j is the spatial weight between locations i and j.

zi and zj are the standardized values of the variable of interest at locations i and j.

E(I) is the expected value of Moran's I.

V(I) is the variance of Moran's I.

Global Moran's I index values fall within the range of -1.0 to +1.0. A positive value (>0) indicates clustering, i.e. similar values are located close together; a negative value (<0) indicates a dispersed pattern, i.e. dissimilar values are located close together; and a value near zero indicates a random distribution. This can be crucial in identifying high and low-risk clusters for further analysis, providing valuable insights into the spatial patterns of the data [40]. As Global Moran's I is an inferential statistic, the results of the analysis are interpreted in the context of the null hypothesis [41]. The null hypothesis in our study states that COVID-19 is distributed randomly across the country. Thus, Global Moran's I can be used to confirm the overall trend of COVID-19 spread. This initial assessment provides a broader picture of potential hotspots.
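
The calculation can be sketched in a few lines of NumPy. This is a minimal illustration, not the GIS software's implementation: the function name global_morans_i, the four-district example and its values are hypothetical, and the variance is computed under the normality assumption, which may differ from the variance form used by a given GIS package.

```python
import numpy as np


def global_morans_i(x, w):
    """Return (I, E[I], z-score) for values x and an n x n spatial weights matrix w.

    Assumption of this sketch: the variance of I is taken under the
    normality assumption.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    z = x - x.mean()                 # deviations from the mean (I is unchanged if z is also scaled)
    s0 = w.sum()                     # sum of all spatial weights
    i_obs = (n / s0) * (z @ w @ z) / (z @ z)
    e_i = -1.0 / (n - 1)             # expected value under the null hypothesis
    s1 = 0.5 * ((w + w.T) ** 2).sum()
    s2 = ((w.sum(axis=0) + w.sum(axis=1)) ** 2).sum()
    var_i = (n**2 * s1 - n * s2 + 3 * s0**2) / ((n**2 - 1) * s0**2) - e_i**2
    return i_obs, e_i, (i_obs - e_i) / np.sqrt(var_i)


# Hypothetical example: four districts in a row, each adjacent to its neighbours.
w = np.zeros((4, 4))
for k in range(3):
    w[k, k + 1] = w[k + 1, k] = 1.0

i_clustered, e_i, z_clustered = global_morans_i([10, 12, 30, 28], w)  # similar values adjacent
i_dispersed, _, z_dispersed = global_morans_i([1, 5, 1, 5], w)        # alternating values
print(i_clustered, i_dispersed, e_i)
```

In this toy example the series with similar neighbouring values yields a positive index, while the alternating series yields a negative one, in line with the interpretation given above.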

3.3.2 Getis-Ord Gi* (Hotspot analysis)

A hotspot is a region with a higher incidence concentration than would be expected under a random distribution. The Getis-Ord Gi* method was used for the hot spot and cold spot analysis of COVID-19 in India in ArcGIS 10.8 software. Hot spot analysis employs the Getis-Ord Gi* statistic to evaluate each feature in a dataset, identifying statistically significant spatial clusters. This technique assesses each feature in the context of its neighboring features [42]. Z-scores and p-values indicate where clusters of high and low values are located. While a feature with a high value may initially attract attention, it may not be a statistically significant hotspot given its surrounding features. To be deemed a statistically significant hotspot, a feature must not only exhibit a high value but also be surrounded by other features with similarly high values.

A statistically significant z-score arises when the local sum for a feature and its adjacent features deviates from the expected local sum by more than random chance can explain. For positive z-scores, higher values indicate more intense clustering of high values, referred to as hot spots. Conversely, for negative z-scores, lower (more negative) values indicate more intense clustering of low values, referred to as cold spots [42].

To perform hotspot analysis, confirmed case and death data from 31 May 2020 to 02 August 2023 were used. For each feature, the resulting z-score and p-value reveal where high and low values cluster. Statistically significant positive z-scores indicate hot spots, while statistically significant negative z-scores indicate cold spots; the larger the absolute value of the z-score, the more intense the clustering. The Getis-Ord local statistic is given by equation 7:

Equation 7

Where xj is the attribute value for feature j, wi,j is the spatial weight between features i and j, and n is the total number of features. X̄ and S are defined in equations 8 and 9:

Equation 8

Equation 9

The Gi* statistic is a Z score so no further calculations are required.
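
Equations 7 to 9 are not reproduced in this text. The sketch below uses the standard published form of the Gi* statistic, with X̄ the mean and S the standard deviation of all attribute values, and should be read as an illustration rather than as the exact computation performed by the GIS software. The function name getis_ord_gi_star, the six-district example and the binary contiguity weights (including each district itself) are assumptions.

```python
import numpy as np


def getis_ord_gi_star(x, w):
    """Return the Gi* z-score of every feature.

    x : attribute values, length n
    w : n x n spatial weights; for Gi* each feature is its own neighbour (w[i, i] > 0)
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    x_bar = x.mean()                                  # mean of all values
    s = np.sqrt((x ** 2).mean() - x_bar ** 2)         # standard deviation of all values
    w_sum = w.sum(axis=1)                             # sum of weights per feature
    w_sq_sum = (w ** 2).sum(axis=1)                   # sum of squared weights per feature
    numerator = w @ x - x_bar * w_sum                 # local sum minus expected local sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator


# Hypothetical example: six districts in a row; low values on the left, high on the right.
n = 6
w = np.eye(n)                                         # each district neighbours itself
for k in range(n - 1):
    w[k, k + 1] = w[k + 1, k] = 1.0                   # and its adjacent districts

gi = getis_ord_gi_star([2, 3, 2, 20, 25, 22], w)
print(np.round(gi, 2))
```

Districts inside the high-value group receive positive z-scores and those inside the low-value group receive negative ones; whether a score counts as a hot or cold spot then depends on the chosen confidence level.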

3.3.3 Anselin local Moran's I (Spatial Cluster Outlier)

By employing Anselin Local Moran's I, researchers can visualize clusters and outliers of COVID-19 cases on a map [43]. This method assigns a z-score and p-value to each feature, indicating the statistical significance of its spatial clustering. A positive z-score signifies that a feature is surrounded by similar features (e.g., high COVID-19 incidence areas clustered together), while a negative z-score indicates that a feature is surrounded by dissimilar features (e.g., high COVID-19 incidence areas surrounded by low incidence areas).

The results are typically classified into four categories:

  • High-High (HH) Clusters: Areas with a high incidence of COVID-19 cases surrounded by areas that also have a high incidence.
  • Low-Low (LL) Clusters: Areas with a low incidence of COVID-19 cases surrounded by areas that also have a low incidence.
  • High-Low (HL) Outliers: Areas with a high incidence of COVID-19 cases within a generally low-incidence region.
  • Low-High (LH) Outliers: Areas with a low incidence of COVID-19 cases within a generally high-incidence region.
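
The four categories can be illustrated with a small sketch. It uses a common textbook form of the local statistic (the standardized value of a feature multiplied by the average standardized value of its neighbours), which may differ in scaling from the GIS software's formula, and it omits the significance test that decides whether a label is actually reported. The function name local_moran_quadrants and the five-district example are hypothetical.

```python
import numpy as np


def local_moran_quadrants(x, w):
    """Return (local I, quadrant label) for each feature.

    Labels are assigned for every feature here; in practice only
    statistically significant features would be mapped.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = (x - x.mean()) / x.std()                      # standardized values
    w_row = w / w.sum(axis=1, keepdims=True)          # row-standardized weights
    lag = w_row @ z                                   # mean standardized value of neighbours
    local_i = z * lag                                 # positive: similar neighbours; negative: outlier
    labels = np.where(
        z >= 0,
        np.where(lag >= 0, "HH", "HL"),
        np.where(lag >= 0, "LH", "LL"),
    )
    return local_i, labels


# Hypothetical example: five districts in a row with one high-incidence outlier.
w = np.zeros((5, 5))
for k in range(4):
    w[k, k + 1] = w[k + 1, k] = 1.0

local_i, labels = local_moran_quadrants([2, 30, 2, 3, 2], w)
print(labels.tolist())
```

Here the single high-incidence district is labelled as a high-low outlier and its immediate neighbours as low-high outliers, while the remaining low-incidence districts fall into the low-low category.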

Using both Global Moran's I and Local Moran's I provides a complementary perspective that helps to confirm the overall pattern identified by hotspot analysis and ensures that local findings are not isolated anomalies. Combining these statistics strengthens the overall analysis and provides more confidence in the location of identified hotspots, ultimately leading to a more comprehensive and informed approach to COVID-19 hotspot detection and mitigation. This allows for targeted interventions and resource allocation to effectively contain the spread of the virus in specific hotspots.

4. Results

4.1 Trends of COVID-19 in India

On January 30, 2020, the first instance of COVID-19 in India was officially recorded in the state of Kerala [7]. The cases of the COVID-19 affliction have increased rapidly in the country since March 2020. A nation-wide lockdown was imposed from March 24, 2020 to May 31, 2020, the last day of nationwide lockdown-phase IV.

This period saw a massive movement of people who fled to their homes. Before the lockdown, people returned home from virus-infected parts of the world. These cases served as primary agents of transmission of the virus. The period of unlocking began on June 1, 2020. By mid-December, the total number of confirmed cases had exceeded 10 million. The country's health sector was ill-prepared to handle such massive numbers of COVID-19 infected people. As of August 02, 2023, more than 44,995,802 individuals had been infected with COVID-19 and more than 531,918 had died in India. The daily confirmed cases rose and fell in waves, flattening at intervals from 2021 to the present.

4.2 Cumulative Incidence Rate (CIR)

The cumulative incidence rate is defined as the probability of the occurrence of an event over a specified period. Here it expresses the probability of an individual being infected by a COVID-19 positive person, per lakh (100,000) population. Table 2 shows the CIR in India at the five time periods identified in our study. The spatial and temporal variation of the CIR is very high and varies among different States/UTs across India. The CIR values increased during the peak periods of COVID-19 in India. In the initial stage of the COVID-19 spread, the CIR for the states and UTs shows a lower probability of an individual being infected by an affected person. The CIR was lower on May 31, 2020, because the movement of affected people from one place to another was much reduced by the lockdown. However, it increased in the first (16th September, 2020) and second (6th May, 2021) peaks due to the end of the lockdown period throughout the country and an increase in the number of infected people.

Table 2: Cumulative Incidence Rate (CIR)

| Date | State/UT (highest values) | CIR | State/UT (lowest values) | CIR |
|---|---|---|---|---|
| May 31, 2020 | Maharashtra | 28.74 | Mizoram | 0.0 |
| | Tamil Nadu | 12.27 | Sikkim | 0.15 |
| | Jammu & Kashmir | 11.04 | Arunachal Pradesh | 0.19 |
| Sep 16, 2020 | Goa | 343.01 | Bihar | 10.83 |
| | Maharashtra | 237.22 | Rajasthan | 21.27 |
| | Tripura | 182.94 | Gujarat | 22.92 |
| May 06, 2021 | Sikkim | 652.12 | Bihar | 7.71 |
| | Goa | 524.31 | Uttar Pradesh | 8.33 |
| | Kerala | 472.57 | Madhya Pradesh | 13.26 |
| Jan 20, 2022 | Goa | 1433.31 | Bihar | 21.35 |
| | Mizoram | 761.78 | Uttar Pradesh | 41.72 |
| | Kerala | 560.57 | Meghalaya | 49.10 |
| Aug 02, 2023 | Nagaland | 67.5 | Chhattisgarh | 0.0 |
| | Punjab | 4.0 | Assam | 0.0 |
| | Kerala | 2.8 | Arunachal Pradesh | 0.0 |

The CIR was lower on January 20, 2022 due to the decrease in the number of positive cases, and to vaccination and health facilities. The CIR on August 02, 2023 was lower than at the previous peak. The analysis indicates that during this phase the CIR values were highest in Nagaland, followed by Punjab, Kerala, Tripura and Sikkim, and lowest in north-eastern states like Assam and Arunachal Pradesh. The values increased drastically during the first, second and third peaks. During these peaks, the highest values were recorded in the southern states of India and in Goa, Maharashtra, Tamil Nadu, Andhra Pradesh, Karnataka, Kerala, Sikkim and Mizoram, whereas the lowest were recorded in less urbanized states like Bihar, Rajasthan, Uttar Pradesh and Madhya Pradesh (Table 2).

4.3 Cumulative Fatality Rate (CFR)

Cumulative fatality rates were calculated for five time periods on the basis of the highest confirmed cases (Table 3). The CFR was very high in the initial stage of the COVID-19 outbreak due to inadequate medical facilities and treatment. The first high in daily cases was recorded on May 31, 2020. At that time, the mortality rate of COVID-19 was high. The highest CFR was recorded in Gujarat. Other states also had a high CFR during this period. With the passage of time, the mortality rate due to COVID-19 was slightly lower at the next peak, which was recorded on September 16, 2020. At that time, the highest CFR was recorded in Gujarat state. Owing to new developments in the diagnosis of COVID-19, recovery from COVID-19 improved greatly, and the CFR was reduced during the next peak, which was recorded on May 6, 2021. The CFR declined slightly with the passage of time. The next highest number of daily confirmed cases was recorded on January 20, 2022. The latest situation was taken as of August 02, 2023 (Table 3). The CFR during both periods was nearly the same and lower, owing to the considerable improvement in medical treatment for COVID-19. At present, owing to fewer positive cases and more medical facilities, the CFR is substantially lower than in the early stages of COVID-19.

4.4 Recovery Rate (RR)

Like the cumulative fatality rate, the recovery rate is also calculated for five periods of the highest daily confirmed cases of COVID-19. The recovery rate of COVID-19 patients was lower in the initial stage due to inadequate medical treatment. The first peak on the basis of daily confirmed cases was recorded on May 31, 2020. At that time, the recovery rate for COVID-19 was lower. Gradually, with the improvement in medical facilities, the recovery rate of COVID-19 increased in the next peak of daily confirmed cases, which was recorded on September 16, 2020. For many of the reasons stated above, the COVID-19 recovery rate increased in the following periods of highest recorded daily confirmed cases, which were recorded on May 6, 2021 and January 20, 2022, respectively. As of August 02, 2023, the overall recovery rate for COVID-19 was 99.8%, with state recovery rates ranging from 99.7% to 93.7% (Table 4).

Table 3: Cumulative Fatality Rate (CFR)

| Date | State/UT (highest values) | CFR | State/UT (lowest values) | CFR |
|---|---|---|---|---|
| May 31, 2020 | Gujarat | 6.18 | Nagaland, Goa, Manipur, Sikkim, Tripura | 0.0 |
| | West Bengal | 5.76 | Chhattisgarh | 0.20 |
| | Madhya Pradesh | 4.33 | Assam | 0.30 |
| Sep 16, 2020 | Punjab | 2.97 | Mizoram | 0.0 |
| | Gujarat | 2.77 | Nagaland, Arunachal Pradesh | 0.19 |
| | Maharashtra | 2.75 | Assam | 0.34 |
| May 06, 2021 | Punjab | 2.40 | Arunachal Pradesh | 0.30 |
| | Sikkim | 1.72 | Mizoram | 0.39 |
| | Uttarakhand, Maharashtra | 1.49 | Kerala | 0.32 |
| Jan 20, 2022 | Punjab | 2.44 | Mizoram | 0.37 |
| | Nagaland | 2.07 | Arunachal Pradesh | 0.48 |
| | Maharashtra, Uttarakhand | 1.93 | Telangana | 0.56 |
| Aug 02, 2023 | Punjab | 2.4 | Mizoram | 0.3 |
| | Nagaland | 2.2 | Arunachal Pradesh | 0.4 |
| | Maharashtra | 1.8 | Telangana | 0.5 |

Table 4: Recovery Rate (RR)

| Date | State/UT (highest values) | RR | State/UT (lowest values) | RR |
|---|---|---|---|---|
| May 31, 2020 | Mizoram | 100 | Sikkim, Nagaland | 0.0 |
| | Punjab | 87.80 | Uttarakhand | 11.88 |
| | Rajasthan | 68.30 | Assam | 13.88 |
| Sep 16, 2020 | Bihar | 91.16 | Chhattisgarh | 48.52 |
| | Tamil Nadu | 89.38 | Meghalaya | 53.97 |
| | West Bengal | 86.69 | Himachal Pradesh | 60.50 |
| May 06, 2021 | Tripura | 92.92 | Uttarakhand | 67.84 |
| | Arunachal Pradesh | 90.24 | Karnataka | 70.15 |
| | Manipur | 89.81 | Rajasthan | 71.08 |
| Jan 20, 2022 | Andhra Pradesh | 96.80 | Goa | 88.29 |
| | Manipur | 96.58 | Gujarat | 88.51 |
| | Meghalaya | 96.43 | Uttarakhand | 89.72 |
| Aug 02, 2023 | Mizoram | 99.7 | Nagaland | 93.7 |
| | Arunachal Pradesh | 99.6 | Punjab | 97.4 |
| | Telangana | 99.5 | Maharashtra | 98.2 |

Table 5: Spatial autocorrelation of confirmed cases

| Date | Moran's I | z-score | p-value |
|---|---|---|---|
| May 31, 2020 | 0.018115 | 5.700779 | 0.00000 |
| September 16, 2020 | 0.071491 | 14.453898 | 0.00000 |
| May 06, 2021 | 0.059778 | 12.900874 | 0.00000 |
| January 22, 2022 | 0.112121 | 23.404749 | 0.00000 |
| August 02, 2023 | 0.108244 | 22.768330 | 0.00000 |

4.5 Spatial Autocorrelation of Confirmed Cases

A Moran's I value greater than zero indicates clustering, and the spatial autocorrelation analysis of COVID-19 incidence from May 31, 2020 to August 02, 2023 yields positive values throughout (Table 5). In May 2020, Moran's I stood at 0.018, accompanied by a comparatively low z-score of 5.7, which together signify that the spatial clustering of confirmed COVID-19 cases was statistically significant but very weak. The intensification of COVID-19 clustering becomes evident in the rapid escalation of Moran's I values from September 16, 2020 to January 22, 2022, associated with correspondingly high z-scores, implying a strongly clustered spatial pattern. This finding suggests a significant preponderance of confirmed cases and an increase in spatial clustering. By August 02, 2023 there is a slight decline in the Moran's I value, although it remains significant. This signifies a reduction in the clustering of COVID-19 cases. Throughout the study period, the p-values are effectively zero at the reported precision, suggesting that the observed spatial autocorrelation is not the result of random chance.
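
The relationship between the reported z-scores and p-values can be checked with a short calculation. The sketch below assumes a two-sided test against the standard normal distribution, which is an assumption about how the p-values were derived; the function name two_sided_p is hypothetical, and the z-scores are taken from Tables 5 and 6.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# z-scores as reported in Table 5 (May 31, 2020) and Table 6 (May 06, 2021)
p_confirmed_may_2020 = two_sided_p(5.700779)
p_deceased_may_2021 = two_sided_p(3.865592)

# Formatted to five decimals, as in the tables
print(f"{p_confirmed_may_2020:.5f}")
print(f"{p_deceased_may_2021:.5f}")
```

Under this assumption, a p-value printed as 0.00000 is a very small positive number that has been rounded to five decimals, not an exact zero.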

4.6 Spatial Autocorrelation of Deceased Cases

The spatial autocorrelation analysis of COVID-19 deceased cases between May 31, 2020 and August 02, 2023 reveals significant clustering, as indicated by Moran's I values greater than zero (Table 6). In May 2020, the Moran's I value is positive but very close to zero, indicating that deceased cases were only slightly clustered. There is a slight increase in the Moran's I value on September 16, 2020, with a correspondingly higher z-score, implying a slight increase in clustering. On May 06, 2021 the Moran's I value and z-score dropped again. This signifies that up until May 06, 2021 there was minimal clustering of deceased cases.

However, there is a significant increase in clustering again by January 22, 2022, depicted by a Moran's I value of 0.102 and a high z-score of 22.44. By August 02, 2023 Moran's I value had somewhat decreased, but was still associated with a high z-score. This indicates a rise in COVID-19 deceased case clustering during the last time period compared to the peak observed between September 2020 and May 2021. The findings point towards the presence of fluctuating patterns of spatial clustering in COVID-19 incidence during the study period. Earlier, the clustering was small, but mid-2021 to early 2022 saw a surge. By August 2023, COVID-19 clusters for both confirmed and deceased had clearly been established, indicating a geographical hotspot in COVID-19 transmission (Figure 3).

4.7 Getis-Ord Gi* and Anselin local Moran's I Analysis of confirmed cases

Getis-Ord Gi* statistics were used to identify COVID-19 hot spots for confirmed cases, and the results are shown in Figure 4 for the different time periods.

Figure 3: Spatial Autocorrelation using Global Moran's I

Figure 4: Getis-Ord Gi* analysis of confirmed cases

On 31 May 2020, a high-value cluster of hot spots with a 99% confidence level was identified in the regions of Punjab, Haryana, Rajasthan, Gujarat and Maharashtra. As of September 16, 2020, high-value clusters of hot spots with a 99% confidence level were identified in the north, in the states of Haryana, Punjab and Uttarakhand, and in the south, mainly in the states of Maharashtra and Karnataka. Some low-value hot spot clusters were observed on the outer margins of the high-value hot spot zones.

Table 6: Spatial autocorrelation of deceased cases

| Date | Moran's I | z-score | p-value |
|---|---|---|---|
| May 31, 2020 | 0.024600 | 6.913650 | 0.00000 |
| September 16, 2020 | 0.059778 | 12.900874 | 0.00000 |
| May 06, 2021 | 0.013220 | 3.865592 | 0.00011 |
| January 22, 2022 | 0.102631 | 22.442644 | 0.00000 |
| August 02, 2023 | 0.095347 | 19.959054 | 0.00000 |

Figure 5: Anselin's local Moran's I analysis of confirmed cases

During the same period (September 16, 2020), a high-value cluster of cold spots (99% confidence) was detected in the country's north and north-east regions. As in the previous time period, high-value hot spot clusters were identified in the south and south-west regions on May 06, 2021, but certain districts showed low-value clusters with a 90% confidence level. By January 20, 2022 the cold spot cluster had disappeared from the northern states and high-value cold spots remained only in the north-eastern region. High-value hot spots were again limited to the states of Maharashtra, Karnataka, Kerala and Tamil Nadu. On August 02, 2023, the high-value cluster with a 99% level of confidence had expanded to other districts of the southern and south-western states. High-value cold spots had established themselves in Uttar Pradesh, Madhya Pradesh and the north-eastern states, surrounded by moderate-value cold spot districts.

Anselin's local Moran's I was used to identify regions with COVID-19 clusters and outliers (Figure 5). High-low (HL) outliers were identified in a few districts of Odisha, Bihar, Uttar Pradesh and West Bengal on May 31, 2020. Significant clustering was not found in the northern region of the country. The states of Gujarat and Maharashtra had a few high-high (HH) clusters and various low-high (LH) outliers. On September 16, 2020, low-low (LL) clusters occupied most of northern and north-eastern India. A few HL outliers were also found in this region. The south-western districts were mostly HH clusters, with some LH outliers on their outer margins. In the next time frame, the Local Moran's I statistic does not identify any major clusters in the majority of the northern region, although some HL outliers do exist in the states of West Bengal and Sikkim.

Figure 6: Hot spot analysis of deceased cases

For January 20, 2022 and August 02, 2023, the local Moran's I output for confirmed cases is quite similar. HH clusters are prevalent in the districts of the south-western states, and some LH outliers can also be found in this region. Most of the northern part of the country was covered by LL clusters, with some HL outliers in between.
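The HH, LL, HL and LH labels used above follow from the sign of a district's standardized value and the sign of its neighbors' average standardized value. The following is a minimal sketch of that labelling, not the GIS workflow used in this study. It assumes a hypothetical neighbor dictionary with row-standardized binary weights, uses made-up values, and omits the significance test (usually a permutation test) that decides which districts are actually mapped.

```python
from math import sqrt


def local_moran_quadrants(values, neighbors):
    """Return (local_I, label) per unit; label is HH, LL, HL or LH.

    values: one rate per district (assumed not all identical).
    neighbors: dict of district index -> list of neighboring indices
               (a hypothetical contiguity structure, row-standardized below).
    """
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    z = [(v - mean) / sd for v in values]

    results = []
    for i in range(n):
        nbrs = neighbors.get(i, [])
        # Spatial lag: mean standardized value of the neighbors.
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        local_i = z[i] * lag
        # First letter: the district itself; second letter: its neighbors.
        label = ("H" if z[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((local_i, label))
    return results


# Illustrative values only: six districts along a line, not study data.
rates = [90, 80, 85, 10, 12, 8]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
labels = [label for _, label in local_moran_quadrants(rates, chain)]
print(labels)
```

In this toy layout the two districts at the boundary between the high and low groups come out as HL and LH, with negative local I values, which is the outlier pattern described in the text.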

4.8 Getis-Ord Gi* and Anselin local Moran's I Analysis of Deceased Cases

To identify hotspots of deceased cases, Getis-Ord Gi* was applied (Figure 6). On May 31, 2020, a concentration of hotspots with 99% confidence was detected in Gujarat, Maharashtra, and Goa. The northern parts of the country had some low-value hotspots surrounded by high-value hotspots. Moderate-value coldspots were found in the state of Uttar Pradesh and low-value hotspots were found in the eastern parts of the country. By September 16, 2020, high-value hotspots had intensified over the states of Punjab, Haryana, Uttarakhand, Rajasthan and Maharashtra. The coldspot over Uttar Pradesh also became high-value with a 99% confidence level. The northern high-value hotspots spread over an even larger area on May 06, 2021, and the hotspots in the south-western region persisted. High-value coldspots emerged in the north-eastern states. The high-value hotspots spread to the southern states of Tamil Nadu, Karnataka and Kerala on January 20, 2022. High-value coldspots remained in the north-eastern states, with moderate-value coldspots forming over Uttar Pradesh, Bihar and Jharkhand. By August 02, 2023, high-value coldspots had developed over most of northern and central India, whereas the south-western states remained high-value hotspots.
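The confidence levels quoted above correspond to thresholds on the Gi* z-score. As a rough sketch of how such a z-score and its confidence bin can be obtained, the code below applies the standard Gi* formula with binary weights that include the district itself. The neighbor dictionary, the sample values and the cut-offs of 1.65, 1.96 and 2.58 (the usual two-sided critical values for 90%, 95% and 99%) are assumptions made for illustration and do not reproduce the study's GIS settings.

```python
from math import sqrt


def getis_ord_gi_star(values, neighbors):
    """Return the Gi* z-score of each unit using binary weights.

    neighbors: dict of unit index -> list of neighboring indices (hypothetical).
    """
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        window = set(neighbors.get(i, [])) | {i}  # Gi* includes the unit itself
        w_sum = len(window)                       # binary weights: sum of w_ij
        local_sum = sum(values[j] for j in window)
        # With binary weights, the sum of squared weights equals w_sum.
        denom = s * sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append((local_sum - mean * w_sum) / denom if denom else 0.0)
    return scores


def confidence_bin(z):
    """Map a z-score to a hotspot/coldspot label with a confidence level."""
    kind = "hotspot" if z > 0 else "coldspot"
    for cutoff, level in ((2.58, 99), (1.96, 95), (1.65, 90)):
        if abs(z) >= cutoff:
            return f"{kind} ({level}% confidence)"
    return "not significant"


# Illustrative values only: six districts along a line, not study data.
deaths = [90, 80, 85, 10, 12, 8]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
scores = getis_ord_gi_star(deaths, chain)
for i, z in enumerate(scores):
    print(i, round(z, 2), confidence_bin(z))
```

A positive z-score above a cut-off marks a hotspot and a negative one below it marks a coldspot, so the same statistic yields both kinds of region.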

Anselin's local Moran's I index was used to identify clusters and outliers of deceased COVID-19 cases (Figure 7). HH clusters of deceased cases emerged in Gujarat and Maharashtra, surrounded by LH outliers, on May 31, 2020. LL clusters were identified in Uttar Pradesh, Bihar, Chhattisgarh, Jharkhand, Kerala, Himachal Pradesh and the north-eastern states. LL clusters had taken over most of northern India by September 16, 2020, with some HL outliers in between. On May 06, 2021, local Moran's I did not show significant clustering in northernmost India, although some HL outliers were present. HH clusters remained in Maharashtra, surrounded by some LH outliers. Similar results were obtained for deceased cases on January 20, 2022 and August 02, 2023. HH clusters were prevalent in the districts of the south-western states, surrounded by some LH outliers. The northern region of the country was covered by LL clusters, with a few HL outliers.

Figure 7: Anselin local Moran's I analysis of deceased cases

5. Discussion

This study evaluates the spatial distribution of COVID-19 in India from 2020 to 2023 at five specific peaks. The Global Moran's I analysis confirmed the existence of clustering in India, although its strength varied between points in time. Having verified the presence of clusters in confirmed and deceased COVID-19 cases, the study proceeded to locate these clusters. The Getis-Ord Gi* statistic proved useful in locating coldspots and hotspots of confirmed and deceased COVID-19 cases. At the beginning of the pandemic, northern and south-western India were major hotspots. By mid-2023, north India was largely a coldspot, whereas south India remained a hotspot that extended to the southern states of Kerala and Tamil Nadu. Anselin's local Moran's I also proved helpful in ascertaining the nature of spatial clustering.
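For readers who want to see how a Global Moran's I value, z-score and p-value of the kind reported in this study can be derived, the sketch below computes them for a small example. It is illustrative only: the binary contiguity weights, the neighbor dictionary and the variance under the normality assumption are choices made here, and GIS packages often use a randomization-based variance instead, so the z-scores will not match the study's output exactly.

```python
from math import sqrt
from statistics import NormalDist


def global_morans_i(values, neighbors):
    """Return (I, z, p) for binary weights given as neighbor lists.

    neighbors: dict of unit index -> list of neighboring indices (hypothetical).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Binary weights: w_ij = 1 when j is listed as a neighbor of i.
    w = {(i, j): 1.0 for i, nbrs in neighbors.items() for j in nbrs}
    s0 = sum(w.values())
    cross = sum(wij * dev[i] * dev[j] for (i, j), wij in w.items())
    moran_i = (n / s0) * cross / sum(d * d for d in dev)

    # Expected value and variance of I under the normality assumption.
    expected = -1.0 / (n - 1)
    pairs = set(w) | {(j, i) for (i, j) in w}
    s1 = 0.5 * sum((w.get((i, j), 0.0) + w.get((j, i), 0.0)) ** 2
                   for i, j in pairs)
    s2 = sum(
        (sum(w.get((i, j), 0.0) for j in range(n))
         + sum(w.get((j, i), 0.0) for j in range(n))) ** 2
        for i in range(n)
    )
    variance = ((n * n * s1 - n * s2 + 3 * s0 * s0)
                / ((n * n - 1) * s0 * s0)) - expected ** 2

    z = (moran_i - expected) / sqrt(variance)
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return moran_i, z, p


# Illustrative values only: six districts along a line, not study data.
rates = [90, 80, 85, 10, 12, 8]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
moran_i, z, p = global_morans_i(rates, chain)
print(round(moran_i, 3), round(z, 2), round(p, 3))
```

A positive I with a large z-score indicates that similar values sit next to each other more often than chance would suggest, which is the condition checked before the local statistics are mapped.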

COVID-19 spread rapidly throughout India soon after its initial identification in Wuhan (China) in late December 2019. India recorded its first case in Kerala in January 2020, and by May 2020 a hotspot had formed over the southern states. Eventually, a COVID-19 hotspot also emerged near the National Capital Region in the north. Throughout the course of the pandemic in India, the north-eastern states remained cold spots with LL clustering.

This observation can be attributed to the geographical isolation of this region once the lockdown restrictions were enforced. The region is also marked by lower population densities, which can hinder transmission of the virus.

The maps showed hotspots and clusters that kept changing over time. This evolving distribution pattern can be attributed to several factors, such as differences in risk factors between areas, the ability of each state to mitigate the propagation of the virus, and the effectiveness of lockdown enforcement. In addition to the social distancing norms and travel restrictions imposed by the central government, some states adopted their own safety measures. Kerala adopted a decentralized approach, involving local communities in COVID-19 control efforts. A network of dedicated COVID-19 hospitals was established, in addition to quarantine measures and contact tracing. Maharashtra, a state with high COVID-19 activity, increased its health infrastructure capacity, including beds, ventilators, and medical personnel. The 'eSanjeevani' initiative for telemedicine consultations was also launched. Madhya Pradesh initiated a 'Roko-Toko' campaign, meaning 'stop & alert', to promote the wearing of face masks.

Some limitations can be identified in our study. We use data from a single source, i.e. incovid19.org. We do not account for unreported COVID-19 cases or deaths. The temporal analysis relies on five points in time (peaks) that are not regularly spaced. As the COVID-19 situation evolved quickly, studying month-to-month variations would ideally help in discerning spatial propagation patterns. Given the time constraints and length of the study, we opted to study the temporal changes at five peaks. We also use cumulative COVID-19 cases at each time rather than only newly reported cases; therefore, the clustering results might not reflect emerging outbreaks. As the healthcare systems of different states are not alike, some states were more efficient at COVID-19 testing than others. Therefore, the hotspot and coldspot analysis may show biased results. Notwithstanding these limitations, the study presents an important retrospective analysis of spatial clustering and hotspots of COVID-19 incidence over a period of three years. Thus, this study offers a foundational approach for epidemiologists to examine the onset and spread of COVID-19 transmission in future research. Moreover, the study may help the country revise and enhance its health policies and interventions to combat future viral outbreaks.
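The distinction between cumulative and newly reported cases noted above can be made concrete with a short sketch. The function below derives new cases per reporting date from a cumulative series for one district. The function name and the sample counts are hypothetical, and clamping negative differences to zero is an assumption made here for handling downward data revisions, not a step taken in this study.

```python
def new_cases_from_cumulative(cumulative):
    """Convert a cumulative case series into new cases per reporting date."""
    if not cumulative:
        return []
    new = [cumulative[0]]
    for prev, cur in zip(cumulative, cumulative[1:]):
        # Clamp downward revisions to zero (an assumption of this sketch).
        new.append(max(cur - prev, 0))
    return new


# Illustrative counts only, not study data.
cumulative_counts = [10, 25, 25, 40, 38]
print(new_cases_from_cumulative(cumulative_counts))
```

A district whose cumulative count is high but flat contributes no new cases, which is why clustering based on cumulative totals can miss an emerging outbreak elsewhere.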

The spread of COVID-19 in India continues to change given the emergence of new variants and vaccine mandates. Thus, this study raises new research questions about the importance of timely hotspot and cluster identification and provides the groundwork for future comparative spatial research, both within India and with other countries.

6. Conclusion

COVID-19, an international public health crisis, affected millions of people and caused fatalities worldwide. This study analyzed the spatial spread of COVID-19 infections and temporal variations in their distribution. The results demonstrate considerable regional and temporal diversity in the cumulative incidence rate (CIR) across Indian states/UTs. The CIR was lower on May 31, 2020, as the lockdown curbed the spread of the virus by isolating affected people. However, when the lockdown ended, COVID-19 cases reached a new peak. The cumulative fatality rate (CFR) was high in the early stages of the COVID-19 outbreak due to limited medical facilities and treatment, which also lowered the recovery rate. In August 2023, the recovery rate from COVID-19 was 98.7%, which is considerably higher than in many developed countries.

The research also evaluated the spatial arrangement of COVID-19 cases in India from 2020 to 2023 at five distinct peaks. Temporal analysis of COVID-19 incidence by spatial autocorrelation produced Global Moran's I values and z-scores for the five points in time. It revealed a peak in spatial clustering of both confirmed and deceased cases on January 20, 2022. The least clustering was found at the beginning of the pandemic, at its first peak on May 31, 2020. A significant positive spatial correlation of COVID-19 incidence was found between adjacent districts. The Getis-Ord Gi* analysis identified the districts where COVID-19 incidence was higher (hotspot) or lower (coldspot), and the cluster analysis complemented the study by establishing the location of COVID-19 clusters and outliers. Together, these analyses show how the spatial pattern of the pandemic evolved over time.

References

[1] Singh, N. and Kumar, J., (2021). Geographic Research on COVID-19 Pandemic: A Review. Population Geography:A Journal of the Association of Population Geographers of India , Vol. 43(2), 107-124.

[2] Cucinotta, D. and Vanelli, M., (2020). WHO Declares COVID-19 a Pandemic. Acta Bio Medica: Atenei Parmensis, Vol. 91(1), 157-160. https://doi.org/10.23750/abm.v91i1.9397.

[3] Aneja, R. and Ahuja, V., (2021). An Assessment of Socioeconomic Impact of COVID‐19 Pandemic in India. Journal of Public Affairs, Vol. 21(2). https://doi.org/10.1002/pa.2266.

[4] Bedford, J., Enria, D., Giesecke, J., Heymann, D. L., Ihekweazu, C., Kobinger, G., Lane, H. C., Memish, Z., Oh, M., Sall, A. A., Schuchat, A., Ungchusak, K. and Wieler, L. H., (2020). COVID-19: Towards Controlling of a Pandemic. The Lancet, Vol. 395(10229), 1015-1018. https://doi.org/10.1016/S0140-6736(20)30673-5.

[5] Castree, N., Amoore, L. and Hughes, A., (2020). Boundless Contamination and Progress in Geography. Progress in Human Geography , Vol. 44(3), 411-414. https://doi.org/10.1177/0309132520920094.

[6] Bissell, D., (2021). A Changing Sense of Place: Geography and COVID‐19. Geographical Research, Vol. 59(2), 150-159. https://doi.org/10.1111/1745-5871.12465.

[7] Schueller, E., Klein, E., Tseng, K., Kapoor, G., Joshi, J., Sriram, A., Nandi, A. and Laxminayaran, R., (2020). COVID-19 in India: Potential Impact of the Lockdown and Other Longer-Term Policies. The Center for Disease Dynamics, Economics & Policy. Available: https://onehealthtrust.org/news-media/blog/covid-19-india-potential-impact-of-the-lockdown-and-other-longer-term-policies/.

[8] Khormi, H. M. and Kumar, L., (2012). The Importance of Appropriate Temporal and Spatial Scales for Dengue Fever Control and Management. Science of the Total Environment. Vol. 430, 144-149. https://doi.org/10.1016/j.scitotenv.2012.05.001.

[9] Wang, H., Du, Z. and Wang, X., (2015). Detecting the Association between Meteorological Factors and Hand, Foot, and Mouth Disease Using Spatial Panel Data Models. International Journal of Infectious Diseases . Vol. 34, 66-70. https://doi.org/10.1016/j.ijid.2015.03.007.

[10] Kracalik, I. T., Blackburn, J. K. and Lukhnova, L., (2012). Analyzing the Spatial Patterns of Livestock Anthrax in Kazakhstan in Relation to Environmental Factors: A Comparison of Local (Gi*) and Morphology Cluster Statistics. Geospatial Health. Vol. 7(1), 111-126. https://doi.org/10.4081/gh.2012.110.

[11] Murad, A. and Khashoggi, B. F., (2020). Using GIS for Disease Mapping and Clustering in Jeddah, Saudi Arabia. International Journal of Geo-Information. Vol. 9(5). https://doi.org/10.3390/ijgi9050328.

[12] Maroko, A. R., Nash, D. and Brian, T. P., (2020). COVID-19 and Inequity: A Comparative Spatial Analysis of New York City and Chicago Hot Spots. Journal of Urban Health. Vol. 97(4), 461-470. https://doi.org/10.1007/s11524-020-00468-0.

[13] Liu, M., Liu, M. and Li, Z., (2021). The Spatial Clustering Analysis of COVID-19 and its Associated Factors in Mainland China at the Prefecture Level. Science of the Total Environment. Vol. 777.
https://doi.org/10.1016/j.scitotenv.2021.145992
.

[14] Wang, Q., Dong, W. and Yang, K., (2021). Temporal and Spatial Analysis of COVID-19 Transmission in China and its Influencing Factors. International Journal of Infectious Diseases. Vol. 105, 675-685. https://doi.org/10.1016/j.ijid.2021.03.014.

[15] Kuznetsov, A. and Sadovskaya, V., (2021). Spatial Variation and Hotspot Detection of COVID-19 Cases in Kazakhstan. Spatial and Spatio-Temporal Epidemiology . Vol. 39.
https://doi.org/10.1016/j.sste.2021.100430
.

[16] Alves, J. D., Abade, A. S. and Peres, W. P., (2021). Impact of COVID-19 on the Indigenous Population of Brazil: A Geo-Epidemiological Study. Epidemiology & Infection. Vol. 149. https://doi.org/10.1017/S0950268821001849.

[17] Lak. A., Maher, A. and Zali, A., (2021). A Description of Spatial-Temporal Patterns of the Novel COVID-19 Outbreak in the Neighborhood's Scale in Tehran, Iran. Medical Journal of the Islamic Republic of Iran. Vol. 35, 315-328. https://doi.org/10.47176/mjiri.35.128.

[18] Rahman, M. R., Islam, H. and Islam, N. (2021). Geospatial Modelling on The Spread and Dynamics of 154-day Outbreak of the Novel Coronavirus (COVID-19) Pandemic in Bangladesh Towards Vulnerability Zoning and Management Approaches. Modeling Earth Systems and Environment. Vol. 7(3), 2059-2087. https://doi.org/10.1007/s40808-020-00962-z.

[19] Zeren, F., Akbulut, S. and Ozer, A., (2022). Spatial Clustering and Hot Spot Analysis of the COVID-19 Pandemic in Malatya Province. Medicine . Vol. 11(3), 1030-1035. https://doi.org/10.5455/medscience.2022.05.105.

[20] Cahyadi, M. N., Handayani, H. H., Warmadewanthi, I. D. A. A., Rokhmana, C. A., Sulistiawan, S. S., Waloedjo, C. S., Raharjo, A. B., Endroyono, Atok, M., Navisa, S. C., Wulansari, M. and Jin, S., (2022). Spatiotemporal Analysis for COVID-19 Delta Variant Using GIS-based Air Parameter and Spatial Modeling. International Journal of Environmental Research and Public Health. Vol. 19(3). https://doi.org/10.3390/ijerph19031614.

[21] Siljander, M., Uusitalo, R. and Pellikka, P., (2022). Spatiotemporal Clustering Patterns and Sociodemographic Determinants of COVID-19 (SARS-CoV-2) Infections in Helsinki, Finland. Spatial and Spatio-temporal Epidemiology. Vol. 41, 100-493. https://doi.org/10.1016/j.sste.2022.100493.

[22] Laxminarayan, R., Wahl, B., Dudala, S. R., Gopal, K., Mohan B, C., Neelima, S., Jawarhar Reddy, K. S., Radhakrishnan, J. and Lewnard, J. A., (2020). Epidemiology and Transmission Dynamics of COVID-19 in Two Indian States. Science, Vol. 370(6517), 691-697. https://doi.org/10.1126/science.abd7672.

[23] Kumar, A., (2020). Modeling Geographical Spread of COVID-19 in India Using Network-Based Approach. Medrxiv, https://doi.org/10.1101/2020.04.23.20076489.

[24] Arora, P., Kumar, H. and Panigrahi, B. K., (2020). Prediction and Analysis of COVID-19 Positive Cases Using Deep Learning Models: A Descriptive Case Study of India. Chaos, Solitons & Fractals. Vol. 139, 100-117. https://doi.org/10.1016/j.chaos.2020.110017.

[25] Ghosh, P., Ghosh, R. and Chakraborty, B., (2020). COVID-19 in India: Statewise Analysis and Prediction. JMIR Public Health and Surveillance , Vol. 6(3). https://doi.org/10.2196/20341.

[26] Acharya, R. and Porwal, A., (2020). A Vulnerability Index for the Management of And Response to the COVID-19 Epidemic in India: An Ecological Study. The Lancet Global Health, Vol. 8(9), e1142-e1151. https://doi.org/10.1016/S2214-109X(20)30300-4.

[27] Mahato, S., Pal, S. and Ghosh, K. G., (2020). Effect of Lockdown Amid COVID-19 Pandemic on air Quality of the Megacity Delhi, India. Science of the Total Environment. Vol. 730.
https://doi.org/10.1016/j.scitotenv.2020.139086
.

[28] Parida, B. R., Bar, S. and Singh, N., (2021). A Short-Term Decline in Anthropogenic Emission of CO2 in India due to COVID-19 Confinement. Progress in Physical Geography: Earth and Environment,Vol. 45(4), 471-487.
https://doi.org/10.1177/0309133320966741
.

[29] Shehzad, K., Sarfraz, M. and Shah, S. G. M., (2020). The Impact of COVID-19 as a Necessary Evil on Air Pollution in India During the Lockdown. Environmental Pollution, Vol. 266. https://doi.org/10.1016/j.envpol.2020.115080.

[30] Mele, M. and Magazzino, C., (2021). Pollution, Economic Growth, and COVID-19 Deaths in India: A Machine Learning Evidence. Environmental Science and Pollution Research , Vol. 28, 2669-2677. https://doi.org/10.1007/s11356-020-10689-0.

[31] Kesar, S., Abraham, R., Lahoti, R., Nath, P. and Basole, A., (2021). Pandemic, Informality, and Vulnerability: Impact of COVID-19 on Livelihoods in India. Canadian Journal of Development Studies, Vol. 42(1-2), 145-164. https://doi.org/10.1080/02255189.2021.1890003.

[32] Rahaman, M., Roy, A., Chouhan, P., Das, K. C. and Rana, M. J., (2021). Risk of COVID-19 Transmission and Livelihood Challenges of Stranded Migrant Labourers During Lockdown in India. The Indian Journal of Labour Economics, Vol. 64(3), 787-802. https://doi.org/10.1007/s41027-021-00327-9.

[33] Arora, V., Chakravarty, S., Kapoor, H., Mukherjee, S., Roy, S. and Tagat, A., (2023). No Going Back: COVID-19 Disease Threat Perception and Male Migrants' Willingness to Return to Work in India. Journal of Economic Behavior & Organization, Vol. 209, 533-546. https://doi.org/10.1016/j.jebo.2023.03.017.

[34] Gauttam, P., Patel, N. and Singh, B., (2021). Public Health Policy of India and COVID-19: Diagnosis and Prognosis of the Combating Response. Sustainability, Vol. 13(6). https://doi.org/10.3390/su13063415.

[35] Jakovljevic, M., Timofeyev, Y. and Ranabhat, C., (2020). Real GDP Growth Rates and Healthcare Spending–Comparison between the G7 and the EM7 countries. Globalization and Health, Vol. 16(64), 1-13. https://doi.org/10.1186/s12992-020-00590-3.

[36] Goel, I., Sharma, S. and Kashiramka, S., (2021). Effects of the COVID-19 Pandemic in India: An Analysis of Policy and Technological Interventions. Health Policy and Technology, Vol. 10(1), 151–164. https://doi.org/10.1016/j.hlpt.2020.12.001.

[37] Sharma, G. D. and Mahendru, M., (2020). Lives or Livelihood: Insights from Locked-Down India due to COVID19. Social Sciences & Humanities Open , Vol. 2(1). https://doi.org/10.1016/j.ssaho.2020.100036.

[38] Debnath, R. and Bardhan, R., (2020). India Nudges to Contain COVID-19 Pandemic: A Reactive Public Policy Analysis Using Machine-Learning Based Topic Modelling. PloS One, Vol.15(9). https://doi.org/10.1371/journal.pone.0238972.t006.

[39] Zhang, H. and Tripathi, N. K., (2018). Geospatial Hot Spot Analysis of Lung Cancer Patients Correlated to Fine Particulate Matter (PM2.5) and Industrial Wind in Eastern Thailand. Journal of Cleaner Production, Vol. 170, 407-424. https://doi.org/10.1016/j.jclepro.2017.09.185.

[40] Shariati, M., Mesgari, T., Kasraee, M. and Jahangiri-Rad, M., (2020). Spatiotemporal Analysis and Hotspots Detection of COVID-19 using Geographic Information System (March and April, 2020). Journal of Environmental Health Science and Engineering , Vol. 18, 1499-1507. https://doi.org/10.1007/s40201-020-00565-x.

[41] Manda, S., Haushona, N. and Bergquist, R., (2020). A Scoping Review of Spatial Analysis Approaches Using Health Survey Data in Sub-Saharan Africa. International Journal of Environmental Research and Public Health , Vol. 17(9). https://doi.org/10.3390/ijerph17093070.

[42] Peeters, A., Zude, M., Käthner, J., Ünlü, M., Kanber, R., Hetzroni, A. and Ben-Gal, A., (2015). Getis–Ord's hot-and Cold-Spot Statistics as a Basis for Multivariate Spatial Clustering of Orchard Tree Data. Computers and Electronics in Agriculture , Vol. 111, 140-150. https://doi.org/10.1016/j.compag.2014.12.011.

[43] Sandar, E., Laohasiriwong, W. and Sornlorm, K., (2023). Spatial Autocorrelation and Heterogenicity of Demographic and Healthcare Factors in the Five Waves of COVID-19 Epidemic in Thailand. Geospatial Health, Vol. 18(1). https://doi.org/10.4081/gh.2023.1183.