COVID-19: Decoding the Aarogya Setu App with Data Science and Artificial Intelligence


COVID-19 has put the world in a black box, and the earth has come to a standstill. We are fighting a pandemic that is one of a kind. Governments, health departments, and the military are leaving no stone unturned to find a solution and save the world from this ghastly disease. Amid all this, science has taken the front seat in giving mankind a solution to this complex equation.

We should be grateful to technology, which has given us Data Science and Artificial Intelligence. The immediate focus is to flatten the curve. AI is being put into action around the world, from finding the cause and tackling the effects to developing a remedy. The scope of Data Science courses for students, and the case for adopting Artificial Intelligence, are gaining validity now.



  • Diagnosis through AI tools
  • Pre-treatment analysis through Data sets
  • Clinical trials through AI studying different protein patterns
  • Post-treatment analysis through Data sets
  • Research and future analysis through AI
  • Proposing new AI models and technological apps for self-diagnosis


Recently launched by the Government of India, Aarogya Setu is a COVID-19 tracking app that alerts people if there is a possible COVID-19 positive case in their vicinity. The app was released by the National Informatics Centre (Ministry of Electronics and Information Technology).

With the help of this app, people can stay aware of the possible risks around them and take precautionary measures beforehand.


Artificial Intelligence and the Aarogya Setu App

Using GPS, Bluetooth and Artificial Intelligence, the app alerts citizens when a COVID-19 infected person is in close proximity.

There exists nothing in this world without Data. Data Science has become an integral part of AI. We can categorise the processing of the app into three different stages.


The structured Data is of two types

  • Textual Data
  • Numerical Data

a) Textual Data

Thousands or even millions of records are stored as text. All this textual Data is processed with NLP (Natural Language Processing), a subfield of Artificial Intelligence that uses Machine Learning algorithms to process text and speech. Any textual information about the pandemic is processed through NLP, which employs procedures such as:

  • Text case normalisation
  • Word tokenisation
  • Text stemming
  • Stop-word removal
  • Punctuation removal
  • Contraction expansion
  • Bag-of-words
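
The procedures listed above can be chained into a small pipeline. The following is only a minimal sketch using the Python standard library; the contraction table, the stop-word list and the suffix-stripping stemmer are tiny illustrative assumptions, not the resources or the stemmer used by any real system.

```python
import re
from collections import Counter

# Tiny illustrative lookup tables (assumptions, not a real lexicon).
CONTRACTIONS = {"i'm": "i am", "don't": "do not", "can't": "cannot", "it's": "it is"}
STOP_WORDS = {"i", "am", "a", "an", "the", "and", "to", "of", "my",
              "is", "it", "do", "not", "due"}


def expand_contractions(text):
    """Replace known contractions with their expanded forms."""
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text)
    return text


def crude_stem(word):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Run the cleaning steps in order and return a list of stemmed tokens."""
    text = text.lower().replace("\u2019", "'")   # text case normalisation
    text = expand_contractions(text)             # contraction expansion
    text = re.sub(r"[^\w\s]", " ", text)         # punctuation removal
    tokens = text.split()                        # word tokenisation
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop-word removal
    return [crude_stem(t) for t in tokens]       # text stemming


def bag_of_words(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)
```

For instance, `preprocess("I'm feeling sick due to Nausea and fever.")` reduces the sentence to the tokens `feel`, `sick`, `nausea` and `fever`, which `bag_of_words` then turns into word counts.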

This workflow is carried out on different Data sets to make useful observations such as:

  • Understanding the behaviour of the virus
  • Observing close anatomy of the symptoms
  • Curating information about the hotspot localities
  • Predicting the vulnerability of the next location to get affected

After the text has been pre-processed with NLP, it is mapped into the vector domain using BERT.

BERT (Bidirectional Encoder Representations from Transformers, a language model from Google that is also used in Google Search) can resolve ambiguity in text.

For example,

1. I am sick of my friends

2. I am feeling sick due to Nausea and fever

The word “sick” has a different meaning in each sentence, depending on context. If the system stored every sentence that contains “sick”, it would end up with unnecessary, irrelevant information. BERT decides the relevance of a term by looking at the words that surround it. It first maps the text into the vector domain and then decides the context from the content cluster the text falls into, using medical language processing.

  • Sick, friends – Irrelevant Data
  • Sick, Nausea, fever – Useful Data
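
The idea of judging “sick” by its neighbouring words can be illustrated without BERT itself. The sketch below is a toy stand-in, not BERT: it maps each sentence to a word-count vector and measures its cosine similarity to a small, hypothetical medical vocabulary. The vocabulary, the function names and the 0.3 threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical reference vocabulary standing in for a learned "medical" cluster.
MEDICAL_TERMS = Counter(["sick", "nausea", "fever", "cough", "headache", "symptom"])


def to_vector(sentence):
    """Map a sentence to a sparse word-count vector."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_medically_relevant(sentence, threshold=0.3):
    """Keep a sentence only if enough of its words sit near the medical cluster."""
    return cosine_similarity(to_vector(sentence), MEDICAL_TERMS) >= threshold
```

With these assumptions, “I am sick of my friends” shares only one word with the medical vocabulary and is rejected, while “I am feeling sick due to Nausea and fever” shares three and is kept. A real BERT model learns such context from data rather than from a hand-written word list.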

b) Numerical Data

Numerical Data includes demographic figures such as the number of males and females in a population, and location details expressed as latitude and longitude. The number of people who have travelled across the world, along with their travel history, is also recorded. Machine Learning algorithms help draw useful conclusions from this numerical data.


With all this Data in hand, we can now use technology to extract information. While GPS tracking helps in locating people, Machine Learning algorithms help in Data clustering, where the data is grouped for further study.

It categorises information into clusters such as:

  • Cluster 1: The number of people who travelled to places likely to be hotspots -> A travel report concludes that people who can afford to travel abroad, and who tend to belong to affluent sections of society, are more vulnerable to catching COVID-19.
  • Cluster 2: Localities with a high number of infected people -> If the number of infected people in a place crosses a set threshold, the place can be termed a hotspot.
  • Cluster 3: Demography divided by age -> An age cluster is formed for a particular locality; it helps in evaluating the risk of the locality turning into a hotspot.
  • Cluster 4: Generic symptoms -> The symptoms a person reports are compared with the standard symptom list, and the deviation is evaluated. A group of people showing the same symptoms in the same locality can turn the place into a hotspot.
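
As a rough illustration of the clustering step and of the hotspot rule in Cluster 2, here is a minimal sketch that groups case locations with plain k-means and flags any group whose size crosses a threshold. The article does not say which clustering algorithm the app uses, so k-means is an assumption, as are the function names, the sample coordinates and the threshold. Treating latitude and longitude as flat coordinates is also a simplification that only holds roughly.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centroids


def hotspot_clusters(labels, threshold):
    """Return the cluster ids whose case count crosses the threshold."""
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return {lab for lab, n in counts.items() if n > threshold}


# Made-up case locations as (latitude, longitude) pairs.
cases = [(28.61, 77.20), (19.07, 72.87), (28.63, 77.22),
         (28.60, 77.19), (19.08, 72.88)]
labels, centroids = kmeans(cases, k=2)
hotspots = hotspot_clusters(labels, threshold=2)
```

Here three of the five made-up cases fall into one cluster, so only that cluster crosses the threshold of two and is flagged.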


Taking all the Data into account, the prediction system works on the clusters to find patterns and make predictions that help avoid future risks. After considering all the factors for a particular locality, it can assign a risk level and raise an alert. If you are staying close to a risk-prone area, you are alerted and can take precautionary measures.
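
One way to picture the proximity alert is a distance check between the user's GPS position and known hotspot locations. The sketch below uses the haversine formula for great-circle distance; the 1 km radius, the function names and the two-level "high"/"low" output are assumptions for illustration, not the app's documented logic.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def risk_level(user, hotspots, radius_km=1.0):
    """Return "high" if any hotspot lies within radius_km of the user, else "low"."""
    for lat, lon in hotspots:
        if haversine_km(user[0], user[1], lat, lon) <= radius_km:
            return "high"
    return "low"
```

A user standing a few dozen metres from a flagged location would get "high", while the same user checked against a hotspot in another city would get "low".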


The system is not static. The Data keeps changing every hour, every minute and every second. As new Data comes in, the model refreshes itself periodically. The application dashboard keeps you updated on whether you are in low-risk or high-risk proximity.
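
The periodic refresh can be sketched as a sliding time window: new reports are added as they arrive, old ones age out, and the risk level is re-evaluated on each check. The class name, the window length and the threshold below are hypothetical, and the sketch assumes reports arrive in time order.

```python
from collections import deque


class RollingCaseCounter:
    """Counts case reports for one locality inside a sliding time window."""

    def __init__(self, window_seconds, high_risk_threshold):
        self.window_seconds = window_seconds
        self.high_risk_threshold = high_risk_threshold
        self._timestamps = deque()  # report times, oldest first

    def report_case(self, timestamp):
        """Record a new case report (timestamps must be non-decreasing)."""
        self._timestamps.append(timestamp)

    def risk_level(self, now):
        """Drop reports older than the window, then classify the locality."""
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.high_risk_threshold:
            return "high"
        return "low"
```

With a one-hour window and a threshold of three, a locality that receives three reports within a few minutes is "high" right away and drops back to "low" once the earliest report is more than an hour old.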

The app, built on the trio of GPS, Bluetooth and Artificial Intelligence, also provides a list of all COVID-19 help centres and their contact numbers, along with a list of Dos and Don'ts as additional measures. To benefit from it, make sure Bluetooth and GPS on your phone are always ON.


It is indeed a useful and necessary step by the authorities to provide a tool that spreads awareness among people at such a severe time. The app has proven useful to the millions of people who have already downloaded it and are using it. One good thing about this app is that it does not breach users' privacy: only the Government of India can access your personal Data, and only to inform you about the pandemic.

The existence of Data Science and Artificial Intelligence is itself a boon to mankind. No doubt 7 out of 10 aspirants want to be Data Scientists in the future. The opportunity it holds for generations to come is immeasurable. It will be beneficial if more students look to it for a good future and to help mankind. Data Science is indeed a fascinating course for prospective students.

This is a good time not to hold back your aspirations just because the quarantine has kept you at home. You can still explore the possibilities of this beautiful course by enrolling on the various learning platforms that offer online Data Science courses.

Apart from learning, make sure you stay at home and stay safe. Let’s cooperate with the Government, medical staff and technology to fight this pandemic by strictly following the guidelines. Remember, we can think of a better future only if it exists.


Monica is a senior marketing executive. Her skill set includes digital marketing and strategy, SEO, marketing analysis and more. She also has expertise in writing various kinds of copy, including web, newsletters, e-books, social media, etc. But it does not stop there. Her love for writing extends to poetry that connects science and life.

Monica Swain
