Merged
10 changes: 10 additions & 0 deletions statvar_imports/india_ndap/ndap/India_LifeExpectancy_metadata.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
config,value
provenance_url,https://ndap.niti.gov.in/dataset/7375
header_rows,1
output_columns,"observationAbout,observationDate,value,variableMeasured,unit"
#places_within,country/IND
#place_types,"AdministrativeArea,AdministrativeArea1,AdministrativeArea2,State"
mapped_columns,3
mapped_rows,1
#input_rows,50
#debug,1
@@ -0,0 +1,24 @@
place_name,observationAbout,dcid
jammu and kashmir,observationAbout,wikidataId/Q1180
himachal pradesh,observationAbout,wikidataId/Q1177
punjab,observationAbout,wikidataId/Q22424
uttarakhand,observationAbout,wikidataId/Q1499
haryana,observationAbout,wikidataId/Q1174
delhi,observationAbout,wikidataId/Q1353
rajasthan,observationAbout,wikidataId/Q1437
uttar pradesh,observationAbout,wikidataId/Q1498
bihar,observationAbout,wikidataId/Q1165
assam,observationAbout,wikidataId/Q1164
West Bengal,observationAbout,wikidataId/Q1356
jharkhand,observationAbout,wikidataId/Q1184
odisha,observationAbout,wikidataId/Q22048
chhattisgarh,observationAbout,wikidataId/Q1168
madhya pradesh,observationAbout,wikidataId/Q1188
gujarat,observationAbout,wikidataId/Q1061
maharashtra,observationAbout,wikidataId/Q1191
kerala,observationAbout,wikidataId/Q1186
andhra pradesh,observationAbout,wikidataId/Q1159
karnataka,observationAbout,wikidataId/Q1185
tamil nadu,observationAbout,wikidataId/Q1445
telangana,observationAbout,wikidataId/Q677037
ladakh,observationAbout,wikidataId/Q200667
29 changes: 29 additions & 0 deletions statvar_imports/india_ndap/ndap/India_LifeExpectancy_pvmap.csv
@@ -0,0 +1,29 @@
key,prop,val,,,,,,
life expectancy,measuredProperty,lifeExpectancy,value,{Number},populationType,Person,unit,Year
male,gender,Male,,,,,,
female,gender,Female,,,,,,
Total,gender,"""""",,,,,,
srcyear,observationDate,{Number},,,,,,
jammu and kashmir,observationAbout,wikidataId/Q1180
himachal pradesh,observationAbout,wikidataId/Q1177
punjab,observationAbout,wikidataId/Q22424
uttarakhand,observationAbout,wikidataId/Q1499
haryana,observationAbout,wikidataId/Q1174
delhi,observationAbout,wikidataId/Q1353
rajasthan,observationAbout,wikidataId/Q1437
uttar pradesh,observationAbout,wikidataId/Q1498
bihar,observationAbout,wikidataId/Q1165
assam,observationAbout,wikidataId/Q1164
West Bengal,observationAbout,wikidataId/Q1356
jharkhand,observationAbout,wikidataId/Q1184
odisha,observationAbout,wikidataId/Q22048
chhattisgarh,observationAbout,wikidataId/Q1168
madhya pradesh,observationAbout,wikidataId/Q1188
gujarat,observationAbout,wikidataId/Q1061
maharashtra,observationAbout,wikidataId/Q1191
kerala,observationAbout,wikidataId/Q1186
andhra pradesh,observationAbout,wikidataId/Q1159
karnataka,observationAbout,wikidataId/Q1185
tamil nadu,observationAbout,wikidataId/Q1445
telangana,observationAbout,wikidataId/Q677037
ladakh,observationAbout,wikidataId/Q200667
19 changes: 19 additions & 0 deletions statvar_imports/india_ndap/ndap/README.md
@@ -0,0 +1,19 @@
# NDAP - India Life Expectancy

- source: `https://ndap.niti.gov.in/dataset/7375`

- how to download data: Run `python3 download_script.py` from this directory. The script pages through the NDAP API and writes `India_LifeExpectancy_input.csv` to the `input_files` path named in the config.

- type of place: State.

- statvars: Health

- years: 1997 to 2020

- place_resolution: Places are resolved based on name.
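
Name-based resolution amounts to a lookup from a normalized state name to a DCID. The following is a minimal sketch of that idea, not the resolver that `stat_var_processor.py` uses internally: the helper names `load_places` and `resolve_place` are hypothetical, and the inline CSV is a three-row excerpt of the place mapping in this import. It lowercases and trims names because the mapping mixes cases (for example `West Bengal` next to `jammu and kashmir`).

```python
import csv
import io

# Three-row excerpt of the place-name mapping shipped with this import.
PLACES_CSV = """place_name,observationAbout,dcid
jammu and kashmir,observationAbout,wikidataId/Q1180
himachal pradesh,observationAbout,wikidataId/Q1177
West Bengal,observationAbout,wikidataId/Q1356
"""


def load_places(text):
    """Builds a {normalized place name: dcid} dict from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {row['place_name'].strip().lower(): row['dcid'] for row in reader}


def resolve_place(name, places):
    """Hypothetical helper: returns the dcid for a state name, or None."""
    return places.get(name.strip().lower())


places = load_places(PLACES_CSV)
print(resolve_place('West Bengal', places))
print(resolve_place('  Himachal Pradesh ', places))
print(resolve_place('Goa', places))  # not in the excerpt, so None
```

A name with no mapping returns `None` here, which makes unresolved places easy to spot before they reach the output.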

### How to run:

`python3 download_script.py`

`python3 stat_var_processor.py --input_data=../../statvar_imports/india_ndap/ndap/input_files/India_LifeExpectancy_input.csv --pv_map=../../statvar_imports/india_ndap/ndap/India_LifeExpectancy_pvmap.csv --config_file=../../statvar_imports/india_ndap/ndap/India_LifeExpectancy_metadata.csv --output_path=../../statvar_imports/india_ndap/ndap/output/Life_expectancy`
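
The download step requests `pageno=1, 2, ...` and stops at the first page whose `Data` list is missing or empty, then flattens each record into a row. A minimal sketch of that loop follows, with the network call replaced by an in-memory stand-in so it runs offline. The names `fetch_all_pages`, `flatten`, and `FAKE_PAGES` are hypothetical, and the text of the `Year` field is an assumed example; the only thing taken from the script is that the year is the last comma-separated piece of that field.

```python
def fetch_all_pages(fetch_page):
    """Collects records page by page until a page has no 'Data' entries."""
    records = []
    page_num = 1
    while True:
        payload = fetch_page(page_num)
        data = payload.get('Data') if isinstance(payload, dict) else None
        if not isinstance(data, list) or not data:
            break
        records.extend(data)
        page_num += 1
    return records


def flatten(record, table_id='I7375_4'):
    """Turns one API record into (state, year, gender, value)."""
    # The year is taken as the last comma-separated piece of 'Year'.
    year = record['Year'].split(',')[-1].strip()
    return (record['StateName'], year, record['GENDER'],
            record[table_id]['TotalPopulationWeight'])


# Stand-in for the API: one page of data, then an empty page.
# The 'Year' string format is an assumption made for illustration.
FAKE_PAGES = {
    1: {
        'Data': [{
            'StateName': 'Himachal Pradesh',
            'Year': 'Calendar Year (Jan - Dec), 1997',
            'GENDER': 'Female',
            'I7375_4': {'TotalPopulationWeight': 65.2},
        }]
    },
    2: {'Data': []},
}

rows = [flatten(r) for r in fetch_all_pages(lambda n: FAKE_PAGES.get(n, {}))]
print(rows)
```

Passing the page fetcher in as a function keeps the loop testable without credentials or network access.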
2 changes: 2 additions & 0 deletions statvar_imports/india_ndap/ndap/config.py
@@ -0,0 +1,2 @@
url='https://loadqa.ndapapi.com/v1/openapi?API_Key=gAAAAABnrzyB1tiwHrggYMoK81fqeUFDhVWacu2CFsVRzMJ4A23xSK4Ov6JLZYHpVi3fQK2ikFB65Xo56PpgeISlZB0OEYfPltYSRIYRfPqb27TNDZd_AK-gdnEIbkyqVuaZCkO9RQYh44TG2mGZwaRC3Yo16s8Ks8-FXtTf2RsV5c-L243hAfGRI984Bed6UTUJMWmDe4Fh&ind=I7375_4&dim=Country,StateName,StateCode,Year,GENDER'
input_files='input_files/'
82 changes: 82 additions & 0 deletions statvar_imports/india_ndap/ndap/download_script.py
@@ -0,0 +1,82 @@
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the 'License');
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an 'AS IS' BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# How to run the script to download the files:
# python3 download_script.py
import json
import os
import pandas as pd
import sys
from absl import app
from absl import flags
from absl import logging
from google.cloud import storage
flags.DEFINE_string(
    'config_file_path',
    'gs://datcom-import-test/statvar_imports/india_ndap/ndap/config.json',
    'GCS path of the JSON config file holding the API url and input_files directory.')

_SCRIPT_PATH = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(_SCRIPT_PATH, '../../../util/'))
from download_util_script import _retry_method


def main(_):
    all_data = []
    _FLAGS = flags.FLAGS
    config_file_path = _FLAGS.config_file_path
    storage_client = storage.Client()
    # Split 'gs://<bucket>/<blob path>' into bucket and blob names.
    bucket_name = config_file_path.split('/')[2]
    bucket = storage_client.bucket(bucket_name)
    blob_name = '/'.join(config_file_path.split('/')[3:])
    blob = bucket.blob(blob_name)
    file_contents = blob.download_as_text()
    try:
        file_config = json.loads(file_contents)
        url = file_config.get('url')
        input_files = file_config.get('input_files')
    except json.JSONDecodeError:
        logging.fatal("Cannot extract url and input files path.")
    page_num = 1
    # Request pages until one comes back without any 'Data' entries.
    while True:
        api_url = f"{url}&pageno={page_num}"
        response = _retry_method(api_url, None, 3, 5, 2)
        response_data = response.json()
        if response_data and 'Data' in response_data and len(response_data['Data']) > 0:
            keys = response_data['Data']
            if not isinstance(keys, list):
                logging.fatal("Value for key 'Data' is not a list.")
            try:
                # Considering the table id I7375_4 which is specific to the import.
                for i in keys:
                    # The year is the last comma-separated piece of 'Year'.
                    year = i['Year'].split(",")[-1].strip()
                    row = (i['StateName'], year, i['GENDER'],
                           i['I7375_4']['TotalPopulationWeight'], year, i['Year'])
                    all_data.append(row)
            except KeyError as e:
                logging.fatal(f"Missing expected key '{e}' in the API response.")
            page_num += 1
        else:
            logging.error(f"failed to retrieve data from page {page_num}")
            break
    if all_data:
        df = pd.DataFrame(all_data,
                          columns=['srcStateName', 'srcYear', 'GENDER',
                                   'Life Expectancy', 'YearCode', 'Year'])
        os.makedirs(input_files, exist_ok=True)
        input_filename = os.path.join(input_files, 'India_LifeExpectancy_input.csv')
        df.to_csv(input_filename, index=False)
        logging.info("Data saved to India_LifeExpectancy_input.csv")
    else:
        logging.info("No data was retrieved from the API.")


if __name__ == "__main__":
    app.run(main)
26 changes: 26 additions & 0 deletions statvar_imports/india_ndap/ndap/manifest.json
@@ -0,0 +1,26 @@
{
  "import_specifications": [
    {
      "import_name": "India_LifeExpectancy",
      "curator_emails": [
        "support@datacommons.org"
      ],
      "provenance_url": "https://ndap.niti.gov.in/dataset/7375",
      "provenance_description": "Life expectancy of a person in India based on gender at state level.",
      "scripts": [
        "download_script.py",
        "../../../tools/statvar_importer/stat_var_processor.py --input_data=input_files/India_LifeExpectancy_input.csv --pv_map='India_LifeExpectancy_pvmap.csv' --config_file=India_LifeExpectancy_metadata.csv --output_path=output/Life_expectancy"
      ],
      "import_inputs": [
        {
          "template_mcf": "output/Life_expectancy.tmcf",
          "cleaned_csv": "output/Life_expectancy.csv"
        }
      ],
      "source_files": [
        "input_files/India_LifeExpectancy_input.csv"
      ],
      "cron_schedule": "00 11 1,15 * *"
    }
  ]
}
96 changes: 96 additions & 0 deletions statvar_imports/india_ndap/ndap/test_data/India_LifeExpectancy.csv
@@ -0,0 +1,96 @@
observationAbout,observationDate,value,variableMeasured,unit
wikidataId/Q1180,2002,67.3,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2002,64.7,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2002,65.9,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2003,68,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2003,64.4,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2003,66,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2004,68.9,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2004,65.9,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2004,67.3,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2005,69.9,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2005,67.1,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2005,68.4,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2006,71,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2006,69.2,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2006,70,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2007,70.8,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2007,68.6,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2007,69.6,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2008,70.6,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2008,69,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2008,69.8,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2009,71.2,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2009,68.9,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2009,70,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2010,71.1,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2010,69.2,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2010,70.1,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2011,71.9,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2011,69.4,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2011,70.5,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2012,72.4,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2012,69.9,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2012,71,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2013,74,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2013,70.6,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2013,72,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2014,74.9,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2014,70.9,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2014,72.6,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2015,76.1,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2015,71.2,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2015,73.2,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2016,68.5,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2016,71.6,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2016,73.5,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2017,76.7,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2017,72.1,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2017,74.1,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2018,76.2,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2018,72.2,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2018,74,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2019,76.1,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2019,72.6,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2019,74.2,dcid:LifeExpectancy_Person,Year
wikidataId/Q1180,2020,76.3,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1180,2020,72.6,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1180,2020,74.3,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,1997,65.2,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,1997,64.6,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,1997,64.9,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,1998,65.5,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,1998,64.8,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,1998,65.2,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,1999,67.7,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,1999,64.8,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,1999,66.2,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2000,68.3,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2000,65.7,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2000,67,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2001,69.4,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2001,66.3,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2001,67.8,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2002,70.2,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2002,66.6,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2002,68.3,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2003,71.4,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2003,67.1,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2003,69.1,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2004,72,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2004,67.3,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2004,69.5,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2005,71.8,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2005,67.4,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2005,69.5,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2006,72,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2006,67.4,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2006,69.6,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2007,72.2,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2007,67.8,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2007,69.9,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2008,72.1,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2008,67.7,dcid:LifeExpectancy_Person_Male,Year
wikidataId/Q1177,2008,69.8,dcid:LifeExpectancy_Person,Year
wikidataId/Q1177,2009,72,dcid:LifeExpectancy_Person_Female,Year
wikidataId/Q1177,2009,67.7,dcid:LifeExpectancy_Person_Male,Year
@@ -0,0 +1,7 @@
Node: E:India_LifeExpectancy->E0
observationAbout: C:India_LifeExpectancy->observationAbout
observationDate: C:India_LifeExpectancy->observationDate
value: C:India_LifeExpectancy->value
variableMeasured: C:India_LifeExpectancy->variableMeasured
unit: C:India_LifeExpectancy->unit
typeOf: dcs:StatVarObservation