Merged
1 change: 1 addition & 0 deletions EXTRAS/homeworks_to_submit/1036785977/homework_05/void
@@ -0,0 +1 @@
homework 5
@@ -0,0 +1 @@
{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"authorship_tag":"ABX9TyPfaywRXEuKWu7dSDkj6SOx"},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":["# Homework 6 _ JSON\n","# Santiago Ruiz Piedrahita\n","\n","HOMEWORK\n","\n","In a Jupyter notebook, carry out the following steps.\n","\n","Download the JSON with the list of countries from the following link:\n","https://datahub.io/core/country-list/r/data.json\n","and choose a European country at random\n","(the example below uses Colombia, but a European country is easier).\n","\n","Use the different API endpoints of INSPIRE-HEP\n","https://inspirehep.net/\n","\n","to extract the list of researchers of an institution of that country from that database.\n","\n","\n","To do so:\n","\n","A) Use the institutions API to extract the list of institutions of the country, for example: colombia (preferably a European country, because the data are more complete)\n","https://inspirehep.net/api/institutions?q=colombia\n","\n","\n","For the first institution with\n","\n","number_of_papers > 0\n","\n","\n","get the value of\n","\n","legacy_ICN:\n","\n","\n","For example:\n","Colombia, U.
Natl.\n","\n","\n","If no institution satisfies the condition number_of_papers > 0, choose another random country and repeat the process.\n","\n"],"metadata":{"id":"HHS4qx3S4I7B"}},{"cell_type":"code","source":["# libraries\n","import json\n","import requests\n","import numpy as np\n","import pandas as pd"],"metadata":{"id":"8xN7--Ws4GIj","executionInfo":{"status":"ok","timestamp":1669750684552,"user_tz":300,"elapsed":1978,"user":{"displayName":"SANTIAGO RUIZ PIEDRAHITA","userId":"01504872925764674078"}}},"execution_count":1,"outputs":[]},{"cell_type":"code","execution_count":2,"metadata":{"id":"Jgp2qPIG3msx","executionInfo":{"status":"ok","timestamp":1669750825604,"user_tz":300,"elapsed":196,"user":{"displayName":"SANTIAGO RUIZ PIEDRAHITA","userId":"01504872925764674078"}}},"outputs":[],"source":["# read the country list (parse the JSON and close the file)\n","with open(\"data_json.json\") as f:\n","    data = json.load(f)"]},{"cell_type":"code","source":["# choosing a country at random\n","pais = \"Estonia\""],"metadata":{"id":"4zm6MAtt57Cn","executionInfo":{"status":"ok","timestamp":1669750931602,"user_tz":300,"elapsed":193,"user":{"displayName":"SANTIAGO RUIZ PIEDRAHITA","userId":"01504872925764674078"}}},"execution_count":3,"outputs":[]},{"cell_type":"code","source":["# look up the institutions\n","URL = \"https://inspirehep.net/api/institutions?q={}\".format(pais)\n","institucion = requests.get(URL)\n","\n","legacy_ICN = []\n","\n","for i in institucion.json().get('hits').get('hits'):\n","\n","    legacy = i.get('metadata').get('legacy_ICN')\n","    papers = i.get('metadata').get('number_of_papers') or 0  # treat a missing count as 0\n","\n","    if papers > 0:\n","        legacy_ICN.append(legacy)\n","\n","legacy_ICN"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"xSNUMTIT57E-","executionInfo":{"status":"ok","timestamp":1669751239290,"user_tz":300,"elapsed":744,"user":{"displayName":"SANTIAGO RUIZ
PIEDRAHITA","userId":"01504872925764674078"}},"outputId":"d7422bff-2f76-4bd6-8e38-7f20203b4aa1"},"execution_count":4,"outputs":[{"output_type":"execute_result","data":{"text/plain":["['Unlisted, EE',\n"," 'Tartu, Inst. Phys.',\n"," 'Tartu, Inst. Astrophys.',\n"," 'Tartu Observ.',\n"," 'Estonian U.',\n"," 'Tartu State U.',\n"," 'Tallinn Polytechnic Inst.',\n"," 'Comp. Sci. Coll., Tallinn',\n"," 'Tallinn Pedagogical U.',\n"," 'Estonian Agricultural U.']"]},"metadata":{},"execution_count":4}]},{"cell_type":"markdown","source":["B) Using the literature API, get the JSON with the articles with fewer than 10 authors, using \"legacy_ICN\" as follows:\n","\n","https://inspirehep.net/api/literature?sort=mostrecent&page=1&q=aff+Colombia,+U.+Natl.+and+ac+1->+10\n","\n","\n","aff: uses the value of legacy_ICN\n","and: is a logical operator\n","ac: sets the number of authors between 1 and 10\n","\n"],"metadata":{"id":"HFWKqlSU4jSZ"}},{"cell_type":"code","source":["# build the literature query from the first legacy_ICN\n","link = legacy_ICN[0].replace(' ','+')\n","lit = \"https://inspirehep.net/api/literature?sort=mostrecent&page=1&q=aff+{}+and+ac+1->+10\".format(link)\n","lit = requests.get(lit)\n","\n","\n","art = []\n","for i in lit.json().get('hits').get('hits'):\n","    art.append(i.get('metadata'))\n","\n","len(art)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"Ec0GkMKL4j5W","executionInfo":{"status":"ok","timestamp":1669752408672,"user_tz":300,"elapsed":1640,"user":{"displayName":"SANTIAGO RUIZ PIEDRAHITA","userId":"01504872925764674078"}},"outputId":"4187cc06-9d49-43a8-eb90-67ad550bded6"},"execution_count":11,"outputs":[{"output_type":"execute_result","data":{"text/plain":["7"]},"metadata":{},"execution_count":11}]},{"cell_type":"markdown","source":["C) For at least one article from that institution, extract the profile URL of each author of that institution, which is found inside the \"authors\" field, under \"record\" and then \"$ref\".
For example:\n","\n","https://inspirehep.net/api/authors/1010271\n","\n","\n"],"metadata":{"id":"W4v0-tLf7qr8"}},{"cell_type":"code","source":["# extracting the authors\n","autores = []\n","for i in art[0].get('authors'):\n","    autores.append(i.get('record').get('$ref'))\n","\n","autores"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"VB3xU83y_Lge","executionInfo":{"status":"ok","timestamp":1669752425750,"user_tz":300,"elapsed":205,"user":{"displayName":"SANTIAGO RUIZ PIEDRAHITA","userId":"01504872925764674078"}},"outputId":"0cf2b0dd-543c-44b8-c187-97a3c75e0277"},"execution_count":12,"outputs":[{"output_type":"execute_result","data":{"text/plain":["['https://inspirehep.net/api/authors/1259434',\n"," 'https://inspirehep.net/api/authors/1334455',\n"," 'https://inspirehep.net/api/authors/1274396']"]},"metadata":{},"execution_count":12}]},{"cell_type":"markdown","source":["D) From the API result for each profile, build a table with the following columns (some of the data may not be available):\n","Full name\n","Email address\n","Most recent position (the first one that appears in the \"positions\" list of the JSON) with its corresponding:\n","rank\n","institution\n","start date\n","end date"],"metadata":{"id":"WluzuBT17quA"}},{"cell_type":"code","source":["# lists to store the data\n","nombre = []\n","email = []\n","posicion = []\n","institucion = []\n","F_inicio = []\n","F_finalizacion = []"],"metadata":{"id":"bpHAgbnVA5PR","executionInfo":{"status":"ok","timestamp":1669754133615,"user_tz":300,"elapsed":230,"user":{"displayName":"SANTIAGO RUIZ PIEDRAHITA","userId":"01504872925764674078"}}},"execution_count":18,"outputs":[]},{"cell_type":"code","source":["for i in autores:\n","\n","    info = requests.get(i).json().get('metadata')\n","\n","    nombre.append(info.get('name', {}).get('value'))\n","\n","    # some fields may be missing, so fall back to an empty record\n","    emails = info.get('email_addresses') or [{}]\n","    email.append(emails[0].get('value'))\n","\n","    # most recent position: the first entry of 'positions'\n","    pos = (info.get('positions') or [{}])[0]\n","    posicion.append(pos.get('rank'))\n","    institucion.append(pos.get('institution'))\n","    F_inicio.append(pos.get('start_date'))\n","    F_finalizacion.append(pos.get('end_date'))"],"metadata":{"id":"33dQeCPhA5R5","executionInfo":{"status":"ok","timestamp":1669754135979,"user_tz":300,"elapsed":1697,"user":{"displayName":"SANTIAGO RUIZ PIEDRAHITA","userId":"01504872925764674078"}}},"execution_count":19,"outputs":[]},{"cell_type":"code","source":["# collect the data into a table\n","informacion = {\"Nombre\":nombre,\"Email\":email,\"Posicion\":posicion,\"Institucion\":institucion,\"F_inicio\":F_inicio,\"F_finalizacion\":F_finalizacion}\n","pd.DataFrame(informacion)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":143},"id":"97z-0HG7A5UW","executionInfo":{"status":"ok","timestamp":1669754499390,"user_tz":300,"elapsed":202,"user":{"displayName":"SANTIAGO RUIZ PIEDRAHITA","userId":"01504872925764674078"}},"outputId":"aa431f2f-0682-4679-f0f1-6ee6f9ac6b64"},"execution_count":20,"outputs":[{"output_type":"execute_result","data":{"text/plain":[" Nombre Email Posicion Institucion \\\n","0 Lewicki, Marek marek.lewicki@fuw.edu.pl None Warsaw U. \n","1 Vaskonen, Ville vvaskonen@ifae.es POSTDOC Padua U.
\n","2 Veermäe, Hardi hardi.veermae@cern.ch None NICPB, Tallinn \n","\n"," F_inicio F_finalizacion \n","0 2020 None \n","1 2022 None \n","2 None None "]},"metadata":{},"execution_count":20}]},{"cell_type":"code","source":[],"metadata":{"id":"SWxLKEJvH65l"},"execution_count":null,"outputs":[]}]}
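A minimal offline sketch of the selection logic in steps A and D of the notebook above, kept apart from the network calls so it can be checked without querying the API. The helper names (`pick_country`, `first_icn_with_papers`, `author_row`) are hypothetical and do not appear in the notebook; the `Name` key of the country records is an assumption about the datahub file; the sample records at the bottom are made up for illustration and are not real INSPIRE output.

```python
import random


def pick_country(countries, seed=None):
    """Pick one country name at random.

    Assumption: `countries` is a list of dicts that each carry a 'Name' key.
    """
    rng = random.Random(seed)
    return rng.choice(countries)["Name"]


def first_icn_with_papers(hits):
    """Return legacy_ICN of the first institution with number_of_papers > 0.

    `hits` is the list found under hits -> hits in the institutions response.
    Returns None when no institution qualifies.
    """
    for hit in hits:
        meta = hit.get("metadata") or {}
        # A missing or null paper count is treated as 0.
        if (meta.get("number_of_papers") or 0) > 0:
            return meta.get("legacy_ICN")
    return None


def author_row(metadata):
    """Build one table row from an author's metadata; missing fields become None."""
    emails = metadata.get("email_addresses") or [{}]
    # The first entry of 'positions' is taken as the most recent one.
    position = (metadata.get("positions") or [{}])[0]
    return {
        "Nombre": (metadata.get("name") or {}).get("value"),
        "Email": emails[0].get("value"),
        "Posicion": position.get("rank"),
        "Institucion": position.get("institution"),
        "F_inicio": position.get("start_date"),
        "F_finalizacion": position.get("end_date"),
    }


# Made-up sample inputs, for illustration only.
sample_countries = [{"Name": "Estonia", "Code": "EE"}, {"Name": "Latvia", "Code": "LV"}]
sample_hits = [
    {"metadata": {"legacy_ICN": "Inst. A", "number_of_papers": 0}},
    {"metadata": {"legacy_ICN": "Inst. B", "number_of_papers": 5}},
]
country = pick_country(sample_countries, seed=1)
icn = first_icn_with_papers(sample_hits)
row = author_row({"name": {"value": "Doe, Jane"}})
```

If `first_icn_with_papers` returns `None`, the rule stated in step A applies: pick another country and repeat. Rows built with `author_row` can be passed as a list to `pd.DataFrame` to get the same columns as the notebook's table.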
1 change: 1 addition & 0 deletions EXTRAS/homeworks_to_submit/1036785977/homework_06/void
@@ -0,0 +1 @@
homework 6