<html><head><meta content="text/html; charset=UTF-8" http-equiv="content-type"><style type="text/css">ol{margin:0;padding:0}table td,table th{padding:0}.c0{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Arial";font-style:normal}.c3{color:#000000;font-weight:700;text-decoration:none;vertical-align:baseline;font-size:12pt;font-family:"Arial";font-style:normal}.c1{padding-top:0pt;padding-bottom:0pt;line-height:1.15;orphans:2;widows:2;text-align:center}.c2{padding-top:0pt;padding-bottom:0pt;line-height:1.15;orphans:2;widows:2;text-align:left}.c6{text-decoration-skip-ink:none;-webkit-text-decoration-skip:none;color:#1155cc;text-decoration:underline}.c7{background-color:#f9cb9c;max-width:468pt;padding:72pt 72pt 72pt 72pt}.c5{color:inherit;text-decoration:inherit}.c4{height:11pt}.title{padding-top:0pt;color:#000000;font-size:26pt;padding-bottom:3pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}.subtitle{padding-top:0pt;color:#666666;font-size:15pt;padding-bottom:16pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}li{color:#000000;font-size:11pt;font-family:"Arial"}p{margin:0;color:#000000;font-size:11pt;font-family:"Arial"}h1{padding-top:20pt;color:#000000;font-size:20pt;padding-bottom:6pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h2{padding-top:18pt;color:#000000;font-size:16pt;padding-bottom:6pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h3{padding-top:16pt;color:#434343;font-size:14pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h4{padding-top:14pt;color:#666666;font-size:12pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h5{padding-top:12pt;color:#666666;font-size:11pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h6{padding-top:12pt;color:#666666;font-size:11pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;font-style:italic;orphans:2;widows:2;text-align:left}</style></head><body class="c7 doc-content"><p class="c2"><span class="c6"><a class="c5" href="https://www.google.com/url?q=https://diogo-code.github.io/&sa=D&source=editors&ust=1679709424276869&usg=AOvVaw26xnC1xxBIHvorFdglLNfA">Return to main page</a></span></p><p class="c2 c4"><span class="c0"></span></p><p class="c2"><span class="c3">What has been done for the 2nd demo delivery:</span></p><p class="c2 c4"><span class="c3"></span></p><p class="c1"><span class="c3">Integrating the Google Search Library</span></p><p class="c1"><span class="c0">We have integrated the Google Search Library. This allowed us to pull more than 400 websites from a search target list containing various agronomic search terms. Using this library has increased the speed and accuracy of our search results.</span></p><p class="c2 c4"><span class="c0"></span></p><p class="c1"><span class="c3">Improving File Downloads</span></p><p class="c1"><span class="c0">From the pulled websites, we targeted specific file types for download: '.csv', '.xls', '.json', '.kml', '.kmz', and '.shp'. Additionally, '.zip' files were removed from the target list. This ensured that we only downloaded relevant files.</span></p><p class="c2 c4"><span class="c0"></span></p><p class="c2 c4"><span class="c0"></span></p><p class="c1"><span class="c3">Bypassing HTTP Request Search Limits</span></p><p class="c1"><span class="c0">We ran into limits on HTTP requests to Google. It appears that Google restricts searches made in rapid succession.
To overcome this challenge, we introduced pauses in our code, which let us work around the limitation.</span></p><p class="c2 c4"><span class="c0"></span></p><p class="c2"><span class="c3">Next Steps:</span></p><p class="c2 c4"><span class="c3"></span></p><p class="c1"><span class="c3">Refining the Search Name Target List</span></p><p class="c1"><span class="c0">To further improve the accuracy of our searches, we plan to add more names to the target list. This will help filter out websites that do not meet our specific criteria.</span></p><p class="c1 c4"><span class="c0"></span></p><p class="c1"><span class="c3">Creating a Download Name Target List</span></p><p class="c1"><span class="c0">To make downloaded files more likely to contain agronomic data, we plan to add a filter that checks the name of each download file.</span></p><p class="c1 c4"><span class="c0"></span></p><p class="c1"><span class="c3">Preventing Duplicate Downloads</span></p><p class="c1"><span class="c0">To address the issue of duplicate download files, we will develop a check that prevents the same file from being downloaded more than once.</span></p><p class="c1 c4"><span class="c0"></span></p><p class="c1"><span class="c3">Adding Website Names to Downloaded Files</span></p><p class="c1"><span class="c0">Lastly, we want to record the respective website name alongside each downloaded file in the .txt file. This will add essential context for each downloaded file and let us easily trace it back to its source.</span></p></body></html>