WebScrappingPractice

Code for web scraping practice using the requests and BeautifulSoup packages in Python.

Practice 1. Obtain URL from Google

See google.py for the code.

Practice 2. Extract all links from the White House website that point to briefings and statements

See whiteHouse.py for the code.
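As a rough illustration of the kind of filtering this practice involves, the sketch below pulls every link whose address contains a given fragment out of an HTML string. It is not the code in whiteHouse.py: the function name `extract_links`, the `"briefings-statements"` path fragment, and the sample HTML are all assumptions made for the example.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url, fragment):
    """Return absolute URLs of <a> tags whose href contains `fragment`."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        if fragment in href:
            # Resolve relative links against the page URL.
            links.append(urljoin(base_url, href))
    return links


# Hypothetical sample page; the real site markup may differ.
SAMPLE_HTML = """
<a href="/briefings-statements/remarks-1/">Remarks</a>
<a href="/about/">About</a>
<a href="https://www.whitehouse.gov/briefings-statements/statement-2/">Statement</a>
"""

if __name__ == "__main__":
    base = "https://www.whitehouse.gov"
    for url in extract_links(SAMPLE_HTML, base, "briefings-statements"):
        print(url)
```

In a live run, the HTML string would come from the response body of a requests call instead of the sample constant.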

Practice 3. Obtain most of the race results from the Hong Kong Jockey Club

The requests package could not download the page source from this website, so Selenium, a browser automation package, is used to load each page in a real browser and retrieve its source. BeautifulSoup then extracts the information needed: every race result on every race day available from the website. The results are written to a CSV file for further processing.
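The parse-then-export step can be sketched as follows. This is a minimal example under stated assumptions, not the logic in HKJCreadResult.py: the helper names `table_rows` and `write_csv` are hypothetical, and the sample table only stands in for whatever markup the results pages actually use.

```python
import csv
import io

from bs4 import BeautifulSoup


def table_rows(html):
    """Return the cell text of every non-empty table row in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


def write_csv(rows, handle):
    """Write the rows to an open text file handle in CSV format."""
    csv.writer(handle, lineterminator="\n").writerows(rows)


# Hypothetical markup standing in for one race's results table.
SAMPLE_HTML = """
<table>
  <tr><th>Place</th><th>Horse</th></tr>
  <tr><td>1</td><td>EXAMPLE ONE</td></tr>
  <tr><td>2</td><td>EXAMPLE TWO</td></tr>
</table>
"""

if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(table_rows(SAMPLE_HTML), buffer)
    print(buffer.getvalue())
```

To write a file on disk instead of an in-memory buffer, pass a handle opened with `open(path, "w", newline="")`, which is the mode the csv module documentation recommends.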

To run the code, install selenium and download chromedriver.exe into the same folder as the script.
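With both pieces in place, fetching a page's rendered source might look like the sketch below. It assumes the Selenium 4 API, where the driver path is passed through a `Service` object; older Selenium releases passed the path differently, and the script in this repository may well use that older form. The function name `fetch_page_source` is hypothetical.

```python
def fetch_page_source(url, driver_path="chromedriver.exe"):
    """Open `url` in Chrome and return the page source as a string."""
    # Imported here so this module can be loaded before selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    driver = webdriver.Chrome(service=Service(executable_path=driver_path))
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Calling `fetch_page_source(...)` with a results page address returns a string that can be handed straight to BeautifulSoup. Pages that fill in their content after loading may need an explicit wait before `page_source` is read.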

Because the script drives a web browser, a full run takes more than an hour. For a quicker test, modify the code so that it covers only part of the data.
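One simple way to shorten a test run is to cap how many items the main loop visits. The sketch below assumes the script iterates over some sequence of race days; the `limit` helper and the `race_days` values are hypothetical placeholders, not names from HKJCreadResult.py.

```python
from itertools import islice


def limit(items, max_items=None):
    """Return an iterator over at most `max_items` items (all if None)."""
    return islice(items, max_items)


# Hypothetical race-day identifiers; the real script builds its own list.
race_days = ["day-1", "day-2", "day-3", "day-4"]

if __name__ == "__main__":
    for day in limit(race_days, 2):
        # A real run would scrape the results for this day here.
        print(day)
```

Leaving `max_items` as `None` restores the full run, so the cap can be switched on and off without changing the loop body.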

See HKJCreadResult.py for the code and HKJCResults.csv for the results.


3LexW/WebScrappingPractice

Folders and files

Latest commit

History

Repository files navigation

WebScrappingPractice

Practice 1. Obtain URL from Google

Practice 2. Extract all of the links from White House that point to the briefings and statements

Practice 3. Obtain most of the races result from Hong Kong Jockey Club

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages