This repository was archived by the owner on Apr 8, 2019. It is now read-only.

Add FTP Functionality for daily schedules #15

@axsaucedo

FTP Functionality for Daily Schedules

This functionality is critical for providing accurate results for cancellations and delays.

Not all schedules are published throughout the day

After some research we found that not all schedules are published throughout the day, only the updated ones. This means we can never compute an accurate ratio of cancelled vs fulfilled schedules, because we never see the full number of fulfilled schedules.

Daily FTP Schedule update

In order to solve this problem, we will need to add all initial schedules at the beginning of each day. This will consist of downloading the schedule file for the day from the FTP server, and adding it to the database.
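
As a rough illustration of the "adding it to the database" step, here is a minimal sketch that streams a downloaded .xml.gz schedule file into a table. It is not the project's actual loader: the table name daily_callingpoint, the use of SQLite, and the XML names (Journey, rid, tpl, ptd, pta) are all assumptions made for the example and should be checked against the real schedule file.

```python
import gzip
import sqlite3
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix: '{ns}Journey' -> 'Journey'."""
    return tag.rsplit("}", 1)[-1]


def load_daily_schedules(gz_path, conn):
    """Load every journey's calling points from a gzipped schedule file.

    Returns the number of journeys read.
    """
    # Hypothetical table; the real schema may differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_callingpoint ("
        "rid TEXT, seq INTEGER, tpl TEXT, scheduled_time TEXT, "
        "PRIMARY KEY (rid, seq))"
    )
    journeys = 0
    with gzip.open(gz_path, "rb") as fh:
        # iterparse streams the file instead of building the whole tree first.
        for _, elem in ET.iterparse(fh, events=("end",)):
            if local_name(elem.tag) != "Journey":  # assumed element name
                continue
            rid = elem.get("rid")  # assumed attribute name
            for seq, point in enumerate(elem):
                tpl = point.get("tpl")  # assumed attribute name
                if tpl is None:
                    continue  # child element is not a calling point
                # Prefer public departure, then public arrival (assumed names).
                time = point.get("ptd") or point.get("pta")
                conn.execute(
                    "INSERT OR REPLACE INTO daily_callingpoint VALUES (?, ?, ?, ?)",
                    (rid, seq, tpl, time),
                )
            journeys += 1
            elem.clear()  # release the children of the processed journey
    conn.commit()
    return journeys
```

Running this once per day, right after the file is downloaded, would give us the full set of schedules for the day to count against.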

Schedules for the day might need a different table

This is based on the assumption that schedules are updated when there is a delay. This is a fundamental flaw in the Darwin system: calling points are explicitly marked as cancelled when relevant, but there is no equivalent marking when they are delayed.
** This should be discussed **. If this assumption is correct, and schedule times are actually updated when trains are delayed, then we will need to store the daily schedules in a different table. Keeping them in a different table will allow us to compare the updated schedules with their original state and so extract the number of delays.
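
To make the comparison idea concrete, here is a small sketch, assuming the original schedules live in a hypothetical daily_callingpoint table and the updated ones in callingpoint, both with the illustrative columns rid, seq, tpl and scheduled_time (zero-padded "HH:MM" strings). It only shows the shape of the query; it ignores services that run past midnight.

```python
import sqlite3

# Hypothetical schema: one table for the schedules as first published,
# one for the schedules as later updated.
SCHEMA = """
CREATE TABLE IF NOT EXISTS daily_callingpoint (
    rid TEXT, seq INTEGER, tpl TEXT, scheduled_time TEXT,
    PRIMARY KEY (rid, seq));
CREATE TABLE IF NOT EXISTS callingpoint (
    rid TEXT, seq INTEGER, tpl TEXT, scheduled_time TEXT,
    PRIMARY KEY (rid, seq));
"""

# A calling point counts as delayed when its updated time is later than
# the time in the original daily schedule.
DELAYED_SQL = """
SELECT o.rid, o.tpl, o.scheduled_time, u.scheduled_time
FROM daily_callingpoint AS o
JOIN callingpoint AS u ON u.rid = o.rid AND u.seq = o.seq
WHERE u.scheduled_time > o.scheduled_time
ORDER BY o.rid, o.seq
"""


def find_delays(conn):
    """Return (rid, tpl, original_time, updated_time) for delayed calling points."""
    return conn.execute(DELAYED_SQL).fetchall()
```

The number of delays is then len(find_delays(conn)), and the original table also provides the full count of schedules needed for the ratio.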

Future considerations

If the above is correct, we could add another flag called 'delayed' to the callingpoint table. Whenever we update a calling point, we could also record whether it has been delayed, which would make database computations simpler.
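
A minimal sketch of how such a flag could work, assuming a SQLite callingpoint table with the illustrative columns rid, seq and scheduled_time (zero-padded "HH:MM" strings). The function names are hypothetical and the real update path may look quite different.

```python
import sqlite3


def add_delayed_flag(conn):
    """Add the 'delayed' column; existing rows start as not delayed."""
    conn.execute(
        "ALTER TABLE callingpoint ADD COLUMN delayed INTEGER NOT NULL DEFAULT 0"
    )


def update_calling_point(conn, rid, seq, new_time):
    """Store a new time and set the flag if the new time is later.

    SQLite evaluates the CASE against the stored (pre-update) time,
    so the comparison still sees the old value. Once set, the flag
    is kept even if a later update moves the time back.
    """
    conn.execute(
        "UPDATE callingpoint "
        "SET delayed = CASE WHEN ? > scheduled_time THEN 1 ELSE delayed END, "
        "    scheduled_time = ? "
        "WHERE rid = ? AND seq = ?",
        (new_time, new_time, rid, seq),
    )


def count_delayed(conn):
    """Count calling points flagged as delayed."""
    return conn.execute(
        "SELECT COUNT(*) FROM callingpoint WHERE delayed = 1"
    ).fetchone()[0]
```

With the flag in place, counting delays becomes a single filtered COUNT instead of a join against the original schedules.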

Current data

A cronjob was installed on the PROD server to download the schedules every morning at 9:00am. The script looks as follows:

#!/bin/bash
wget "ftp://ftpuser:A!t4398htw4ho4jy@datafeeds.nationalrail.co.uk/*_v8.xml.gz" -P /home/ubuntu/daily_schedules/
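# Send an email reporting whether the download succeeded ($? is wget's exit status)
if [ $? -eq 0 ]; then
    echo "File Downloaded" | mail -s "Schedule File Downloaded" "alejandro@hackpartners.com"
else
    echo "Download Failed" | mail -s "Schedule File Download Failed" "alejandro@hackpartners.com"
fi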

The schedules for the past 3 days have been downloaded.
