diff --git a/Documentation/WMPL_Upgrades_2026April.md b/Documentation/WMPL_Upgrades_2026April.md
new file mode 100644
index 00000000..3f4a1c08
--- /dev/null
+++ b/Documentation/WMPL_Upgrades_2026April.md
@@ -0,0 +1,183 @@
+# WMPL Upgrades
+by Mark McIntyre, April 2026
+
+## Key points
+
+- Added a new operation mode to create candidates.
+- Added distributed processing for candidates.
+- Added checks for duplicate transactions.
+- Replaced the JSON database with SQLite databases.
+- Slight changes to the command-line options.
+
+### Operation Modes
+
+The updated solver now has three core operational modes, numbered 4, 1 and 2. In the code these are MCMODE_CANDS, MCMODE_PHASE1 and MCMODE_PHASE2. The previous mode 1 has been split into two stages, numbered 4 and 1, as explained below.
+
+Here's what each phase does.
+
+- In mcmode 4, the solver finds and saves candidate groups of observations.
+  During this phase, unpaired observations are loaded and candidate groups are found. Observations are excluded if they're already marked as paired in the observations database, and potential candidate groups are also checked against the candidate database to avoid reanalysing combinations that were already found. Any remaining new candidates are then added to the candidate database and saved to disk.
+
+- In mcmode 1, the solver loads candidates created by the previous step and attempts to find a simple solution.
+  If successful, the trajectory is saved to disk and a copy placed in the 'phase1' folder for further analysis, while the trajectory and observations databases are updated accordingly. If unsuccessful, the trajectory is added to the list of failed trajectories in the trajectories database.
+
+- In mcmode 2, the solver loads phase1 solutions and performs Monte-Carlo analysis. This mode is unchanged from previous versions.
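These mode numbers are bit flags and can be combined. A minimal Python sketch of how an mcmode value resolves into phases is shown below; the constant names are taken from the code, but the `phases_to_run` helper is purely illustrative and is not the solver's actual implementation:

```python
# Mode constants as described above (names from the WMPL code)
MCMODE_CANDS = 4   # find and save candidate groups
MCMODE_PHASE1 = 1  # solve candidates into phase1 solutions
MCMODE_PHASE2 = 2  # Monte-Carlo analysis of phase1 solutions

MCMODE_ALL = MCMODE_CANDS | MCMODE_PHASE1 | MCMODE_PHASE2  # 7


def phases_to_run(mcmode):
    """Return the set of phases a given mcmode value enables.

    Zero (or no mcmode at all) and any unrecognised combination are
    treated as 7, i.e. all three phases run.
    """
    valid = (MCMODE_CANDS, MCMODE_PHASE1, MCMODE_PHASE2,
             MCMODE_PHASE1 | MCMODE_PHASE2,   # 3, MCMODE_SIMPLE
             MCMODE_CANDS | MCMODE_PHASE1,    # 5, MCMODE_BOTH
             MCMODE_ALL)
    if mcmode not in valid:
        mcmode = MCMODE_ALL
    # bitwise AND picks out which phase flags are set
    return {m for m in (MCMODE_CANDS, MCMODE_PHASE1, MCMODE_PHASE2) if mcmode & m}
```

For example, `phases_to_run(5)` yields the candidate-finding and phase-1 stages, matching the MCMODE_BOTH row in the table that follows.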
+Some bitwise combinations of modes are permitted, as shown in the table below:
+
+| Value | Effect | Example Use |
+| ----- | ------ | ----------- |
+| 3 (MCMODE_SIMPLE) | Runs modes 1+2, i.e. loads and fully solves candidates. | UKMON currently uses this mode. |
+| 5 (MCMODE_BOTH) | Runs modes 4+1, i.e. creates phase 1 solutions from scratch. | GMN currently uses this mode. |
+| 7 (MCMODE_ALL) | Runs all three modes; equivalent to 0 or passing no mcmode. | Typically used during manual data analysis. |
+| Any other value | Treated as a value of 7. | |
+
+Note that in modes 0, 3, 5 and 7, intermediate files (i.e. candidates and phase1 files) are not saved to disk.
+
+### Distributed Processing
+
+The solver supports distribution of both candidates and phase1 solutions to child nodes.
+
+To enable distributed processing, we require one master node and one or more child nodes.
+
+On the master, we create a configuration file '**wmpl_remote.cfg**' in the same folder as the databases and then run three instances of the solver, one in each of mcmodes 4, 1 and 2 (more than one instance in mcmode 2 can be run). The content of the configuration file is explained below and a sample file is included in the repository.
+
+On each child, we also create a configuration file (see 'Child Node Configuration' below). Child nodes can run in mcmode 1 or 2, collecting relevant data from the master node and uploading the results back.
+
+SFTP is used to move data between master and child, and each child must therefore have an SFTP account on the server hosting the master.
+
+Data are written into a 'files' folder in each SFTP account's home directory, and the account running the master instances of the solver must therefore be able to read from, write to and create folders in the 'files' directory in the children's home directories. On my test server I achieved this with POSIX ACLs and Unix group membership.
+
+Additionally, the solver itself sets permissions on files and subfolders, and these should not be altered.
+
+The required folder structure for one node is shown below.
+
+![image](node_structure.png)
+
+**Master Node Configuration**
+
+The configuration file for the master node specifies the child nodes that are available, the capacity of each node, and the mcmode that each is operating in (mode 1 or 2; no other mcmode is supported).
+The capacity value can be any integer: zero means the node is disabled, and any negative value means the node has no capacity limit.
+
+When running in master mode, the instance in mcmode 4 will distribute candidates and the instance in mcmode 1 will distribute phase1 pickle files, provided suitable child nodes are configured.
+
+Example master-mode configuration file:
+
+\[mode\]
+mode = master
+\[children\]
+node1 = /home/node1, 600, 1
+node2 = /home/node2, 500, 2
+node3 = /home/node3, 0, 1
+
+This indicates that:
+
+- node 1 is running in mcmode 1 and has a capacity of 600.
+- node 2 is running in mcmode 2 and has a capacity of 500.
+- node 3 is currently disabled (capacity zero) and will not be assigned data.
+
+If we bring node 3 online, we can change the capacity from zero to some suitable value, and the master will begin assigning candidates to it (see 'Dynamically Adding Nodes' below).
+
+If no nodes are available, or if all nodes are at capacity, any remaining data will be assigned to the master node.
+
+The master will also stop assigning data to a node if a special file named "stop" is present in the files folder of the child's SFTP home directory. The child nodes create this file when shutting down, but it can also be created manually.
+
+Furthermore, if data has not been picked up by a child within six hours, it will be reassigned to the master node. This ensures that data is not left unprocessed if, for example, a node crashes unexpectedly.
+
+**Dynamically Adding Nodes**
+
+The master instance of the solver re-reads the remote configuration file on each loop, so nodes can be added, removed, disabled or enabled on demand, without needing to restart the master.
+
+So, for example, one could create a configuration listing several child nodes with capacity set to zero, meaning they were initially disabled and the solver would assign all candidates to the master node.
+However, if volumes rose, an instance of the solver could be started up on a child node and the master configuration file updated. On the master node's next loop, data would be automatically assigned to the children.
+
+You can also _manually_ move files between child node folders on the server. For instance, to move some load from node1 to node2 you can move some of the candidate files from node1's _candidates_ folder to node2's _candidates_ folder. A UNIX command to do this might be
+
+_ls -1 ~node1/files/candidates | head -100 | while read i ; do mv ~node1/files/candidates/\$i ~node2/files/candidates/ ; done_
+
+**Processing Uploaded Data**
+
+On each loop, the master node will scan each node's home directory for uploaded results. These will be integrated into the trajectories data, and the databases updated.
+
+**Child Node Configuration**
+
+The child must be running in mcmode 1 or 2 - no other mode is supported at present.
+
+The child configuration file specifies the server, user and key to use for connections to the master node. The port is optional but can be specified if a non-standard SFTP port is in use.
+
+\[mode\]
+mode = child
+\[children\]
+host = testserver.somewhere.com
+user = node1
+key = ~/.ssh/somekey
+port = 22
+
+At startup, the child node will connect to the master and remove the "stop" file, if present. This indicates to the master that it is "open for business". The child will then loop around, downloading any assigned data and processing it. Downloaded files are moved to a subfolder _processed_ on the SFTP server. Upon completion, the child will upload the results to the SFTP server.
+
+**Stopping a Child Node**
+
+Any node can be terminated by pressing Ctrl-C or by sending SIGINT to its process. The node will stop processing immediately and create a "stop" file on the SFTP server.
+
+Note that termination will leave data incompletely processed and no upload will take place, so it is advisable to wait until the child's logfile indicates it is idle before stopping it.
+
+Alternatively, one can identify the most recent, potentially incomplete, data set that was assigned to the node by looking in the child's _processed_ folders and copying the data back to the master node's _candidates_ or _phase1_ folders as appropriate.
+
+**Recovering from a Child Node Crash or Shutdown**
+
+If a child node crashes or is otherwise terminated during processing, the data can be recovered and redistributed to the master or other nodes, or indeed to the failed node after it has restarted. This can be done by looking in the _processed_ folders on the child (or, if the child node is unavailable, in the child node's _processed_ folder on the master node), identifying the most recent data, and moving it as necessary.
+
+## Duplicate Transaction Checks
+
+A check has been introduced in both candidate finding and phase1 solving that examines the database for potential duplicate or mergeable trajectories.
+
+Duplicates are defined as trajectories that contain the same observations. When detected, the solution with the fewest ignored observations is retained and the duplicates are deleted.
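The duplicate-retention rule above can be sketched as follows. This is an illustrative model only, not the solver's actual code; the trajectory records here are simplified dictionaries standing in for the database rows:

```python
def resolve_duplicates(trajectories):
    """For each distinct set of observations, keep only the solution with
    the fewest ignored observations; return the surviving trajectories.

    `trajectories` is a list of dicts with 'traj_id', 'obs_ids' and
    'ign_obs_ids' keys (a simplified stand-in for the database rows).
    """
    best = {}
    for traj in trajectories:
        # trajectories with identical observation sets are duplicates
        key = frozenset(traj['obs_ids'])
        if key not in best or len(traj['ign_obs_ids']) < len(best[key]['ign_obs_ids']):
            best[key] = traj
    return list(best.values())


# two duplicate solutions of the same three observations, plus one unrelated one
trajs = [
    {'traj_id': 'a', 'obs_ids': ['o1', 'o2', 'o3'], 'ign_obs_ids': ['o3']},
    {'traj_id': 'b', 'obs_ids': ['o1', 'o2', 'o3'], 'ign_obs_ids': []},
    {'traj_id': 'c', 'obs_ids': ['o4', 'o5'], 'ign_obs_ids': []},
]
kept = resolve_duplicates(trajs)
```

Here 'a' and 'b' are duplicates, and 'b' survives because it ignores fewer observations.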
+Mergeable trajectories are defined as those with at least one common observation. In principle these should never arise, but in practice, with a distributed processing model, they can. For example, a candidate might be found and handed off for solving; while it is still being solved, a new observation might be uploaded by a camera, so on its next pass the candidate finder creates a second candidate with an additional observation and a different reference timestamp. When detected, the mergeable trajectories are deleted and all of their observations are marked unpaired, so that on its next pass the candidate finder should identify a single combined candidate.
+
+## Databases
+
+The JSON database has been replaced by three SQLite databases: one for Observations, one for Trajectories and one for Candidates.
+
+This approach was taken because most database writing takes place during phase 1 solving, but some takes place during candidate finding, notably when reprocessing previous trajectories with new observations. By splitting the databases, we minimise potential concurrent-write situations. SQLite does not support multiple simultaneous writers, and though it will back off and retry after a few milliseconds, it is preferable to avoid unnecessary delays.
+
+**If The Solver Crashes**
+
+Although most operations are immediately committed to the databases, it is possible for the solver to crash and leave an incomplete transaction. This will be revealed by the existence of write-ahead logs in the database directory, e.g. "observations.db-wal".
+
+If this file is present, then upon next startup, SQLite will complete any pending transactions. This minimises the risk of data loss, but at worst may lead to observations being reprocessed. This is preferable to trajectories being missed.
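This write-ahead-log behaviour can be demonstrated with a small standalone sketch using only the standard `sqlite3` module; the table mirrors the `paired_obs` schema used by the solver, and the observation ID is a made-up example:

```python
import os
import sqlite3
import tempfile

db_dir = tempfile.mkdtemp()
db_path = os.path.join(db_dir, 'observations.db')

# open the database in write-ahead-log mode, as the solver does
con = sqlite3.connect(db_path)
con.execute('pragma journal_mode=wal')
con.execute('CREATE TABLE paired_obs(obs_id VARCHAR(36) UNIQUE, obs_dt REAL, status INTEGER)')
con.execute("insert into paired_obs values ('XX0001_20251215-010203.456', 2460659.5, 1)")
con.commit()

# in WAL mode, committed transactions live in a companion -wal file
# until they are checkpointed back into the main database file
wal_exists = os.path.exists(db_path + '-wal')

# on the next connection, SQLite replays any pending WAL frames, so the
# data survives even if the previous process died before checkpointing
con.close()
con2 = sqlite3.connect(db_path)
count = con2.execute('select count(*) from paired_obs').fetchone()[0]
con2.close()
```

The reopen-and-replay step at the end is exactly what happens when the solver restarts after a crash with a leftover `-wal` file.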
+**The Legacy JSON Database**
+
+The legacy JSON database is no longer used. It is not deleted, however: after the initial data migration described below, it is never touched again and can be moved to long-term storage if desired.
+
+**Initial Population of SQLite**
+
+When the solver is started up, it checks for the existence of the new databases. If they are not present, it creates them and prepopulates them with the last few days of data from the old JSON database, if available. For example, if run with the auto flag and the default lookback period of 5 days, the last five days of data will be copied to SQLite. This ensures that sufficient observation and failed-trajectory data is present for normal operation of the solver.
+
+The JSON database is then closed and is not referred to again, even on subsequent runs of the solver. It is not truncated, archived or deleted, and remains as an historical record of the state of the database as at the cutover date.
+
+**Historic Reruns**
+
+If the solver is rerun for an historic period from before the cutover, there will be no paired-observation or failed-trajectory data in the databases. The assumption is that if we are rerunning for an historic period, we are either looking to integrate new observations into the dataset or to recalculate trajectories using improved mathematical models. In either case it seems likely we'd want to start by reanalysing the raw data.
+
+That said, should we wish to copy historical data into the SQLite databases, this can be done with the command-line interface to CorrelateDB as shown below:
+
+_python -m wmpl.Trajectory.CorrelateDB --dir_path rms_data --action copy --timerange "(20251215-000000,20251222-000000)"_
+
+This will copy observations and failed trajectories into SQLite from the JSON database in _rms_data_ for the date range 2025-12-15 to 2025-12-22, creating the SQLite databases if necessary.
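The timerange string uses the "(yyyymmdd-HHMMSS,yyyymmdd-HHMMSS)" format shown above. Parsing it could be sketched like this; `parse_timerange` is an illustrative helper, not the actual CorrelateDB parser:

```python
import datetime


def parse_timerange(timerange):
    """Parse a string like "(20251215-000000,20251222-000000)" into a
    pair of UTC datetimes (start, end)."""
    start_str, end_str = timerange.strip('()').split(',')
    fmt = '%Y%m%d-%H%M%S'
    return tuple(
        datetime.datetime.strptime(s.strip(), fmt).replace(tzinfo=datetime.timezone.utc)
        for s in (start_str, end_str)
    )


beg, end = parse_timerange("(20251215-000000,20251222-000000)")
```

This example range spans exactly the seven days mentioned above.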
+This is quite a slow operation - on my 4-core i7 desktop it takes several minutes to copy a week's worth of data.
+
+## Command Line Options
+
+One option has been removed and two new options added.
+
+Removed:
+
+- \--**remotehost**: this has been superseded by the remote configuration file.
+
+Added:
+
+- \--**addlogsuffix**: default false. This adds a suffix to the logfile name to indicate which phase is being run.
+  For example, with this flag passed, the logfile for a run in MCMODE_CANDS would be something like _correlate_rms_20260214_121314_cands.log_, whereas a phase-1 log file would be _correlate_rms_20260214_121314_simple.log_.
+
+- \--**archivemonths**: default 3. This specifies the number of months' data to keep in the databases. Data older than this number of months will be archived. A value of zero means keep everything. This flag is useful during testing, or when rerunning for an historical period when you might not want to remove older data.
\ No newline at end of file
diff --git a/Documentation/node_structure.png b/Documentation/node_structure.png
new file mode 100644
index 00000000..212cc11e
Binary files /dev/null and b/Documentation/node_structure.png differ
diff --git a/wmpl/Rebound/REBOUND.py b/wmpl/Rebound/REBOUND.py
index 92ef3330..75620897 100644
--- a/wmpl/Rebound/REBOUND.py
+++ b/wmpl/Rebound/REBOUND.py
@@ -14,7 +14,7 @@
     REBOUND_FOUND = True
 except ImportError:
-    print("REBOUND package not found. Install REBOUND and reboundx packages to use the REBOUND functions.")
+    # don't print a message here as it's already printed whenever REBOUND_FOUND is False
     REBOUND_FOUND = False
 
 from wmpl.Utils.TrajConversions import (
diff --git a/wmpl/Trajectory/CorrelateDB.py b/wmpl/Trajectory/CorrelateDB.py
new file mode 100644
index 00000000..4e920ed1
--- /dev/null
+++ b/wmpl/Trajectory/CorrelateDB.py
@@ -0,0 +1,1020 @@
+# The MIT License
+
+# Copyright (c) 2024 Mark McIntyre
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+# THE SOFTWARE.
+ +""" Python scripts to manage the WMPL SQLite databases +""" +import os +import sqlite3 +import logging +import logging.handlers +import argparse +import datetime +import json +import numpy as np + +from wmpl.Utils.TrajConversions import datetime2JD, jd2Date + + +log = logging.getLogger("traj_correlator") + +############################################################ +# classes to handle the Observation and Trajectory databases +############################################################ + + +class ObservationsDatabase(): + """ + A class to handle the sqlite observations database transparently. + """ + + def __init__(self, db_path, db_name='observations.db', purge_records=False, verbose=False): + """ + Create an observations database instance + + Parameters: + db_path : path to the location of the database + db_name : name to use, typically observations.db + purge_records : boolean, if true then delete any existing records + + """ + db_full_name = os.path.join(db_path, f'{db_name}') + if verbose: + log.info(f'opening database {db_full_name}') + con = sqlite3.connect(db_full_name) + self.dbhandle = con + con.execute('pragma journal_mode=wal') + if purge_records: + con.execute('drop table if exists paired_obs') + res = con.execute("SELECT name FROM sqlite_master WHERE name='paired_obs'") + if res.fetchone() is None: + con.execute("CREATE TABLE paired_obs(obs_id VARCHAR(36) UNIQUE, obs_dt REAL, status INTEGER)") + self._commitObsDatabase() + + def _commitObsDatabase(self): + """ + Commit the obs db. 
This function exists so we can do lazy writes + """ + self.dbhandle.commit() + try: + self.dbhandle.execute('pragma wal_checkpoint(TRUNCATE)') + except Exception: + self.dbhandle.execute('pragma wal_checkpoint(PASSIVE)') + return + + def closeObsDatabase(self): + """ + Close the database, making sure we commit any pending updates + """ + + if self.dbhandle: + self._commitObsDatabase() + self.dbhandle.close() + self.dbhandle = None + return + + def checkObsPaired(self, obs_id, verbose=False): + """ + Check if an observation is already marked paired + return True if there is an observation with the correct obs id and with status = 1 + + Parameters: + obs_id : observation ID to check + + Returns: + True if paired, False otherwise + """ + + paired = True + cur = self.dbhandle.execute(f"SELECT obs_id FROM paired_obs WHERE obs_id='{obs_id}' and status=1") + if cur.fetchone() is None: + paired = False + if verbose: + log.info(f'{obs_id} is {"Paired" if paired else "Unpaired"}') + return paired + + def addPairedObservations(self, obs_ids, jdt_refs, verbose=False): + """ + Add or update a list of observations paired, setting status = 1 + + Parameters: + obs_ids : list of observation IDs + jdt_refs : list of julian reference dates of the observations + """ + + vals_str = ','.join(map(str,[(id, dt, 1) for id,dt in zip(obs_ids,jdt_refs)])) + + if verbose: + log.info(f'adding {obs_ids} to paired_obs table') + try: + self.dbhandle.execute(f"insert or replace into paired_obs values {vals_str}") + self.dbhandle.commit() + return True + except Exception: + log.warning(f'failed to add {obs_ids} to paired_obs table') + return False + + return + + def addPairedObs(self, obs_id, jdt_ref, verbose=False): + """ + Add or update a single entry in the database to mark an observation paired, setting status = 1 + + Parameters: + obs_id : observation ID + jdt_ref : julian reference date of the observation + """ + + if verbose: + log.info(f'adding {obs_id} to paired_obs table') + try: + 
self.dbhandle.execute(f"insert or replace into paired_obs values ('{obs_id}', {jdt_ref}, 1)") + self.dbhandle.commit() + return True + except Exception: + log.warning(f'failed to add {obs_id} to paired_obs table') + return False + + def unpairObs(self, obs_ids, verbose=False): + """ + Mark an observation unpaired. + If an entry exists in the database, update the status to 0. + ** Currently unused. ** + + Parameters: + met_obs_list : a list of observation IDs + """ + obs_ids_str = ','.join(f"'{id}'" for id in obs_ids) + + if verbose: + log.info(f'unpairing {obs_ids_str}') + try: + log.info('update: write to obsdb') + self.dbhandle.execute(f"update paired_obs set status = 0 where obs_id in ({obs_ids_str})") + self.dbhandle.commit() + return True + except Exception: + log.warning(f'failed to unpair {obs_ids_str}') + return False + + def getLinkedObservations(self, jdt_ref): + """ + Return a list of observation IDs linked with a trajectory based on the jdt_ref of the traj + + Parameters + jdt_ref : the julian reference date of the trajectory + + """ + cur = self.dbhandle.execute(f"SELECT obs_id FROM paired_obs WHERE obs_dt={jdt_ref} and status=1") + return [x[0] for x in cur.fetchall()] + + def archiveObsDatabase(self, db_path, arch_prefix, archdate_jd): + """ + archive records older than archdate_jd to a database {arch_prefix}_observations.db + + Parameters: + db_path : path to the location of the archive database + arch_prefix : prefix to apply - typically of the form yyyymm + archdate_jd : julian date before which to archive data + """ + # create the database if it doesnt exist + archdb_name = f'{arch_prefix}_observations.db' + archdb = ObservationsDatabase(db_path, archdb_name) + archdb.closeObsDatabase() + + # attach the arch db, copy the records then delete them + archdb_fullname = os.path.join(db_path, f'{archdb_name}') + self.dbhandle.execute(f"attach database '{archdb_fullname}' as archdb") + try: + # bulk-copy if possible + self.dbhandle.execute(f'insert or 
replace into archdb.paired_obs select * from paired_obs where obs_dt < {archdate_jd}')
+        except Exception:
+            # otherwise, one by one
+            cur = self.dbhandle.execute(f'select * from paired_obs where obs_dt < {archdate_jd}')
+            for row in cur.fetchall():
+                try:
+                    self.dbhandle.execute(f"insert into archdb.paired_obs values('{row[0]}',{row[1]},{row[2]})")
+                except Exception:
+                    log.info(f'{row[0]} already exists in target')
+
+        log.info('delete: write to obsdb')
+        self.dbhandle.execute(f'delete from paired_obs where obs_dt < {archdate_jd}')
+        self.dbhandle.commit()
+        return
+
+    def copyObsJsonRecords(self, paired_obs, dt_range, max_days=14):
+        """
+        Copy recent data from the legacy Json database to the new database.
+        By design this copies at most the last fourteen days, but a date-range can be
+        provided so that the relevant data is copied.
+
+        Parameters:
+            paired_obs : a json list of paired observations from the old database
+            dt_range : a date range to operate on - at most fourteen days duration
+
+        """
+        # only copy observations within the requested window, capped at max_days
+        dt_end = dt_range[1]
+        dt_beg = max(dt_range[0], dt_end + datetime.timedelta(days=-max_days))
+
+        log.info('-----------------------------')
+        log.info('moving recent observations to sqlite - this may take some time....')
+        log.info(f'observation date range {dt_beg.isoformat()} to {dt_end.isoformat()}')
+
+        i = 0
+        keylist = paired_obs.keys()
+        for stat_id in keylist:
+            for obs_id in paired_obs[stat_id]:
+                try:
+                    obs_date = datetime.datetime.strptime(obs_id.split('_')[1], '%Y%m%d-%H%M%S.%f')
+                except Exception:
+                    obs_date = datetime.datetime(2000,1,1,0,0,0)
+                obs_date = obs_date.replace(tzinfo=datetime.timezone.utc)
+
+                if obs_date >= dt_beg and obs_date < dt_end:
+                    self.addPairedObs(obs_id, datetime2JD(obs_date))
+                    i += 1
+                    if not i % 100000 and i != 0:
+                        log.info(f'moved {i} observations')
+        self.dbhandle.commit()
+        log.info(f'done - moved {i} observations')
+        log.info('-----------------------------')
+        return
+
+    def 
mergeObsDatabase(self, source_db_path): + """ + Merge in records from another database 'source_db_path', for example from a remote node + + Parameters: + source_db_path : full name and path to the source database to merge from + """ + + if not os.path.isfile(source_db_path): + log.warning(f'source database missing: {source_db_path}') + return + # attach the other db, copy the records then detach it + self.dbhandle.execute(f"attach database '{source_db_path}' as sourcedb") + res = self.dbhandle.execute("SELECT name FROM sourcedb.sqlite_master WHERE name='paired_obs'") + if res.fetchone() is None: + # table is missing so nothing to do + status = True + else: + try: + self.dbhandle.execute('insert or replace into paired_obs select * from sourcedb.paired_obs') + status = True + except Exception as e: + log.info(f'unable to merge child observations from {source_db_path}') + log.info(e) + status = False + + self.dbhandle.commit() + self.dbhandle.execute("detach database 'sourcedb'") + return status + + +############################################################ + + +class TrajectoryDatabase(): + """ + A class to handle the sqlite trajectory database transparently. 
+ """ + + def __init__(self, db_path, db_name='trajectories.db', purge_records=False, verbose=False): + """ + initialise the trajectory database + + Parameters: + db_path : path to the location to store the database + db_name : database name + purge_records : boolean, if true, delete any existing records + """ + + db_full_name = os.path.join(db_path, f'{db_name}') + log.info(f'opening database {db_full_name}') + con = sqlite3.connect(db_full_name) + if purge_records: + con.execute('drop table if exists trajectories') + con.execute('drop table if exists failed_trajectories') + con.commit() + res = con.execute("SELECT name FROM sqlite_master WHERE name='trajectories'") + if res.fetchone() is None: + if verbose: + log.info('create table: write to trajdb') + con.execute("""CREATE TABLE trajectories( + jdt_ref REAL UNIQUE, + traj_id VARCHAR UNIQUE, + traj_file_path VARCHAR, + participating_stations VARCHAR, + ignored_stations VARCHAR, + radiant_eci_mini VARCHAR, + state_vect_mini VARCHAR, + phase_1_only INTEGER, + v_init REAL, + gravity_factor REAL, + v0z REAL, + v_avg REAL, + rbeg_jd REAL, + rend_jd REAL, + rbeg_lat REAL, + rbeg_lon REAL, + rbeg_ele REAL, + rend_lat REAL, + rend_lon REAL, + rend_ele REAL, + obs_ids VARCHAR, + ign_obs_ids VARCHAR, + status INTEGER) """) + + res = con.execute("SELECT name FROM sqlite_master WHERE name='failed_trajectories'") + if res.fetchone() is None: + # note: traj_id not set as unique as some fails will have traj-id None + if verbose: + log.info('create table: write to trajdb') + con.execute("""CREATE TABLE failed_trajectories( + jdt_ref REAL UNIQUE, + traj_id VARCHAR, + traj_file_path VARCHAR, + participating_stations VARCHAR, + ignored_stations VARCHAR, + radiant_eci_mini VARCHAR, + state_vect_mini VARCHAR, + phase_1_only INTEGER, + v_init REAL, + gravity_factor REAL, + obs_ids VARCHAR, + ign_obs_ids VARCHAR, + status INTEGER) """) + + con.commit() + self.dbhandle = con + return + + def _commitTrajDatabase(self, verbose=False): + 
""" + commit the traj db. + This function exists so we can do lazy writes in some cases + """ + + if verbose: + log.info('commit: write to trajdb') + self.dbhandle.commit() + return + + def closeTrajDatabase(self, verbose=False): + """ + close the database, making sure we commit any pending updates + """ + + if verbose: + log.info('commit: write to trajdb') + if self.dbhandle: + self._commitTrajDatabase() + self.dbhandle.close() + self.dbhandle = None + return + + + def checkCandIfProcessed(self, jdt_ref, station_list, verbose=False): + """ + check if a candidate was already processed into the database + This function is not currently used. + + Parameters: + jdt_ref : candidate's julian reference date + station_list : candidate's list of stations + + Returns: + True if there is a trajectory with the same jdt_ref and matching list of stations as the candidate + """ + + found = False + res = self.dbhandle.execute(f"SELECT traj_id,participating_stations, ignored_stations FROM failed_trajectories WHERE jdt_ref={jdt_ref} and status=1") + row = res.fetchone() + if row is None: + found = False + else: + traj_stations = list(set(json.loads(row[1]) + json.loads(row[2]))) + found = True if (traj_stations == station_list) else False + if found: + return found + + res = self.dbhandle.execute(f"SELECT traj_id,participating_stations, ignored_stations FROM trajectories WHERE jdt_ref={jdt_ref} and status=1") + row = res.fetchone() + if row is None: + found = False + else: + traj_stations = list(set(json.loads(row[1]) + json.loads(row[2]))) + found = True if (traj_stations == station_list) else False + return found + + def checkTrajIfFailed(self, traj_reduced, verbose=False): + """ + Check if a Trajectory was marked failed + + Parameters: + traj_reduced : a TrajReduced object + + Returns + True if there is a failed trajectory with the same jdt_ref and matching list of stations + """ + + if not hasattr(traj_reduced, 'jdt_ref') or not hasattr(traj_reduced, 'participating_stations') 
or not hasattr(traj_reduced, 'ignored_stations'): + return False + + found = False + station_list = list(set(traj_reduced.participating_stations + traj_reduced.ignored_stations)) + res = self.dbhandle.execute(f"SELECT traj_id,participating_stations, ignored_stations FROM failed_trajectories WHERE jdt_ref={traj_reduced.jdt_ref} and status=1") + row = res.fetchone() + if row is None: + found = False + else: + traj_stations = list(set(json.loads(row[1]) + json.loads(row[2]))) + found = True if (traj_stations == station_list) else False + return found + + def addTrajectory(self, traj_reduced, failed=False, force_add=True, verbose=False): + """ + add or update an entry in the database, setting status = 1 + + Parameters: + traj_reduced : a TrajReduced object + failed : boolean, if true, add the traj to the fails list + + """ + + tblname = 'failed_trajectories' if failed else 'trajectories' + + # if force_add is false, don't replace any existing entry + if not force_add and hasattr(traj_reduced, 'traj_id') and traj_reduced.traj_id is not None: + res = self.dbhandle.execute(f'select traj_id from {tblname} where status = 1 and traj_id = "{traj_reduced.traj_id}"') + row = res.fetchone() + if row is not None and row[0] !='None': + return True + + if verbose: + log.info(f'adding jdt {traj_reduced.jdt_ref} to {tblname}') + + # remove the output_dir part from the path so that the data are location-independent + traj_file_path = traj_reduced.traj_file_path[traj_reduced.traj_file_path.find('trajectories'):] + + # and remove windows-style path separators + traj_file_path = traj_file_path.replace('\\','/') + + obs_ids = 'None' if not hasattr(traj_reduced, 'obs_ids') or traj_reduced.obs_ids is None else traj_reduced.obs_ids + ign_obs_ids = 'None' if not hasattr(traj_reduced, 'ign_obs_ids') or traj_reduced.ign_obs_ids is None else traj_reduced.ign_obs_ids + + if failed: + # fixup possible bad values + traj_id = 'None' if not hasattr(traj_reduced, 'traj_id') or traj_reduced.traj_id is 
None else traj_reduced.traj_id + v_init = 0 if traj_reduced.v_init is None else traj_reduced.v_init + radiant_eci_mini = [0,0,0] if traj_reduced.radiant_eci_mini is None else traj_reduced.radiant_eci_mini + state_vect_mini = [0,0,0] if traj_reduced.state_vect_mini is None else traj_reduced.state_vect_mini + + sql_str = (f'insert or replace into failed_trajectories values (' + f"{traj_reduced.jdt_ref}, '{traj_id}', '{traj_file_path}'," + f"'{json.dumps(traj_reduced.participating_stations)}'," + f"'{json.dumps(traj_reduced.ignored_stations)}'," + f"'{json.dumps(radiant_eci_mini)}'," + f"'{json.dumps(state_vect_mini)}'," + f"0,{v_init},{traj_reduced.gravity_factor}," + f"'{json.dumps(obs_ids)}'," + f"'{json.dumps(ign_obs_ids)}',1)") + else: + sql_str = (f'insert or replace into trajectories values (' + f"{traj_reduced.jdt_ref}, '{traj_reduced.traj_id}', '{traj_file_path}'," + f"'{json.dumps(traj_reduced.participating_stations)}'," + f"'{json.dumps(traj_reduced.ignored_stations)}'," + f"'{json.dumps(traj_reduced.radiant_eci_mini)}'," + f"'{json.dumps(traj_reduced.state_vect_mini)}'," + f"{traj_reduced.phase_1_only},{traj_reduced.v_init},{traj_reduced.gravity_factor}," + f"{traj_reduced.v0z},{traj_reduced.v_avg}," + f"{traj_reduced.rbeg_jd},{traj_reduced.rend_jd}," + f"{traj_reduced.rbeg_lat},{traj_reduced.rbeg_lon},{traj_reduced.rbeg_ele}," + f"{traj_reduced.rend_lat},{traj_reduced.rend_lon},{traj_reduced.rend_ele}," + f"'{json.dumps(obs_ids)}'," + f"'{json.dumps(ign_obs_ids)}',1)") + + sql_str = sql_str.replace('nan','"NaN"') + try: + self.dbhandle.execute(sql_str) + except Exception as e: + print(e) + print(sql_str) + self.dbhandle.commit() + return True + + def removeTrajectory(self, traj_reduced, failed=False, verbose=False): + """ + Mark a trajectory unsolved + If an entry exists, update the status to 0. 
+ + Parameters: + traj_reduced : a TrajReduced object + failed : boolean, if true then remove from the fails list + """ + if verbose: + log.info(f'removing {traj_reduced.traj_id}') + table_name = 'failed_trajectories' if failed else 'trajectories' + + self.dbhandle.execute(f"update {table_name} set status=0 where jdt_ref='{traj_reduced.jdt_ref}'") + self.dbhandle.commit() + + return True + + def removeTrajectoryById(self, traj_id, failed=False, verbose=False): + """ + Mark a trajectory unsolved + If an entry exists, update the status to 0. + + Parameters: + traj_id : a trajectory ID + failed : boolean, if true then remove from the fails list + """ + if verbose: + log.info(f'removing {traj_id}') + table_name = 'failed_trajectories' if failed else 'trajectories' + + self.dbhandle.execute(f"update {table_name} set status=0 where traj_id='{traj_id}'") + self.dbhandle.commit() + + return True + + + def getTrajectories(self, output_dir, jdt_range, failed=False, verbose=False): + """ + Get a list of trajectories between two julian dates + + Parameters: + output_dir : output_dir specified when invoking CorrelateRMS - will be prepended to the trajectory path + jdt_range : tuple of julian dates to retrieve data between. 
if the 2nd date is None, retrieve all data to today
+ failed : boolean - if true, retrieve failed trajectories rather than successful ones
+
+ Returns:
+ trajs: json list of traj_reduced objects
+ """
+
+ jdt_start, jdt_end = jdt_range
+
+ table_name = 'failed_trajectories' if failed else 'trajectories'
+ if verbose:
+ log.info(f'getting trajectories between {jd2Date(jdt_start, dt_obj=True).strftime("%Y%m%d_%H%M%S.%f")} and {jd2Date(jdt_end, dt_obj=True).strftime("%Y%m%d_%H%M%S.%f")}')
+
+ if not jdt_end:
+ rows = self.dbhandle.execute(f"SELECT * FROM {table_name} WHERE jdt_ref={jdt_start}")
+ else:
+ rows = self.dbhandle.execute(f"SELECT * FROM {table_name} WHERE jdt_ref>={jdt_start} and jdt_ref<={jdt_end}")
+ trajs = []
+ for rw in rows.fetchall():
+ rw = [np.nan if x == 'NaN' else x for x in rw]
+ json_dict = {'jdt_ref':rw[0], 'traj_id':rw[1], 'traj_file_path':os.path.join(output_dir, rw[2]),
+ 'participating_stations': json.loads(rw[3]),
+ 'ignored_stations': json.loads(rw[4]),
+ 'radiant_eci_mini': json.loads(rw[5]),
+ 'state_vect_mini': json.loads(rw[6]),
+ 'phase_1_only': rw[7], 'v_init': rw[8], 'gravity_factor': rw[9],
+ 'v0z': rw[10], 'v_avg': rw[11],
+ 'rbeg_jd': rw[12], 'rend_jd': rw[13],
+ 'rbeg_lat': rw[14], 'rbeg_lon': rw[15], 'rbeg_ele': rw[16],
+ 'rend_lat': rw[17], 'rend_lon': rw[18], 'rend_ele': rw[19],
+ 'obs_ids': json.loads(rw[20]), 'ign_obs_ids': json.loads(rw[21]),
+ }
+
+ trajs.append(json_dict)
+ return trajs
+
+ def getTrajBasics(self, output_dir, jdt_range, failed=False, verbose=False):
+ """
+ Get a list of minimal trajectory details between two dates
+
+ Parameters:
+ output_dir : output_dir specified when invoking CorrelateRMS - will be prepended to the trajectory path
+ jdt_range : tuple of julian dates to retrieve data between
+ failed : boolean, if true retrieve names of fails, otherwise retrieve successful
+
+ Returns:
+ trajs: a json list of tuples of {jdt_ref, traj_id, traj_file_path}
+
+ """
+
+ jdt_start, jdt_end = 
jdt_range + table_name = 'failed_trajectories' if failed else 'trajectories' + if not jdt_start: + cur = self.dbhandle.execute(f"SELECT jdt_ref, traj_id, traj_file_path, obs_ids, ign_obs_ids FROM {table_name} where status=1 order by jdt_ref") + rows = cur.fetchall() + elif not jdt_end: + cur = self.dbhandle.execute(f"SELECT jdt_ref, traj_id, traj_file_path, obs_ids, ign_obs_ids FROM {table_name} WHERE jdt_ref={jdt_start} and status=1 order by jdt_ref") + rows = cur.fetchall() + else: + cur = self.dbhandle.execute(f"SELECT jdt_ref, traj_id, traj_file_path, obs_ids, ign_obs_ids FROM {table_name} WHERE jdt_ref>={jdt_start} and jdt_ref<={jdt_end} and status=1 order by jdt_ref") + rows = cur.fetchall() + trajs = [] + for rw in rows: + trajs.append({'jdt_ref':rw[0], 'traj_id':rw[1], 'traj_file_path':os.path.join(output_dir, rw[2]), + 'obs_ids':json.loads(rw[3]), 'ign_obs_ids':json.loads(rw[4])}) + return trajs + + def archiveTrajDatabase(self, db_path, arch_prefix, archdate_jd): + """ + # archive records older than archdate_jd to a database {arch_prefix}_trajectories.db + + Parameters: + db_path : path to the location of the archive database + arch_prefix : prefix to apply - typically of the form yyyymm + archdate_jd : julian date before which to archive data + + """ + + # create the archive database if it doesnt exist + archdb_name = f'{arch_prefix}_trajectories.db' + archdb = TrajectoryDatabase(db_path, archdb_name) + archdb.closeTrajDatabase() + + # attach the arch db, copy the records then delete them + archdb_fullname = os.path.join(db_path, f'{archdb_name}') + cur = self.dbhandle.execute(f"attach database '{archdb_fullname}' as archdb") + log.info('delete: write to trajdb') + for table_name in ['trajectories', 'failed_trajectories']: + try: + # bulk-copy if possible + cur.execute(f'insert or replace into archdb.{table_name} select * from {table_name} where jdt_ref < {archdate_jd}') + cur.execute(f'delete from {table_name} where jdt_ref < {archdate_jd}') + except 
Exception:
+ log.warning(f'unable to archive {table_name}')
+
+ self.dbhandle.commit()
+ # detach the archive db so a later attach doesn't collide with the 'archdb' alias
+ self.dbhandle.execute("detach database 'archdb'")
+ return
+
+ def copyTrajJsonRecords(self, trajectories, dt_range, failed=True, max_days=14):
+ """
+ Copy trajectories from the old Json database
+ We generally only copy recent records, since if we ever run for an historic date
+ it's likely we will want to reanalyse all available data
+
+ Parameters:
+
+ trajectories : dict of trajectories extracted from the old Json DB, keyed by jdt_ref
+ dt_range : date range to use, at most fourteen days at a time
+ failed : boolean, default true to move failed traj
+
+ """
+ jd_end = datetime2JD(dt_range[1])
+ jd_beg = max(datetime2JD(dt_range[0]), jd_end - max_days)
+
+ log.info(f'moving recent {"" if failed is False else "failed"} trajectories to sqlite - this may take some time....')
+ log.info(f'trajectory date range {jd2Date(jd_beg, dt_obj=True).isoformat()} to {dt_range[1].isoformat()}')
+
+ keylist = [k for k in trajectories.keys() if float(k) >= jd_beg and float(k) <= jd_end]
+ i = 0 # just in case there aren't any trajectories to move
+ for i, jdt_ref in enumerate(keylist, start=1):
+ self.addTrajectory(trajectories[jdt_ref], failed=failed)
+ if not i % 10000:
+ self._commitTrajDatabase()
+ log.info(f'moved {i} {"" if failed is False else "failed"} trajectories')
+ self._commitTrajDatabase()
+ log.info(f'done - moved {i} {"" if failed is False else "failed"} trajectories')
+
+ return
+
+ def mergeTrajDatabase(self, source_db_path):
+ """
+ merge in records from another database, for example from a remote node
+
+ Parameters:
+ source_db_path : the full name of the source database from which to merge in records
+
+ """
+
+ if not os.path.isfile(source_db_path):
+ log.warning(f'source database missing: {source_db_path}')
+ return
+ # attach the other db, copy the records then detach it
+ cur = self.dbhandle.execute(f"attach database '{source_db_path}' as sourcedb")
+
+ status = True
+ for table_name in ['trajectories', 'failed_trajectories']:
+ 
try: + # bulk-copy if possible + cur.execute(f'insert or replace into {table_name} select * from sourcedb.{table_name}') + except Exception: + log.warning(f'unable to merge data from {source_db_path}') + status = False + self.dbhandle.commit() + cur.execute("detach database 'sourcedb'") + return status + + +############################################################ + + +class CandidateDatabase(): + """ + A class to handle the sqlite candidates database transparently. + """ + + def __init__(self, db_path:str, db_name='candidates.db', keep=3, verbose=False): + """ + Create a database instance + + Parameters: + db_path : path to the location of the database + db_name : name to use, typically candidates.db + keep : number of weeks' data to keep. Default 3 + + """ + db_full_name = os.path.join(db_path, f'{db_name}') + if verbose: + log.info(f'opening database {db_full_name}') + con = sqlite3.connect(db_full_name) + con.execute('pragma journal_mode=wal') + res = con.execute("SELECT name FROM sqlite_master WHERE name='candidates'") + if res.fetchone() is None: + con.execute("CREATE TABLE candidates(cand_id VARCHAR UNIQUE, ref_dt REAL, obs_ids VARCHAR, status INTEGER)") + con.commit() + self.dbhandle = con + if keep > 0: + self.purgeCands(keep=keep) + + def _commitCandDatabase(self): + """ + Commit the db. 
This function exists so we can do lazy writes
+ """
+ self.dbhandle.commit()
+ try:
+ self.dbhandle.execute('pragma wal_checkpoint(TRUNCATE)')
+ except Exception:
+ self.dbhandle.execute('pragma wal_checkpoint(PASSIVE)')
+ return
+
+ def closeCandDatabase(self):
+ """
+ Close database, making sure we commit any pending updates
+ """
+ if self.dbhandle:
+ self._commitCandDatabase()
+ self.dbhandle.close()
+ self.dbhandle = None
+ return
+
+ def checkAndAddCand(self, cand_id:str, ref_dt:float, obs_ids:list, verbose=False):
+ """
+ Check and add a candidate if it's not already there
+
+ Parameters:
+ cand_id : candidate ID
+ ref_dt : reference date as a timestamp
+ obs_ids : list of observation IDs
+
+ Returns:
+ True if added, False if it's already present
+ """
+
+ cur = self.dbhandle.execute(f"SELECT * FROM candidates WHERE cand_id='{cand_id}' and status=1")
+ if cur.fetchone() is not None:
+ added = False
+ else:
+ added = True
+ obs_ids_str = json.dumps(list(set(obs_ids)))
+ self.dbhandle.execute(f"insert into candidates values ('{cand_id}',{ref_dt},'{obs_ids_str}',1)")
+ self.dbhandle.commit()
+ if verbose:
+ log.info(f'{cand_id} is {"added" if added else "not added"}')
+ return added
+
+ def getCandidate(self, cand_id:str, verbose=False):
+ """
+ retrieve details of a candidate
+
+ Parameters:
+ cand_id : candidate ID
+
+ Returns:
+ the observations linked to the candidate
+ """
+
+ obs_ids = []
+ cur = self.dbhandle.execute(f"SELECT * FROM candidates WHERE cand_id='{cand_id}' and status=1")
+ rw = cur.fetchone()
+ if rw is not None:
+ # obs_ids is the third column (cand_id, ref_dt, obs_ids, status)
+ obs_ids = json.loads(rw[2])
+ if verbose:
+ log.info(f'{cand_id} contains {obs_ids}')
+ return obs_ids
+
+
+ def purgeCands(self, keep=3):
+ """
+ purge old candidates after 'keep' weeks
+
+ Parameters:
+ keep : weeks to keep data for, default 3
+ """
+ keep_dt = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=keep*7)
+ self.dbhandle.execute(f"delete from candidates where ref_dt < {keep_dt.timestamp()}")
+ 
self.dbhandle.commit() + return + + def mergeCandDatabase(self, source_db_path): + """ + merge in records from another observation database, for example from a remote node + + Parameters: + source_db_path : the full name of the source database from which to merge in records + + """ + + if not os.path.isfile(source_db_path): + log.warning(f'source database missing: {source_db_path}') + return + # attach the other db, copy the records then detach it + cur = self.dbhandle.execute(f"attach database '{source_db_path}' as sourcedb") + + status = True + for table_name in ['candidates']: + try: + # bulk-copy if possible + cur.execute(f'insert or replace into {table_name} select * from sourcedb.{table_name}') + except Exception: + log.warning(f'unable to merge data from {source_db_path}') + status = False + self.dbhandle.commit() + cur.execute("detach database 'sourcedb'") + return status + + +################################################################################## +# dummy classes for use in the above. +# We can't import from CorrelateRMS as this would create a circular reference + + +class DummyTrajReduced(): + """ + a dummy class for handling TrajReduced objects. 
+ We can't import CorrelateRMS as that would create a circular dependency
+ """
+ def __init__(self, jdt_ref=None, traj_id=None, traj_file_path=None, json_dict=None):
+ if json_dict is None:
+ self.jdt_ref = jdt_ref
+ self.traj_id = traj_id
+ self.traj_file_path = traj_file_path
+ else:
+ self.__dict__ = json_dict
+
+
+ class dummyDatabaseJSON():
+ """
+ Dummy class to handle the old Json data format
+ We can't import CorrelateRMS as that would create a circular dependency
+ """
+ def __init__(self, db_dir, dt_range=None):
+ self.db_file_path = os.path.join(db_dir, 'processed_trajectories.json')
+ self.paired_obs = {}
+ self.failed_trajectories = {}
+ if os.path.exists(self.db_file_path):
+ # update rather than replace __dict__ so the defaults above survive missing keys,
+ # and close the file handle when done
+ with open(self.db_file_path) as f:
+ self.__dict__.update(json.load(f))
+
+ if hasattr(self, 'failed_trajectories'):
+ # Convert trajectories from JSON to TrajectoryReduced objects
+ traj_dict = getattr(self, "failed_trajectories")
+ trajectories_obj_dict = {}
+ for traj_json in traj_dict:
+ traj_reduced_tmp = DummyTrajReduced(json_dict=traj_dict[traj_json])
+ trajectories_obj_dict[traj_reduced_tmp.jdt_ref] = traj_reduced_tmp
+ setattr(self, "failed_trajectories", trajectories_obj_dict)
+
+ if hasattr(self, 'trajectories'):
+ # Convert trajectories from JSON to TrajectoryReduced objects
+ traj_dict = getattr(self, "trajectories")
+ trajectories_obj_dict = {}
+ for traj_json in traj_dict:
+ traj_reduced_tmp = DummyTrajReduced(json_dict=traj_dict[traj_json])
+ trajectories_obj_dict[traj_reduced_tmp.jdt_ref] = traj_reduced_tmp
+ setattr(self, "trajectories", trajectories_obj_dict)
+
+
+ ##################################################################################
+
+
+ if __name__ == '__main__':
+ arg_parser = argparse.ArgumentParser(description="""Automatically compute trajectories from RMS data in the given directory.""",
+ formatter_class=argparse.RawTextHelpFormatter)
+
+ arg_parser.add_argument('--dir_path', type=str, default=None, help='Path to the directory containing the databases.')
+
+ 
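The maintenance script's `--timerange` option takes a string of the form `"(YYYYMMDD-HHMMSS,YYYYMMDD-HHMMSS)"`, which the main block parses with `datetime.strptime`. A minimal, self-contained sketch of that parsing (the `parseTimeRange` helper name is illustrative, not part of the codebase):

```python
import datetime

def parseTimeRange(timerange):
    """Parse a '(YYYYMMDD-HHMMSS,YYYYMMDD-HHMMSS)' string into two UTC datetimes."""
    # Strip the surrounding parentheses, then split on the comma
    time_beg, time_end = timerange.strip("(").strip(")").split(",")
    fmt = "%Y%m%d-%H%M%S"
    dt_beg = datetime.datetime.strptime(time_beg, fmt).replace(tzinfo=datetime.timezone.utc)
    dt_end = datetime.datetime.strptime(time_end, fmt).replace(tzinfo=datetime.timezone.utc)
    return dt_beg, dt_end

beg, end = parseTimeRange("(20260401-000000,20260402-120000)")
print(beg.isoformat())  # 2026-04-01T00:00:00+00:00
print(end.isoformat())  # 2026-04-02T12:00:00+00:00
```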
arg_parser.add_argument('--database', type=str, default=None, help='Database to process, either observations or trajectories') + + arg_parser.add_argument('--action', type=str, default=None, help='Action to take on the database') + + arg_parser.add_argument('--stmt', type=str, default=None, help='statement to execute eg "select * from paired_obs"') + + arg_parser.add_argument("--logdir", type=str, default=None, + help="Path to the directory where the log files will be stored. If not given, a logs folder will be created in the database folder") + + arg_parser.add_argument('-r', '--timerange', metavar='TIME_RANGE', + help="""Apply action to this date range in the format: "(YYYYMMDD-HHMMSS,YYYYMMDD-HHMMSS)".""", type=str) + + cml_args = arg_parser.parse_args() + # Find the log directory + log_dir = cml_args.logdir + if log_dir is None: + log_dir = os.path.join(cml_args.dir_path, 'logs') + os.makedirs(log_dir, exist_ok=True) + log.setLevel(logging.DEBUG) + + # Init the log formatter + log_formatter = logging.Formatter( + fmt='%(asctime)s-%(levelname)-5s-%(module)-15s:%(lineno)-5d- %(message)s', + datefmt='%Y/%m/%d %H:%M:%S') + + # Init the file handler + timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S") + log_file = os.path.join(log_dir, f"correlate_db_{timestamp}.log") + file_handler = logging.handlers.TimedRotatingFileHandler(log_file, when="midnight", backupCount=7) + file_handler.setFormatter(log_formatter) + log.addHandler(file_handler) + + # Init the console handler (i.e. 
print to console)
+ console_handler = logging.StreamHandler()
+ console_handler.setFormatter(log_formatter)
+ log.addHandler(console_handler)
+
+ if cml_args.database:
+ dbname = cml_args.database.lower()
+ action = cml_args.action.lower() if cml_args.action else ''
+
+ stmt = cml_args.stmt
+
+ dt_range = None
+ if cml_args.timerange is not None:
+ time_beg, time_end = cml_args.timerange.strip("(").strip(")").split(",")
+ dt_beg = datetime.datetime.strptime(time_beg, "%Y%m%d-%H%M%S").replace(tzinfo=datetime.timezone.utc)
+ dt_end = datetime.datetime.strptime(time_end, "%Y%m%d-%H%M%S").replace(tzinfo=datetime.timezone.utc)
+ log.info("Custom time range:")
+ log.info(" BEG: {:s}".format(str(dt_beg)))
+ log.info(" END: {:s}".format(str(dt_end)))
+ dt_range = [dt_beg, dt_end]
+
+
+ if action == 'copy':
+ if dt_range is None:
+ log.info('Date range must be provided for copy operation')
+ else:
+ jsondb = dummyDatabaseJSON(db_dir=cml_args.dir_path)
+ obsdb = ObservationsDatabase(cml_args.dir_path)
+ obsdb.copyObsJsonRecords(jsondb.paired_obs, dt_range)
+ obsdb.closeObsDatabase()
+ trajdb = TrajectoryDatabase(cml_args.dir_path)
+ trajdb.copyTrajJsonRecords(jsondb.failed_trajectories, dt_range, failed=True)
+ trajdb.copyTrajJsonRecords(jsondb.trajectories, dt_range, failed=False)
+ trajdb.closeTrajDatabase()
+ else:
+ if dbname == 'observations':
+ obsdb = ObservationsDatabase(cml_args.dir_path)
+ if action == 'status':
+ cur = obsdb.dbhandle.execute('select * from paired_obs where status=1')
+ print(f'there are {len(cur.fetchall())} paired obs')
+ cur = obsdb.dbhandle.execute('select * from paired_obs where status=0')
+ print(f'and {len(cur.fetchall())} unpaired obs')
+ if action == 'execute':
+ print(stmt)
+ cur = obsdb.dbhandle.execute(stmt)
+ for rw in cur.fetchall():
+ print(rw)
+ obsdb.closeObsDatabase()
+
+ elif dbname == 'trajectories':
+ trajdb = TrajectoryDatabase(cml_args.dir_path)
+ if action == 'status':
+ cur = 
trajdb.dbhandle.execute('select * from trajectories where status=1') + print(f'there are {len(cur.fetchall())} successful trajectories') + cur = trajdb.dbhandle.execute('select * from failed_trajectories') + print(f'and {len(cur.fetchall())} failed trajectories') + if action == 'execute': + print(stmt) + cur = trajdb.dbhandle.execute(stmt) + for rw in cur.fetchall(): + print(rw) + trajdb.closeTrajDatabase() + else: + log.info('valid database not specified') diff --git a/wmpl/Trajectory/CorrelateEngine.py b/wmpl/Trajectory/CorrelateEngine.py index 52ff61f1..3c3a482c 100644 --- a/wmpl/Trajectory/CorrelateEngine.py +++ b/wmpl/Trajectory/CorrelateEngine.py @@ -8,7 +8,6 @@ import multiprocessing import logging import os - import numpy as np from wmpl.Trajectory.Trajectory import ObservedPoints, PlaneIntersection, Trajectory, moveStateVector @@ -19,11 +18,34 @@ from wmpl.Utils.TrajConversions import J2000_JD, geo2Cartesian, cartesian2Geo, raDec2AltAz, altAz2RADec, \ raDec2ECI, datetime2JD, jd2Date, equatorialCoordPrecession_vect +MCMODE_NONE = 0 +MCMODE_PHASE1 = 1 +MCMODE_PHASE2 = 2 +MCMODE_CANDS = 4 +MCMODE_SIMPLE = MCMODE_CANDS + MCMODE_PHASE1 +MCMODE_BOTH = MCMODE_PHASE1 + MCMODE_PHASE2 +MCMODE_ALL = MCMODE_CANDS + MCMODE_PHASE1 + MCMODE_PHASE2 + # Grab the logger from the main thread log = logging.getLogger("traj_correlator") +def getMcModeStr(mcmode, strtype=0): + modestrs = {4:'cands', 1:'simple', 2:'mcphase', 5:'candsimple', 3:'simplemc',7:'full',0:'full'} + fullmodestrs = {4:'CANDIDATE STAGE', 1:'SIMPLE STAGE', 2:'MONTE CARLO STAGE', 7:'FULL',0:'FULL'} + if strtype == 0: + if mcmode in fullmodestrs.keys(): + return fullmodestrs[mcmode] + else: + return 'MIXED' + else: + if mcmode in modestrs.keys(): + return modestrs[mcmode] + else: + return False + + def pickBestStations(obslist, max_stns): """ Find the stations with the best statistics @@ -239,6 +261,8 @@ def __init__(self, data_handle, traj_constraints, v_init_part, data_in_j2000=Tru # enable OS style ground 
maps if true self.enableOSM = enableOSM + self.candidatemode = None + def trajectoryRangeCheck(self, traj_reduced, platepar): """ Check that the trajectory is within the range limits. @@ -423,7 +447,9 @@ def checkFOVOverlap(self, rp, tp): def initObservationsObject(self, met, pp, ref_dt=None): - """ Init the observations object which will be fed into the trajectory solver. """ + """ + Init the observations object which will be fed into the trajectory solver. + """ # If the reference datetime is given, apply a time offset if ref_dt is not None: @@ -471,12 +497,18 @@ def initObservationsObject(self, met, pp, ref_dt=None): np.radians(pp.lon), pp.elev, meastype=1, station_id=pp.station_code, magnitudes=mag_data, ignore_list=ignore_list, fov_beg=met.fov_beg, fov_end=met.fov_end, comment=comment) + # we seem to have two variables for observation id - need to tidy up this! + obs.id = met.id if hasattr(met, 'id') else None + obs.obs_id = obs.id + return obs def projectPointToTrajectory(self, indx, obs, plane_intersection): - """ Compute lat, lon and height of given point on the meteor trajectory. """ + """ + Compute lat, lon and height of given point on the meteor trajectory. + """ meas_vector = obs.meas_eci[indx] jd = obs.JD_data[indx] @@ -493,7 +525,9 @@ def projectPointToTrajectory(self, indx, obs, plane_intersection): def quickTrajectorySolution(self, obs1, obs2): - """ Perform an intersecting planes solution and check if it satisfies specified sanity checks. """ + """ + Perform an intersecting planes solution and check if it satisfies specified sanity checks. 
+ """ # Do the plane intersection solution plane_intersection = PlaneIntersection(obs1, obs2) @@ -535,8 +569,8 @@ def quickTrajectorySolution(self, obs1, obs2): or (ht2_end < self.traj_constraints.min_end_ht): log.info("Meteor heights outside allowed range!") - log.info("H1_beg: {:.2f}, H1_end: {:.2f}".format(ht1_beg, ht1_end)) - log.info("H2_beg: {:.2f}, H2_end: {:.2f}".format(ht2_beg, ht2_end)) + log.info(" H1_beg: {:.2f}, H1_end: {:.2f}".format(ht1_beg, ht1_end)) + log.info(" H2_beg: {:.2f}, H2_end: {:.2f}".format(ht2_beg, ht2_end)) return None @@ -601,7 +635,7 @@ def initTrajectory(self, jdt_ref, mc_runs, verbose=False): return traj - def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=None): + def solveTrajectory(self, traj, mc_runs, mcmode=MCMODE_ALL, matched_obs=None, orig_traj=None, verbose=False): """ Given an initialized Trajectory object with observation, run the solver and automatically reject bad observations. @@ -630,23 +664,23 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # make a note of how many observations are already marked ignored. initial_ignore_count = len([obs for obs in traj.observations if obs.ignore_station]) log.info(f'initially ignoring {initial_ignore_count} stations...') + successful_traj_fit = False - # run the first phase of the solver if mcmode is 0 or 1 - if mcmode < 2: + # run the first phase of the solver if mcmode is MCMODE_PHASE1 + if mcmode & MCMODE_PHASE1: # Disable Monte Carlo runs until an initial stable set of observations is found traj.monte_carlo = False # Run the solver try: traj_status = traj.run() - # If solving has failed, stop solving the trajectory except ValueError as e: log.info("Error during trajectory estimation!") print(e) + # TODO do we need to add the trajectory to the failed traj database here? return False - # Reject bad observations until a stable set is found, but only if there are more than 2 # stations. 
Only one station will be rejected at one point in time successful_traj_fit = False @@ -678,7 +712,8 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Skip this part if there are less than 3 stations - if len(traj.observations) < 3: + active_obs = [obstmp for obstmp in traj.observations if not obstmp.ignore_station] + if len(active_obs) < 3: break @@ -707,7 +742,8 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N max_rejections_possible = int(np.ceil(0.5*len(traj_status.observations))) + initial_ignore_count log.info(f'max stations allowed to be rejected is {max_rejections_possible}') for i, obs in enumerate(traj_status.observations): - + if obs.ignore_station: + continue # Compute the median angular uncertainty of all other non-ignored stations ang_res_list = [obstmp.ang_res_std for j, obstmp in enumerate(traj_status.observations) if (i != j) and not obstmp.ignore_station] @@ -718,10 +754,6 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N ang_res_median = np.median(ang_res_list) - # ### DEBUG PRINT - # print(obs.station_id, 'ang res:', np.degrees(obs.ang_res_std)*3600, \ - # np.degrees(ang_res_median)*3600) - # Check if the current observations is larger than the minimum limit, and # outside the median limit or larger than the maximum limit if (obs.ang_res_std > np.radians(self.traj_constraints.min_arcsec_err/3600)) \ @@ -795,19 +827,26 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Init a new trajectory object (make sure to use the new reference Julian date) - traj = self.initTrajectory(traj_status.jdt_ref, mc_runs, verbose=False) + traj = self.initTrajectory(traj_status.jdt_ref, mc_runs, verbose=verbose) # Disable Monte Carlo runs until an initial stable set of observations is found traj.monte_carlo = False - # Reinitialize the observations, rejecting the ignored stations + # Reinitialize the observations. 
Note we *include* the ignored obs as they're internally marked ignored + # and so will be skipped, but to avoid confusion in the logs we only print the names of the non-ignored ones for obs in traj_status.observations: + traj.infillWithObs(obs) if not obs.ignore_station: - log.info(f'Adding {obs.station_id}') - traj.infillWithObs(obs) + log.info(f'Adding {obs.obs_id}') log.info("") - log.info(f'Rerunning the trajectory solution with {len(traj.observations)} stations...') + active_stns = len([obs for obs in traj.observations if not obs.ignore_station]) + if active_stns < 2: + log.info(f"Only {active_stns} stations left - trajectory estimation failed!") + skip_trajectory = True + break + + log.info(f'Rerunning the trajectory solution with {active_stns} stations...') # Re-run the trajectory solution try: traj_status = traj.run() @@ -816,7 +855,8 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N except ValueError as e: log.info("Error during trajectory estimation!") print(e) - return False + skip_trajectory = True + break # If the trajectory estimation failed, skip this trajectory @@ -835,21 +875,17 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Skip the trajectory if no good solution was found if skip_trajectory: - # Add the trajectory to the list of failed trajectories - self.dh.addTrajectory(traj, failed_jdt_ref=jdt_ref) - log.info("Trajectory skipped and added to fails!") + ref_dt = jd2Date(min([met_obs.jdt_ref for met_obs in traj.observations]), dt_obj=True) + log.info(f"Trajectory at {ref_dt.isoformat()} skipped and added to fails!") + traj.obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is False] + traj.ign_obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is True] + self.dh.addTrajectory(traj, failed_jdt_ref=jdt_ref, verbose=verbose) return False - # # If the trajectory solutions was not done at any point, skip the trajectory completely - # 
if traj_best is None: - # return False - - # # Otherwise, use the best trajectory solution until the solving failed - # else: - # log.info("Using previously estimated best trajectory...") - # traj_status = traj_best - + # restore the obs ids + traj_status.obs_ids = [obs.obs_id for obs in traj_status.observations if obs.ignore_station is False] + traj_status.ign_obs_ids = [obs.obs_id for obs in traj_status.observations if obs.ignore_station is True] # If there are only two stations, make sure to reject solutions which have stations with # residuals higher than the maximum limit @@ -857,11 +893,13 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N if np.any([(obstmp.ang_res_std > np.radians(self.traj_constraints.max_arcsec_err/3600)) for obstmp in traj_status.observations]): + ref_dt = jd2Date(min([met_obs.jdt_ref for met_obs in traj.observations]), dt_obj=True) + traj_status.obs_ids = [obs.obs_id for obs in traj_status.observations if obs.ignore_station is False] + traj_status.ign_obs_ids = [obs.obs_id for obs in traj_status.observations if obs.ignore_station is True] log.info("2 station only solution, one station has an error above the maximum limit, skipping!") - # Add the trajectory to the list of failed trajectories - self.dh.addTrajectory(traj_status, failed_jdt_ref=jdt_ref) - + log.info(f"Trajectory at {ref_dt.isoformat()} skipped and added to fails!") + self.dh.addTrajectory(traj_status, failed_jdt_ref=jdt_ref, verbose=verbose) return False @@ -869,7 +907,7 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N traj = traj_status # if we're only doing the simple solution, then print the results - if mcmode == 1: + if mcmode == MCMODE_PHASE1: # Only proceed if the orbit could be computed if traj.orbit.ra_g is not None: # Update trajectory file name @@ -885,18 +923,16 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N else: shower_code = shower_obj.IAU_code 
log.info("Shower: {:s}".format(shower_code)) + + if mcmode & MCMODE_PHASE1: successful_traj_fit = True log.info('finished initial solution') ##### end of simple soln phase ##### now run the Monte-carlo phase, if the mcmode is 0 (do both) or 2 (mc-only) - if mcmode == 0 or mcmode == 2: - if mcmode == 2: - traj_status = traj + if mcmode & MCMODE_PHASE2: + traj_status = traj - # save the traj in case we need to clean it up - save_traj = traj - # Only proceed if the orbit could be computed if traj.orbit.ra_g is not None: @@ -905,7 +941,7 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N log.info("Stable set of observations found, computing uncertainties using Monte Carlo...") # Init a new trajectory object (make sure to use the new reference Julian date) - traj = self.initTrajectory(traj_status.jdt_ref, mc_runs, verbose=False) + traj = self.initTrajectory(traj_status.jdt_ref, mc_runs, verbose=verbose) # Enable Monte Carlo traj.monte_carlo = True @@ -918,7 +954,7 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Don't do this in mc-only mode since phase1 has already selected the stations and we could # create duplicate orbits if we now exclude some stations from the solution # TODO should we do this here *at all* ? 
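The `mcmode & MCMODE_PHASE1` and `mcmode & MCMODE_PHASE2` tests above work because the modes are distinct bit flags, so a combined value such as MCMODE_ALL enables several stages in one run. A standalone sketch of the idea (the constants mirror those defined at the top of CorrelateEngine.py; the `stagesEnabled` helper is illustrative, not part of the codebase):

```python
# Bit-flag operation modes, mirroring the constants in CorrelateEngine.py
MCMODE_PHASE1 = 1   # simple (phase 1) solver
MCMODE_PHASE2 = 2   # Monte Carlo (phase 2) solver
MCMODE_CANDS = 4    # candidate generation
MCMODE_ALL = MCMODE_CANDS | MCMODE_PHASE1 | MCMODE_PHASE2

def stagesEnabled(mcmode):
    """Return the stage names selected by a (possibly combined) mcmode value."""
    stages = []
    if mcmode & MCMODE_CANDS:
        stages.append('candidates')
    if mcmode & MCMODE_PHASE1:
        stages.append('phase1')
    if mcmode & MCMODE_PHASE2:
        stages.append('phase2')
    return stages

print(stagesEnabled(MCMODE_ALL))     # ['candidates', 'phase1', 'phase2']
print(stagesEnabled(MCMODE_PHASE2))  # ['phase2']
```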
-        if len(non_ignored_observations) > self.traj_constraints.max_stations and mcmode != 2:
+        if len(non_ignored_observations) > self.traj_constraints.max_stations and mcmode != MCMODE_PHASE2:

            # Sort the observations by residuals (smallest first)
            # TODO: implement better sorting algorithm

@@ -940,6 +976,7 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N
            # Reinitialize the observations, rejecting ignored stations
            for obs in obs_selected:
                if not obs.ignore_station:
+                    #log.info(f'adding obs_id {obs.obs_id}')
                    traj.infillWithObs(obs)

@@ -951,7 +988,6 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N
        except ValueError as e:
            log.info("Error during trajectory estimation!")
            print(e)
-            self.dh.cleanupPhase2TempPickle(save_traj)
            return False

@@ -959,10 +995,12 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N
        if traj_status is None:

            # Add the trajectory to the list of failed trajectories
-            if mcmode != 2:
-                self.dh.addTrajectory(traj, failed_jdt_ref=jdt_ref)
-                log.info('Trajectory failed to solve')
-                self.dh.cleanupPhase2TempPickle(save_traj)
+            if mcmode != MCMODE_PHASE2:
+                traj.obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is False]
+                traj.ign_obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is True]
+                self.dh.addTrajectory(traj, failed_jdt_ref=jdt_ref, verbose=verbose)
+                ref_dt = jd2Date(min([met_obs.jdt_ref for met_obs in traj.observations]), dt_obj=True)
+                log.info(f"Trajectory at {ref_dt.isoformat()} skipped and added to fails!")

            return False

@@ -975,7 +1013,6 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N
                log.info("Average velocity outside range: {:.1f} < {:.1f} < {:.1f} km/s, skipping...".format(self.traj_constraints.v_avg_min, traj.orbit.v_avg/1000, self.traj_constraints.v_avg_max))
-                self.dh.cleanupPhase2TempPickle(save_traj)
                return False

@@ -983,14 +1020,12 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N
            for obs in traj.observations:
                if (obs.rbeg_ele is None) and (not obs.ignore_station):
                    log.info("Heights from observations failed to be estimated!")
-                    self.dh.cleanupPhase2TempPickle(save_traj)
                    return False

            # Check that the orbit could be computed
            if traj.orbit.ra_g is None:
                log.info("The orbit could not be computed!")
-                self.dh.cleanupPhase2TempPickle(save_traj)
                return False

            # Set the trajectory fit as successful

@@ -1015,7 +1050,6 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N
            else:
                log.info("The orbit could not be computed!")
-                self.dh.cleanupPhase2TempPickle(save_traj)
                return False

@@ -1023,77 +1057,204 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N
        # Save the trajectory if successful.
        if successful_traj_fit:
            # restore the original traj_id so that the phase1 and phase 2 results use the same ID
-            if mcmode == 2:
+            if mcmode == MCMODE_PHASE2:
                traj.traj_id = saved_traj_id
                traj.phase_1_only = False
-            if mcmode == 1:
+            if mcmode == MCMODE_PHASE1:
                traj.phase_1_only = True

            if orig_traj:
                log.info(f"Removing the previous solution {os.path.dirname(orig_traj.traj_file_path)} ...")
-                self.dh.removeTrajectory(orig_traj)
+                manage_phase1 = True if abs(round((traj.jdt_ref-orig_traj.jdt_ref)*86400000,0)) > 0 else False
+                orig_traj.pre_mc_longname = os.path.split(self.dh.generateTrajOutputDirectoryPath(orig_traj, make_dirs=False))[-1]
+                self.dh.removeTrajectory(orig_traj, remove_phase1=manage_phase1)
+                traj.pre_mc_longname = os.path.split(self.dh.generateTrajOutputDirectoryPath(orig_traj, make_dirs=False))[-1]
+                # if we are in MCMODE Phase2, we do not want to save a new copy of the Phase1 file
+                # even if the trajectory has a slightly different ref_dt
+                if mcmode == MCMODE_PHASE2:
+                    manage_phase1 = False
+            else:
+                manage_phase1 = False
+            log.info('Saving trajectory....')
-            self.dh.saveTrajectoryResults(traj, self.traj_constraints.save_plots)
-            if mcmode != 2:
-                # we do not need to update the database for phase2
-                log.info('Updating database....')
-                self.dh.addTrajectory(traj)
+            self.dh.saveTrajectoryResults(traj, self.traj_constraints.save_plots, save_phase1=manage_phase1)

-            # Mark observations as paired in a trajectory if fit successful
-            if mcmode != 2 and matched_obs is not None:
-                for _, met_obs_temp, _ in matched_obs:
-                    self.dh.markObservationAsPaired(met_obs_temp)
+            # we do not need to update the database for phase2
+            if mcmode != MCMODE_PHASE2:
+                log.info('Updating database....')
+                traj.obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is False]
+                traj.ign_obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is True]
+                self.dh.addTrajectory(traj, verbose=verbose)
+                if matched_obs is not None:
+                    self.dh.addPairedObs(matched_obs, traj.jdt_ref, verbose=verbose)
        else:
            log.info('unable to fit trajectory')

        return successful_traj_fit

+    def mergeBrokenCandidates(self, candidate_trajectories):
+        ### Merge all candidate trajectories which share the same observations ###
+        log.info("")
+        log.info("---------------------------")
+        log.info("3) MERGING BROKEN OBSERVATIONS")
+        log.info("---------------------------")
+        log.info(f"Initially {len(candidate_trajectories)} candidates")
+        merged_candidate_trajectories = []
+        merged_indices = []
+        total_obs_used = 0
+        for i, traj_cand_ref in enumerate(candidate_trajectories):
+
+            # Skip candidate trajectories that have already been merged
+            if i in merged_indices:
+                continue
+
+            # Stop the search if the end has been reached
+            if (i + 1) == len(candidate_trajectories):
+                merged_candidate_trajectories.append(traj_cand_ref)
+                total_obs_used += len(traj_cand_ref)
+                break

-    def run(self, event_time_range=None, bin_time_range=None, mcmode=0):
-        """ Run meteor corellation using available data.
-        Keyword arguments:
-            event_time_range: [list] A list of two datetime objects. These are times between which
-                events should be used. None by default, which uses all available events.
-            mcmode: [int] flag to indicate whether or not to run monte-carlos
-        """
+            # Get the mean time of the reference observation
+            ref_mean_dt = traj_cand_ref[0][1].mean_dt

-        # a bit of logging to let readers know what we're doing
-        if mcmode == 2:
-            mcmodestr = ' - MONTE CARLO STAGE'
-        elif mcmode == 1:
-            mcmodestr = ' - SIMPLE STAGE'
-        else:
-            mcmodestr = ' '
+            obs_list_ref = [entry[1] for entry in traj_cand_ref]
+            merged_candidate = []
+
+            # Compute the mean radiant of the reference solution
+            plane_radiants_ref = [entry[2].radiant_eq for entry in traj_cand_ref]
+            ra_mean_ref = meanAngle([ra for ra, _ in plane_radiants_ref])
+            dec_mean_ref = np.mean([dec for _, dec in plane_radiants_ref])

-        if mcmode != 2:
-            # Get unpaired observations, filter out observations with too little points and sort them by time
-            unpaired_observations_all = self.dh.getUnpairedObservations()
-            unpaired_observations_all = [mettmp for mettmp in unpaired_observations_all
-                if len(mettmp.data) >= self.traj_constraints.min_meas_pts]
-            unpaired_observations_all = sorted(unpaired_observations_all, key=lambda x: x.reference_dt)
-            # Remove all observations done prior to 2000, to weed out those with bad time
-            unpaired_observations_all = [met_obs for met_obs in unpaired_observations_all
-                if met_obs.reference_dt > datetime.datetime(2000, 1, 1, 0, 0, 0, tzinfo=datetime.timezone.utc)]
+            # Check for pairs
+            found_first_pair = False
+            for j, traj_cand_test in enumerate(candidate_trajectories[(i + 1):]):
+                # Skip same observations
+                if traj_cand_ref[0] == traj_cand_test[0]:
+                    continue

-            # Normalize all reference times and time data so that the reference time is at t = 0 s
-            for met_obs in unpaired_observations_all:
-                # Correct the reference time
-                t_zero = met_obs.data[0].time_rel
-                met_obs.reference_dt = met_obs.reference_dt + datetime.timedelta(seconds=t_zero)
+                # Get the mean time of the test observation
+                test_mean_dt = traj_cand_test[0][1].mean_dt

-                # Normalize all observation times so that the first time is t = 0 s
-                for i in range(len(met_obs.data)):
-                    met_obs.data[i].time_rel -= t_zero
+                # Make sure the observations that are being compared are within the time window
+                time_diff = (test_mean_dt - ref_mean_dt).total_seconds()
+                if abs(time_diff) > self.traj_constraints.max_toffset:
+                    continue
+                # Break the search if the time went beyond the search. This can be done as observations
+                # are ordered in time
+                if time_diff > self.traj_constraints.max_toffset:
+                    break
+
+
+
+                # Create a list of observations
+                obs_list_test = [entry[1] for entry in traj_cand_test]
+
+                # Check if there are any common observations between candidate trajectories and merge them
+                # if that is the case
+                found_match = False
+                test_ids = [x.id for x in obs_list_test]
+                for obs1 in obs_list_ref:
+                    if obs1.id in test_ids:
+                        found_match = True
+                        break
+
+
+                # Compute the mean radiant of the reference solution
+                plane_radiants_test = [entry[2].radiant_eq for entry in traj_cand_test]
+                ra_mean_test = meanAngle([ra for ra, _ in plane_radiants_test])
+                dec_mean_test = np.mean([dec for _, dec in plane_radiants_test])
+
+                # Skip the merging attempt if the estimated radiants are too far off
+                if np.degrees(angleBetweenSphericalCoords(dec_mean_ref, ra_mean_ref, dec_mean_test, ra_mean_test)) > self.traj_constraints.max_merge_radiant_angle:
+                    continue
+
+
+                # Add the candidate trajectory to the common list if a match has been found
+                if found_match:
+
+                    ref_stations = [obs.station_code for obs in obs_list_ref]
+
+                    # Add observations that weren't present in the reference candidate
+                    for entry in traj_cand_test:
+
+                        # Make sure the added observation is not already added
+                        if entry[1] not in obs_list_ref:
+
+                            # Print the reference and the merged radiants
+                            if not found_first_pair:
+                                log.info("")
+                                log.info("------")
+                                log.info("Reference time: {:s}".format(str(ref_mean_dt)))
+                                log.info("Reference stations: {:s}".format(", ".join(sorted(ref_stations))))
+                                log.info("Reference radiant: RA = {:.2f}, Dec = {:.2f}".format(np.degrees(ra_mean_ref), np.degrees(dec_mean_ref)))
+                                log.info("")
+                                found_first_pair = True
+
+                            log.info("Merging: {:s} {:s}".format(str(entry[1].mean_dt), str(entry[1].station_code)))
+                            traj_cand_ref.append(entry)
+
+                    log.info("Merged radiant: RA = {:.2f}, Dec = {:.2f}".format(np.degrees(ra_mean_test), np.degrees(dec_mean_test)))
+
+                # Mark that the current index has been processed
+                merged_indices.append(i + j + 1)
+
+            # Add the reference candidate observations to the list
+            merged_candidate += traj_cand_ref
+            total_obs_used += len(traj_cand_ref)
+
+            # Add the merged observation to the final list
+            merged_candidate_trajectories.append(merged_candidate)
+
+        log.info(f"After merging, there are {len(merged_candidate_trajectories)} candidates")
+        return merged_candidate_trajectories, total_obs_used
+
+
+    def run(self, event_time_range=None, bin_time_range=None, mcmode=MCMODE_ALL, verbose=False):
+        """ Run meteor correlation using available data.
+
+        Keyword arguments:
+            event_time_range: [list] A list of two datetime objects. These are times between which
+                events should be used. None by default, which uses all available events.
+            mcmode: [int] flag to indicate whether or not to run monte-carlos
+        """
+
+        # a bit of logging to let readers know what we're doing
+        mcmodestr = getMcModeStr(mcmode, strtype=1)
+
+        if mcmode != MCMODE_PHASE2:
+            if mcmode & MCMODE_CANDS:
+                # Get unpaired observations, filter out observations with too few points and sort them by time
+                unpaired_observations_all = self.dh.getUnpairedObservations()
+                unpaired_observations_all = [mettmp for mettmp in unpaired_observations_all
+                    if len(mettmp.data) >= self.traj_constraints.min_meas_pts]
+                unpaired_observations_all = sorted(unpaired_observations_all, key=lambda x: x.reference_dt)
+
+                # Remove all observations done prior to 2000, to weed out those with bad time
+                unpaired_observations_all = [met_obs for met_obs in unpaired_observations_all
+                    if met_obs.reference_dt > datetime.datetime(2000, 1, 1, 0, 0, 0, tzinfo=datetime.timezone.utc)]
+
+                # Normalize all reference times and time data so that the reference time is at t = 0 s
+                for met_obs in unpaired_observations_all:
+
+                    # Correct the reference time
+                    t_zero = met_obs.data[0].time_rel
+                    met_obs.reference_dt = met_obs.reference_dt + datetime.timedelta(seconds=t_zero)
+
+                    # Normalize all observation times so that the first time is t = 0 s
+                    for i in range(len(met_obs.data)):
+                        met_obs.data[i].time_rel -= t_zero
+            else:
+                event_time_range = self.dh.dt_range

        # If the time range was given, only use the events in that time range
        if event_time_range:

@@ -1104,11 +1265,17 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0):
        # Data will be divided into time bins, so the pairing function doesn't have to go pair many
        # observations at once and keep all pairs in memory
        else:
-            dt_beg = unpaired_observations_all[0].reference_dt
-            dt_end = unpaired_observations_all[-1].reference_dt
+            if mcmode & MCMODE_CANDS:
+                dt_beg = unpaired_observations_all[0].reference_dt
+                dt_end = unpaired_observations_all[-1].reference_dt
+                bin_days = 0.25
+            else:
+                dt_beg, dt_end = self.dh.dt_range
+                bin_days = 1
+
            dt_bin_list = generateDatetimeBins(
                dt_beg, dt_end,
-                bin_days=1, utc_hour_break=12, tzinfo=datetime.timezone.utc, reverse=False
+                bin_days=bin_days, utc_hour_break=12, tzinfo=datetime.timezone.utc, reverse=False
            )

        else:

@@ -1126,6 +1293,7 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0):
            log.info("---------------------------------")
            log.info("")

+        log.info(f'mcmode is {mcmodestr}')
        # Go though all time bins and split the list of observations
        for bin_beg, bin_end in dt_bin_list:

@@ -1133,426 +1301,354 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0):
            traj_solved_count = 0

            # if we're in MC mode 0 or 1 we have to find the candidate trajectories
-            if mcmode < 2:
-                log.info("")
-                log.info("-----------------------------------")
-                log.info(" PAIRING TRAJECTORIES IN TIME BIN:")
-                log.info("    BIN BEG: {:s} UTC".format(str(bin_beg)))
-                log.info("    BIN END: {:s} UTC".format(str(bin_end)))
-                log.info("-----------------------------------")
-                log.info("")
-
-
-                # Select observations in the given time bin
-                unpaired_observations = [met_obs for met_obs in unpaired_observations_all
-                    if (met_obs.reference_dt >= bin_beg) and (met_obs.reference_dt <= bin_end)]
-
-                log.info(f'Analysing {len(unpaired_observations)} observations...')
-
-                ### CHECK FOR PAIRING WITH PREVIOUSLY ESTIMATED TRAJECTORIES ###
-
-                log.info("")
-                log.info("--------------------------------------------------------------------------")
-                log.info(" 1) CHECKING IF PREVIOUSLY ESTIMATED TRAJECTORIES HAVE NEW OBSERVATIONS")
-                log.info("--------------------------------------------------------------------------")
-                log.info("")
-
-                # Get a list of all already computed trajectories within the given time bin
-                # Reducted trajectory objects are returned
-
-                if bin_time_range:
-                    # restrict checks to the bin range supplied to run() plus a day to allow for data upload times
-                    log.info(f'Getting computed trajectories for bin {str(bin_time_range[0])} to {str(bin_time_range[1])}')
-                    computed_traj_list = self.dh.getComputedTrajectories(datetime2JD(bin_time_range[0]), datetime2JD(bin_time_range[1])+1)
-                else:
-                    # use the current bin.
-                    log.info(f'Getting computed trajectories for {str(bin_beg)} to {str(bin_end)}')
-                    computed_traj_list = self.dh.getComputedTrajectories(datetime2JD(bin_beg), datetime2JD(bin_end))
-
-                # Find all unpaired observations that match already existing trajectories
-                for traj_reduced in computed_traj_list:
-
-                    # If the trajectory already has more than the maximum number of stations, skip it
-                    if len(traj_reduced.participating_stations) >= self.traj_constraints.max_stations:
-
-                        log.info(
-                            "Trajectory {:s} has already reached the maximum number of stations, "
-                            "skipping...".format(
-                                str(jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc))))
-
-                        # TODO DECIDE WHETHER WE ACTUALLY WANT TO DO THIS
-                        # the problem is that we could end up with unpaired observations that form a new trajectory instead of
-                        # being added to an existing one
-                        continue
-
-                    # Get all unprocessed observations which are close in time to the reference trajectory
-                    traj_time_pairs = self.dh.getTrajTimePairs(traj_reduced, unpaired_observations,
-                        self.traj_constraints.max_toffset)
-
-                    # Skip trajectory if there are no new obervations
-                    if not traj_time_pairs:
-                        continue
-
-
+            if mcmode != MCMODE_PHASE2:
+                ## we are in candidatemode 0 or 1 and want to find candidates
+                if mcmode & MCMODE_CANDS:
+                    log.info("")
+                    log.info("-----------------------------------")
+                    log.info("0) PAIRING TRAJECTORIES IN TIME BIN:")
+                    log.info("    BIN BEG: {:s} UTC".format(str(bin_beg)))
+                    log.info("    BIN END: {:s} UTC".format(str(bin_end)))
+                    log.info("-----------------------------------")
                    log.info("")
-                    log.info("Checking trajectory at {:s} in countries: {:s}".format(
-                        str(jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc)),
-                        ", ".join(list(set([stat_id[:2] for stat_id in traj_reduced.participating_stations])))))
-                    log.info("--------")
-
-
-                    # Filter out bad matches and only keep the good ones
-                    candidate_observations = []
-                    traj_full = None
-                    skip_traj_check = False
-                    for met_obs in traj_time_pairs:
-
-                        log.info("Candidate observation: {:s}".format(met_obs.station_code))
-
-                        platepar = self.dh.getPlatepar(met_obs)
-
-                        # Check that the trajectory beginning and end are within the distance limit
-                        if not self.trajectoryRangeCheck(traj_reduced, platepar):
-                            continue
-
-
-                        # Check that the trajectory is within the field of view
-                        if not self.trajectoryInFOV(traj_reduced, platepar):
-                            continue
-
-
-                        # Load the full trajectory object
-                        if traj_full is None:
-                            traj_full = self.dh.loadFullTraj(traj_reduced)
-
-                        # If the full trajectory couldn't be loaded, skip checking this trajectory
-                        if traj_full is None:
-
-                            skip_traj_check = True
-                            break
-
-
-                        ### Do a rough trajectory solution and perform a quick quality control ###
-
-                        # Init observation object using the new meteor observation
-                        obs_new = self.initObservationsObject(met_obs, platepar,
-                            ref_dt=jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc))
-
-
-                        # Get an observation from the trajectory object with the maximum convergence angle to
-                        # the reference observations
-                        obs_traj_best = None
-                        qc_max = 0.0
-                        for obs_tmp in traj_full.observations:
-
-                            # Compute the plane intersection between the new and one of trajectory observations
-                            pi = PlaneIntersection(obs_new, obs_tmp)
-
-                            # Take the observation with the maximum convergence angle
-                            if (obs_traj_best is None) or (pi.conv_angle > qc_max):
-                                qc_max = pi.conv_angle
-                                obs_traj_best = obs_tmp
-
-
-                        # Do a quick trajectory solution and perform sanity checks
-                        plane_intersection = self.quickTrajectorySolution(obs_traj_best, obs_new)
-                        if plane_intersection is None:
-                            continue
-
-                        ### ###
-
-                        candidate_observations.append([obs_new, met_obs])
-
-
-                    # Skip the candidate trajectory if it couldn't be loaded from disk
-                    if skip_traj_check:
-                        continue
-
-                    # If there are any good new observations, add them to the trajectory and re-run the solution
-                    if candidate_observations:
+                    # Select observations in the given time bin
+                    unpaired_observations = [met_obs for met_obs in unpaired_observations_all
+                        if (met_obs.reference_dt >= bin_beg) and (met_obs.reference_dt <= bin_end)]
-                        log.info("Recomputing trajectory with new observations from stations:")
+                    total_unpaired = len(unpaired_observations)
+                    log.info(f'Analysing {total_unpaired} observations in this bucket...')
+                    num_obs_paired = 0
-                        # Add new observations to the trajectory object
-                        for obs_new, _ in candidate_observations:
-                            log.info(obs_new.station_id)
-                            traj_full.infillWithObs(obs_new)
+                    # List of all candidate trajectories
+                    candidate_trajectories = []
+                    ### CHECK FOR PAIRING WITH PREVIOUSLY ESTIMATED TRAJECTORIES ###
+                    if total_unpaired > 0:
+                        log.info("")
+                        log.info("--------------------------------------------------------------------------")
+                        log.info(" 1) CHECKING IF PREVIOUSLY ESTIMATED TRAJECTORIES HAVE NEW OBSERVATIONS")
+                        log.info("--------------------------------------------------------------------------")
+                        log.info("")
-                        # Re-run the trajectory fit
-                        # pass in orig_traj here so that it can be deleted from disk if the new solution succeeds
-                        successful_traj_fit = self.solveTrajectory(traj_full, traj_full.mc_runs, mcmode=mcmode, orig_traj=traj_reduced)
+                        # Get a list of all already computed trajectories within the given time bin
+                        # Reduced trajectory objects are returned
-                        # If the new trajectory solution succeeded, remove the now-paired observations
-                        if successful_traj_fit:
-
-                            log.info("Remove paired observations from the processing list...")
-                            for _, met_obs_temp in candidate_observations:
-                                self.dh.markObservationAsPaired(met_obs_temp)
-                                unpaired_observations.remove(met_obs_temp)
-
+                        if bin_time_range:
+                            # restrict checks to the bin range supplied to run() plus a day to allow for data upload times
+                            log.info(f'Getting computed trajectories for bin {str(bin_time_range[0])} to {str(bin_time_range[1])}')
+                            computed_traj_list = self.dh.getComputedTrajectories(datetime2JD(bin_time_range[0]), datetime2JD(bin_time_range[1])+1)
                        else:
-                            log.info("New trajectory solution failed, keeping the old trajectory...")
-
-                        ### ###
+                            # use the current bin.
+                            log.info(f'Getting computed trajectories for {str(bin_beg)} to {str(bin_end)}')
+                            computed_traj_list = self.dh.getComputedTrajectories(datetime2JD(bin_beg), datetime2JD(bin_end))
+                        # Find all unpaired observations that match already existing trajectories
+                        for traj_reduced in computed_traj_list:
-                log.info("")
-                log.info("-------------------------------------------------")
-                log.info(" 2) PAIRING OBSERVATIONS INTO NEW TRAJECTORIES")
-                log.info("-------------------------------------------------")
-                log.info("")
+                            # If the trajectory already has more than the maximum number of stations, skip it
+                            if len(traj_reduced.participating_stations) >= self.traj_constraints.max_stations:
-                # List of all candidate trajectories
-                candidate_trajectories = []
+                                log.info(
+                                    "Trajectory {:s} has already reached the maximum number of stations, "
+                                    "skipping...".format(
+                                        str(jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc))))
-                # Go through all unpaired and unprocessed meteor observations
-                for met_obs in unpaired_observations:
-
-                    # Skip observations that were processed in the meantime
-                    if met_obs.processed:
-                        continue
+                                # TODO DECIDE WHETHER WE ACTUALLY WANT TO DO THIS
+                                # the problem is that we could end up with unpaired observations that form a new trajectory instead of
+                                # being added to an existing one
+                                continue
+
+                            # Get all unprocessed observations which are close in time to the reference trajectory
+                            traj_time_pairs = self.dh.getTrajTimePairs(traj_reduced, unpaired_observations,
+                                self.traj_constraints.max_toffset)
-                    # Get station platepar
-                    reference_platepar = self.dh.getPlatepar(met_obs)
-                    obs1 = self.initObservationsObject(met_obs, reference_platepar)
+                            # Skip trajectory if there are no new observations
+                            if not traj_time_pairs:
+                                continue
-                    # Keep a list of observations which matched the reference observation
-                    matched_observations = []
+                            log.info("")
+                            log.info("Checking trajectory at {:s} in countries: {:s}".format(
+                                str(jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc)),
+                                ", ".join(list(set([stat_id[:2] for stat_id in traj_reduced.participating_stations])))))
+                            log.info("--------")
-                    # Find all meteors from other stations that are close in time to this meteor
-                    plane_intersection_good = None
-                    time_pairs = self.dh.findTimePairs(met_obs, unpaired_observations,
-                        self.traj_constraints.max_toffset)
-                    for met_pair_candidate in time_pairs:
-                        log.info("")
-                        log.info("Processing pair:")
-                        log.info("{:s} and {:s}".format(met_obs.station_code, met_pair_candidate.station_code))
-                        log.info("{:s} and {:s}".format(str(met_obs.reference_dt), str(met_pair_candidate.reference_dt)))
-                        log.info("-----------------------")
+                            # Filter out bad matches and only keep the good ones
+                            candidate_observations = []
+                            traj_full = None
+                            skip_traj_check = False
+                            for met_obs in traj_time_pairs:
-                        ### Check if the stations are close enough and have roughly overlapping fields of view ###
+                                log.info("Candidate observation: {:s}".format(met_obs.station_code))
-                        # Get candidate station platepar
-                        candidate_platepar = self.dh.getPlatepar(met_pair_candidate)
+                                platepar = self.dh.getPlatepar(met_obs)
-                        # Check if the stations are within range
-                        if not self.stationRangeCheck(reference_platepar, candidate_platepar):
-                            continue
+                                # Check that the trajectory beginning and end are within the distance limit
+                                if not self.trajectoryRangeCheck(traj_reduced, platepar):
+                                    continue
-                        # Check the FOV overlap
-                        if not self.checkFOVOverlap(reference_platepar, candidate_platepar):
-                            log.info("Station FOV does not overlap: {:s} and {:s}".format(met_obs.station_code,
-                                met_pair_candidate.station_code))
-                            continue
-
-                        ### ###
+                                # Check that the trajectory is within the field of view
+                                if not self.trajectoryInFOV(traj_reduced, platepar):
+                                    continue
+                                # Load the full trajectory object
+                                if traj_full is None:
+                                    traj_full = self.dh.loadFullTraj(traj_reduced)
-                        ### Do a rough trajectory solution and perform a quick quality control ###
+                                # If the full trajectory couldn't be loaded, skip checking this trajectory
+                                if traj_full is None:
+
+                                    skip_traj_check = True
+                                    break
-                        # Init observations
-                        obs2 = self.initObservationsObject(met_pair_candidate, candidate_platepar,
-                            ref_dt=met_obs.reference_dt)
-                        # Do a quick trajectory solution and perform sanity checks
-                        plane_intersection = self.quickTrajectorySolution(obs1, obs2)
-                        if plane_intersection is None:
-                            continue
+                                ### Do a rough trajectory solution and perform a quick quality control ###
-                        else:
-                            plane_intersection_good = plane_intersection
+                                # Init observation object using the new meteor observation
+                                obs_new = self.initObservationsObject(met_obs, platepar,
+                                    ref_dt=jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc))
+                                obs_new.station_code = met_obs.station_code
+                                obs_new.mean_dt = met_obs.mean_dt
-                        ### ###
+                                # Get an observation from the trajectory object with the maximum convergence angle to
+                                # the reference observations
+                                obs_traj_best = None
+                                qc_max = 0.0
+                                for obs_tmp in traj_full.observations:
+
+                                    # Compute the plane intersection between the new and one of trajectory observations
+                                    pi = PlaneIntersection(obs_new, obs_tmp)
-                        matched_observations.append([obs2, met_pair_candidate, plane_intersection])
+                                    # Take the observation with the maximum convergence angle
+                                    if (obs_traj_best is None) or (pi.conv_angle > qc_max):
+                                        qc_max = pi.conv_angle
+                                        obs_traj_best = obs_tmp
+                                # Do a quick trajectory solution and perform sanity checks
+                                plane_intersection = self.quickTrajectorySolution(obs_traj_best, obs_new)
+                                if plane_intersection is None:
+                                    continue
-                    # If there are no matched observations, skip it
-                    if len(matched_observations) == 0:
+                                ### ###
-                        if len(time_pairs) > 0:
-                            log.info("")
-                            log.info("   --- NO MATCH ---")
+                                candidate_observations.append([obs_new, met_obs])
-                        continue
-                    # Skip if there are not good plane intersections
-                    if plane_intersection_good is None:
-                        continue
+                            # Skip the candidate trajectory if it couldn't be loaded from disk
+                            if skip_traj_check:
+                                continue
-                    # Add the first observation to matched observations
-                    matched_observations.append([obs1, met_obs, plane_intersection_good])
+                            # If there are any good new observations, add them to the trajectory and re-run the solution
+                            if candidate_observations:
-                    # Mark observations as processed
-                    for _, met_obs_temp, _ in matched_observations:
-                        met_obs_temp.processed = True
-                        self.dh.markObservationAsProcessed(met_obs_temp)
+                                log.info("Recomputing trajectory with new observations:")
+                                # Add new observations to the trajectory object
+                                for obs_new, _ in candidate_observations:
+                                    log.info(f'  {obs_new.obs_id}')
+                                    traj_full.infillWithObs(obs_new)
-                    # Store candidate trajectories
-                    log.info("")
-                    log.info("   --- ADDING CANDIDATE ---")
-                    candidate_trajectories.append(matched_observations)
+                                # Re-run the trajectory fit
+                                # pass in orig_traj here so that it can be deleted from disk if the new solution succeeds
+                                # pass the new candidates in so that they can be marked paired if the new soln succeeds
+                                # Note: mcmode must be phase1 here to force a recompute
+                                successful_traj_fit = self.solveTrajectory(traj_full, traj_full.mc_runs, mcmode=MCMODE_PHASE1,
+                                    matched_obs=candidate_observations, orig_traj=traj_reduced, verbose=verbose)
+
+                                # If the new trajectory solution succeeded, remove the now-paired observations from the in memory list
+                                if successful_traj_fit:
+                                    log.info("Remove paired observations from the processing list...")
+                                    for _, met_obs_temp in candidate_observations:
+                                        unpaired_observations.remove(met_obs_temp)
-                ### Merge all candidate trajectories which share the same observations ###
-                log.info("")
-                log.info("---------------------------")
-                log.info("MERGING BROKEN OBSERVATIONS")
-                log.info("---------------------------")
-                merged_candidate_trajectories = []
-                merged_indices = []
-                for i, traj_cand_ref in enumerate(candidate_trajectories):
-
-                    # Skip candidate trajectories that have already been merged
-                    if i in merged_indices:
-                        continue
+                                else:
+                                    log.info("New trajectory solution failed, keeping the old trajectory...")
-
-                    # Stop the search if the end has been reached
-                    if (i + 1) == len(candidate_trajectories):
-                        merged_candidate_trajectories.append(traj_cand_ref)
-                        break
+                        ### ###
-                    # Get the mean time of the reference observation
-                    ref_mean_dt = traj_cand_ref[0][1].mean_dt
+                        log.info("")
+                        log.info("-------------------------------------------------")
+                        log.info(" 2) PAIRING OBSERVATIONS INTO NEW TRAJECTORIES")
+                        log.info("-------------------------------------------------")
+                        log.info("")
-                    obs_list_ref = [entry[1] for entry in traj_cand_ref]
-                    merged_candidate = []
-                    # Compute the mean radiant of the reference solution
-                    plane_radiants_ref = [entry[2].radiant_eq for entry in traj_cand_ref]
-                    ra_mean_ref = meanAngle([ra for ra, _ in plane_radiants_ref])
-                    dec_mean_ref = np.mean([dec for _, dec in plane_radiants_ref])
+                        # Go through all unpaired and unprocessed meteor observations
+                        for met_obs in unpaired_observations:
+                            # Skip observations that were processed in the meantime
+                            if met_obs.processed:
+                                continue
-                    # Check for pairs
-                    found_first_pair = False
-                    for j, traj_cand_test in enumerate(candidate_trajectories[(i + 1):]):
+                            if self.dh.checkIfObsPaired(met_obs.id, verbose=verbose):
+                                continue
-                        # Skip same observations
-                        if traj_cand_ref[0] == traj_cand_test[0]:
-                            continue
+                            # Get station platepar
+                            reference_platepar = self.dh.getPlatepar(met_obs)
+                            obs1 = self.initObservationsObject(met_obs, reference_platepar)
-                        # Get the mean time of the test observation
-                        test_mean_dt = traj_cand_test[0][1].mean_dt
+                            # Keep a list of observations which matched the reference observation
+                            matched_observations = []
-                        # Make sure the observations that are being compared are within the time window
-                        time_diff = (test_mean_dt - ref_mean_dt).total_seconds()
-                        if abs(time_diff) > self.traj_constraints.max_toffset:
-                            continue
+                            # Find all meteors from other stations that are close in time to this meteor
+                            plane_intersection_good = None
+                            time_pairs = self.dh.findTimePairs(met_obs, unpaired_observations,
+                                self.traj_constraints.max_toffset)
+                            for met_pair_candidate in time_pairs:
+                                log.info("")
+                                log.info("Processing pair:")
+                                log.info("{:s} and {:s}".format(met_obs.station_code, met_pair_candidate.station_code))
+                                log.info("{:s} and {:s}".format(str(met_obs.reference_dt), str(met_pair_candidate.reference_dt)))
+                                log.info("-----------------------")
-                        # Break the search if the time went beyond the search. This can be done as observations
-                        # are ordered in time
-                        if time_diff > self.traj_constraints.max_toffset:
-                            break
+                                ### Check if the stations are close enough and have roughly overlapping fields of view ###
+                                # Get candidate station platepar
+                                candidate_platepar = self.dh.getPlatepar(met_pair_candidate)
+                                # Check if the stations are within range
+                                if not self.stationRangeCheck(reference_platepar, candidate_platepar):
+                                    continue
-                        # Create a list of observations
-                        obs_list_test = [entry[1] for entry in traj_cand_test]
+                                # Check the FOV overlap
+                                if not self.checkFOVOverlap(reference_platepar, candidate_platepar):
+                                    log.info("Station FOV does not overlap: {:s} and {:s}".format(met_obs.station_code,
+                                        met_pair_candidate.station_code))
+                                    continue
-                        # Check if there any any common observations between candidate trajectories and merge them
-                        # if that is the case
-                        found_match = False
-                        for obs1 in obs_list_ref:
-                            if obs1 in obs_list_test:
-                                found_match = True
-                                break
+                                ### ###
-                        # Compute the mean radiant of the reference solution
-                        plane_radiants_test = [entry[2].radiant_eq for entry in traj_cand_test]
-                        ra_mean_test = meanAngle([ra for ra, _ in plane_radiants_test])
-                        dec_mean_test = np.mean([dec for _, dec in plane_radiants_test])
-                        # Skip the mergning attempt if the estimated radiants are too far off
-                        if np.degrees(angleBetweenSphericalCoords(dec_mean_ref, ra_mean_ref, dec_mean_test, ra_mean_test)) > self.traj_constraints.max_merge_radiant_angle:
+                                ### Do a rough trajectory solution and perform a quick quality control ###
-                            continue
+                                # Init observations
+                                obs2 = self.initObservationsObject(met_pair_candidate, candidate_platepar,
+                                    ref_dt=met_obs.reference_dt)
+                                # Do a quick trajectory solution and perform sanity checks
+                                plane_intersection = self.quickTrajectorySolution(obs1, obs2)
+                                if plane_intersection is None:
+                                    continue
-                        # Add the candidate trajectory to the common list if a match has been found
-                        if found_match:
+                                else:
+                                    plane_intersection_good = plane_intersection
-                            ref_stations = [obs.station_code for obs in obs_list_ref]
+                                ### ###
-                            # Add observations that weren't present in the reference candidate
-                            for entry in traj_cand_test:
+                                matched_observations.append([obs2, met_pair_candidate, plane_intersection])
-                                # Make sure the added observation is not from a station that's already added
-                                if entry[1].station_code in ref_stations:
-                                    continue
-                                if entry[1] not in obs_list_ref:
-                                    # Print the reference and the merged radiants
-                                    if not found_first_pair:
-                                        log.info("")
-                                        log.info("------")
-                                        log.info("Reference time: {:s}".format(str(ref_mean_dt)))
-                                        log.info("Reference stations: {:s}".format(", ".join(sorted(ref_stations))))
-                                        log.info("Reference radiant: RA = {:.2f}, Dec = {:.2f}".format(np.degrees(ra_mean_ref), np.degrees(dec_mean_ref)))
-                                        log.info("")
-                                        found_first_pair = True
+                            # If there are no matched observations, skip it
+                            if len(matched_observations) == 0:
-                                    log.info("Merging: {:s} {:s}".format(str(entry[1].mean_dt), str(entry[1].station_code)))
-                                    traj_cand_ref.append(entry)
+                                if len(time_pairs) > 0:
+                                    log.info("")
+                                    log.info("   --- NO MATCH ---")
-                            log.info("Merged radiant: RA = {:.2f}, Dec = {:.2f}".format(np.degrees(ra_mean_test), np.degrees(dec_mean_test)))
+                                continue
-
+                            # Skip if there are no good plane intersections
+                            if plane_intersection_good is None:
+                                continue
-                        # Mark that the current index has been processed
-                        merged_indices.append(i + j + 1)
+                            # Add the first observation to matched observations
+                            matched_observations.append([obs1, met_obs, plane_intersection_good])
-                    # Add the reference candidate observations to the list
-                    merged_candidate += traj_cand_ref
+                            # Mark observations as processed
+                            for _, met_obs_temp, _ in matched_observations:
+                                met_obs_temp.processed = True
-                    # Add the merged observation to the final list
-                    merged_candidate_trajectories.append(merged_candidate)
+                            # Store candidate trajectory group
+                            # Note that this will include candidate groups that already failed on previous runs.
+                            # We will exclude these later - we can't do it just yet as if new data has arrived, then
+                            # in the next step, the group might be merged with another group creating a solvable set.
+                            log.info("")
+                            ref_dt = min([met_obs.reference_dt for _, met_obs, _ in matched_observations])
+                            log.info(f"   --- ADDING CANDIDATE at {ref_dt.isoformat()} ---")
+                            candidate_trajectories.append(matched_observations)
+                        # Check for mergeable candidate combinations
+                        merged_candidate_trajectories, num_obs_paired = self.mergeBrokenCandidates(candidate_trajectories)
+                        candidate_trajectories = merged_candidate_trajectories
-                candidate_trajectories = merged_candidate_trajectories
+                    log.info("-----------------------")
+                    log.info(f'There are {total_unpaired - num_obs_paired} remaining unpaired observations in this bucket.')
+                    log.info("-----------------------")
+                    # in candidate mode we want to save the candidates to disk
+                    if mcmode == MCMODE_CANDS:
+                        log.info("-----------------------")
+                        if bin_time_range:
+                            log.info(f'5) SAVING {len(candidate_trajectories)} CANDIDATES for {str(bin_time_range[0])} to {str(bin_time_range[1])}')
+                        else:
+                            log.info(f'5) SAVING {len(candidate_trajectories)} CANDIDATES for {str(bin_beg)} to {str(bin_end)}')
+                        log.info("-----------------------")
+                        # Save candidates. This will check and skip over already-processed
+                        # combinations
+                        self.dh.saveCandidates(candidate_trajectories, verbose=verbose)
-
                        return len(candidate_trajectories)
+
+                    else:
+                        log.info("-----------------------")
+                        log.info('5) PROCESSING {} CANDIDATES'.format(len(candidate_trajectories)))
+                        log.info("-----------------------")
+                    # end of 'if mcmode & MCMODE_CANDS'

                ### ###

-            else:
+                else:
+                    # candidatemode is LOAD so load any available candidates for processing
+                    traj_solved_count = 0
+                    log.info("-----------------------")
+                    log.info('6) LOADING CANDIDATES')
+                    log.info("-----------------------")
+                    candidate_trajectories = self.dh.loadCandidates(verbose=verbose)
+
+                # end of 'self.candidatemode == CANDMODE_LOAD'
+            # end of 'if mcmode != MCMODE_PHASE2'
+            else:
+                # mcmode == MCMODE_PHASE2 so we need to load the phase1 solutions
                log.info("-----------------------")
-                log.info('LOADING PHASE1 SOLUTIONS')
+                log.info('6) LOADING PHASE1 SOLUTIONS')
                log.info("-----------------------")
                candidate_trajectories = self.dh.phase1Trajectories
-            # end of "if mcmode < 2"
+            # end of "if mcmode == MCMODE_PHASE2"
+
+            # avoid reprocessing candidates that were already processed
+            num_traj = len(candidate_trajectories)

            log.info("")
            log.info("-----------------------")
-            log.info(f'SOLVING {len(candidate_trajectories)} TRAJECTORIES {mcmodestr}')
+            log.info(f'7) SOLVING {num_traj} TRAJECTORIES {mcmodestr}')
            log.info("-----------------------")
            log.info("")

            # Go through all candidate trajectories and compute the complete trajectory solution
-            for matched_observations in candidate_trajectories:
+            for i, matched_observations in enumerate(candidate_trajectories):

                log.info("")
                log.info("-----------------------")
-
+                cand_id = self.dh.getCandidateId(matched_observations) if mcmode==MCMODE_PHASE1 else ''
+
log.info(f'processing {"candidate" if mcmode==MCMODE_PHASE1 else "trajectory"} {cand_id} {i+1}/{num_traj}') # if mcmode is not 2, prepare to calculate the intersecting planes solutions - if mcmode != 2: + if mcmode != MCMODE_PHASE2: # Find unique station counts station_counts = np.unique([entry[1].station_code for entry in matched_observations], return_counts=True) @@ -1609,10 +1705,9 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # Print info about observations which are being solved log.info("") - log.info("Observations:") - for entry in matched_observations: - obs, met_obs, _ = entry - log.info(f'{met_obs.station_code} - {met_obs.mean_dt} - {obs.ignore_station}') + log.info("Observations and ignore flag:") + for obs, _, _ in matched_observations: + log.info(f' {obs.obs_id} - {obs.ignore_station}') @@ -1622,6 +1717,23 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): log.info("Max convergence angle too small: {:.1f} < {:.1f} deg".format(qc_max, self.traj_constraints.min_qc)) + # create a traj object to add to the failed database so we don't try to recompute this one again + ref_dt = min([met_obs.reference_dt for _, met_obs, _ in matched_observations]) + jdt_ref = datetime2JD(ref_dt) + + failed_traj = self.initTrajectory(jdt_ref, 0, verbose=verbose) + for obs_temp, met_obs, _ in matched_observations: + failed_traj.infillWithObs(obs_temp) + + failed_traj.obs_ids = [obs_temp.obs_id for obs_temp, _,_ in matched_observations] + + t0 = min([obs.time_data[0] for obs in failed_traj.observations if (not obs.ignore_station) + or (not np.all(obs.ignore_list))]) + if t0 != 0.0: + failed_traj.jdt_ref = failed_traj.jdt_ref + t0/86400.0 + + log.info(f"Trajectory at {ref_dt.isoformat()} skipped and added to fails!") + self.dh.addTrajectory(failed_traj, failed_traj.jdt_ref, verbose=verbose) continue @@ -1649,20 +1761,26 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # Init the solver (use the earliest date as 
the reference) - ref_dt = min([met_obs.reference_dt for _, met_obs, _ in matched_observations]) - jdt_ref = datetime2JD(ref_dt) - traj = self.initTrajectory(jdt_ref, mc_runs, verbose=False) + jdt_ref = min([obs_temp.jdt_ref for obs_temp, _, _ in matched_observations]) + + #log.info(f'ref_dt {jd2Date(jdt_ref, dt_obj=True)}') + traj = self.initTrajectory(jdt_ref, mc_runs, verbose=verbose) # Feed the observations into the trajectory solver for obs_temp, met_obs, _ in matched_observations: # Normalize the observations to the reference Julian date - jdt_ref_curr = datetime2JD(met_obs.reference_dt) + jdt_ref_curr = obs_temp.jdt_ref # datetime2JD(met_obs.reference_dt) obs_temp.time_data += (jdt_ref_curr - jdt_ref)*86400 - + # we have normalised the time data to jdt_ref, now we need to reset jdt_ref for each obs too + obs_temp.jdt_ref = jdt_ref + obs_temp.obs_id = obs_temp.id traj.infillWithObs(obs_temp) + traj.obs_ids = [obs.obs_id for obs, _,_ in matched_observations if obs.ignore_station is False] + traj.ign_obs_ids = [obs.obs_id for obs, _,_ in matched_observations if obs.ignore_station is True] + ### Recompute the reference JD and all times so that the first time starts at 0 ### # Determine the first relative time from reference JD @@ -1671,29 +1789,30 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # If the first time is not 0, normalize times so that the earliest time is 0 if t0 != 0.0: - + #log.info(f'adjusting by {t0}') # Offset all times by t0 for i in range(len(traj.observations)): traj.observations[i].time_data -= t0 - + # log.info(f'obs jdt_ref is {jd2Date(traj.observations[i].jdt_ref, dt_obj=True)}') # Recompute the reference JD to corresponds with t0 traj.jdt_ref = traj.jdt_ref + t0/86400.0 + #log.info(f'ref_dt {jd2Date(traj.jdt_ref, dt_obj=True)}') # If this trajectory already failed to be computed, don't try to recompute it again unless # new observations are added if self.dh.checkTrajIfFailed(traj): log.info("The same trajectory 
already failed to be computed in previous runs!") continue - # pass in matched_observations here so that solveTrajectory can mark them paired if they're used - result = self.solveTrajectory(traj, mc_runs, mcmode=mcmode, matched_obs=matched_observations) + # pass in matched_observations here so that we can mark them paired if they're used + result = self.solveTrajectory(traj, mc_runs, mcmode=mcmode, matched_obs=matched_observations, verbose=verbose) traj_solved_count += int(result) - # end of if mcmode != 2 + # end of if mcmode != MCMODE_PHASE2 else: - # mcmode is 2 and so we have a list of trajectories that were solved in phase 1 + # mcmode is MCMODE_PHASE2 and so we have a list of trajectories that were solved in phase 1 # to prepare for monte-carlo solutions traj = matched_observations @@ -1717,18 +1836,18 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # This will increase the number of MC runs while keeping the processing time the same mc_runs = int(np.ceil(mc_runs/self.traj_constraints.mc_cores)*self.traj_constraints.mc_cores) - # pass in matched_observations here so that solveTrajectory can mark them paired if they're used - result = self.solveTrajectory(traj, mc_runs, mcmode=mcmode, matched_obs=matched_observations, orig_traj=traj) + # pass in matched_observations here so that we can mark them unpaired if the solver fails + result = self.solveTrajectory(traj, mc_runs, mcmode=mcmode, matched_obs=matched_observations, orig_traj=traj, verbose=verbose) traj_solved_count += int(result) # end of "for matched_observations in candidate_trajectories" outcomes = [traj_solved_count] - # Finish the correlation run (update the database with new values) - self.dh.saveDatabase() log.info(f'SOLVED {sum(outcomes)} TRAJECTORIES') log.info("") log.info("-----------------") log.info("SOLVING RUN DONE!") log.info("-----------------") + + return sum(outcomes) diff --git a/wmpl/Trajectory/CorrelateRMS.py b/wmpl/Trajectory/CorrelateRMS.py index 
88c11292..89831693 100644 --- a/wmpl/Trajectory/CorrelateRMS.py +++ b/wmpl/Trajectory/CorrelateRMS.py @@ -20,14 +20,20 @@ import pandas as pd from dateutil.relativedelta import relativedelta import numpy as np +import sys +import secrets from wmpl.Formats.CAMS import loadFTPDetectInfo -from wmpl.Trajectory.CorrelateEngine import TrajectoryCorrelator, TrajectoryConstraints +from wmpl.Trajectory.CorrelateEngine import TrajectoryCorrelator, TrajectoryConstraints, getMcModeStr from wmpl.Utils.Math import generateDatetimeBins from wmpl.Utils.OSTools import mkdirP from wmpl.Utils.Pickling import loadPickle, savePickle from wmpl.Utils.TrajConversions import datetime2JD, jd2Date -from wmpl.Utils.remoteDataHandling import collectRemoteTrajectories, moveRemoteTrajectories, uploadTrajToRemote +from wmpl.Utils.remoteDataHandling import RemoteDataHandler +from wmpl.Trajectory.CorrelateDB import ObservationsDatabase, TrajectoryDatabase, CandidateDatabase +# from wmpl.Trajectory.Trajectory import Trajectory + +from wmpl.Trajectory.CorrelateEngine import MCMODE_CANDS, MCMODE_PHASE1, MCMODE_PHASE2, MCMODE_ALL, MCMODE_BOTH ### CONSTANTS ### @@ -77,6 +83,10 @@ def __init__(self, traj_file_path, json_dict=None, traj_obj=None): except FileNotFoundError: log.info("Pickle file not found: " + traj_file_path) return None + + except: + log.info("Pickle file could not be loaded: " + traj_file_path) + return None else: @@ -84,7 +94,6 @@ def __init__(self, traj_file_path, json_dict=None, traj_obj=None): traj = traj_obj self.traj_file_path = os.path.join(traj.output_dir, traj.file_name + "_trajectory.pickle") - # Reference Julian date (beginning of the meteor) self.jdt_ref = traj.jdt_ref @@ -138,21 +147,25 @@ def __init__(self, traj_file_path, json_dict=None, traj_obj=None): if hasattr(traj, 'traj_id'): self.traj_id = traj.traj_id + self.obs_ids = None + if hasattr(traj, 'obs_ids'): + self.obs_ids = traj.obs_ids + self.ign_obs_ids = None + if hasattr(traj, 'ign_obs_ids'): + self.ign_obs_ids = 
traj.ign_obs_ids + # Load values from a dictionary else: + if not hasattr(json_dict, 'obs_ids'): + json_dict['obs_ids'] = None self.__dict__ = json_dict - class DatabaseJSON(object): def __init__(self, db_file_path, verbose=False): self.db_file_path = db_file_path - # List of processed directories (keys are station codes, values are relative paths to night - # directories) - self.processed_dirs = {} - # List of paired observations as a part of a trajectory (keys are station codes, values are unique # observation IDs) self.paired_obs = {} @@ -168,7 +181,6 @@ def __init__(self, db_file_path, verbose=False): # Load the database from a JSON file self.load(verbose=verbose) - def load(self, verbose=False): """ Load the database from a JSON file. """ @@ -202,14 +214,14 @@ def load(self, verbose=False): # Overwrite the database path with the saved one self.db_file_path = db_file_path_saved - if db_is_ok: + # if the trajectories attribute is not present, then the database has been converted to sqlite + if db_is_ok and hasattr(self, 'trajectories'): # Convert trajectories from JSON to TrajectoryReduced objects for traj_dict_str in ["trajectories", "failed_trajectories"]: traj_dict = getattr(self, traj_dict_str) trajectories_obj_dict = {} for traj_json in traj_dict: traj_reduced_tmp = TrajectoryReduced(None, json_dict=traj_dict[traj_json]) - trajectories_obj_dict[traj_reduced_tmp.jdt_ref] = traj_reduced_tmp # Set the trajectory dictionary @@ -219,159 +231,6 @@ def load(self, verbose=False): self.verbose = verbose - def save(self): - """ Save the database of processed meteors to disk. 
""" - - # Back up the existing data base - db_bak_file_path = self.db_file_path + ".bak" - if os.path.exists(self.db_file_path): - shutil.copy2(self.db_file_path, db_bak_file_path) - - # Save the data base - try: - with open(self.db_file_path, 'w') as f: - self2 = copy.deepcopy(self) - - # Convert reduced trajectory objects to JSON objects - self2.trajectories = {key: self.trajectories[key].__dict__ for key in self.trajectories} - self2.failed_trajectories = {key: self.failed_trajectories[key].__dict__ - for key in self.failed_trajectories} - if hasattr(self2, 'phase1Trajectories'): - delattr(self2, 'phase1Trajectories') - - f.write(json.dumps(self2, default=lambda o: o.__dict__, indent=4, sort_keys=True)) - - # Remove the backup file - if os.path.exists(db_bak_file_path): - os.remove(db_bak_file_path) - - except Exception as e: - log.warning('unable to save the database, likely corrupt data') - shutil.copy2(db_bak_file_path, self.db_file_path) - log.warning(e) - - def addProcessedDir(self, station_name, rel_proc_path): - """ Add the processed directory to the list. """ - - if station_name in self.processed_dirs: - if rel_proc_path not in self.processed_dirs[station_name]: - self.processed_dirs[station_name].append(rel_proc_path) - - - def addPairedObservation(self, met_obs): - """ Mark the given meteor observation as paired in a trajectory. """ - - if met_obs.station_code not in self.paired_obs: - self.paired_obs[met_obs.station_code] = [] - - if met_obs.id not in self.paired_obs[met_obs.station_code]: - self.paired_obs[met_obs.station_code].append(met_obs.id) - - - def checkObsIfPaired(self, met_obs): - """ Check if the given observation has been paired to a trajectory or not. 
""" - - if met_obs.station_code in self.paired_obs: - return (met_obs.id in self.paired_obs[met_obs.station_code]) - - else: - return False - - - def checkTrajIfFailed(self, traj): - """ Check if the given trajectory has been computed with the same observations and has failed to be - computed before. - - """ - - # Check if the reference time is in the list of failed trajectories - if traj.jdt_ref in self.failed_trajectories: - - # Get the failed trajectory object - failed_traj = self.failed_trajectories[traj.jdt_ref] - - # Check if the same observations participate in the failed trajectory as in the trajectory that - # is being tested - all_match = True - for obs in traj.observations: - - if not ((obs.station_id in failed_traj.participating_stations) or (obs.station_id in failed_traj.ignored_stations)): - - all_match = False - break - - # If the same stations were used, the trajectory estimation failed before - if all_match: - return True - - - return False - - - def addTrajectory(self, traj_file_path, traj_obj=None, failed=False): - """ Add a computed trajectory to the list. - - Arguments: - traj_file_path: [str] Full path the trajectory object. - - Keyword arguments: - traj_obj: [bool] Instead of loading a traj object from disk, use the given object. - failed: [bool] Add as a failed trajectory. False by default. 
- """ - - # Load the trajectory from disk - if traj_obj is None: - - # Init the reduced trajectory object - traj_reduced = TrajectoryReduced(traj_file_path) - if self.verbose: - log.info(f' loaded {traj_file_path}, traj_id {traj_reduced.traj_id}') - # Skip if failed - if traj_reduced is None: - return None - - if not hasattr(traj_reduced, "jdt_ref"): - return None - - else: - # Use the provided trajectory object - traj_reduced = traj_obj - if self.verbose: - log.info(f' loaded {traj_obj.traj_file_path}, traj_id {traj_reduced.traj_id}') - - - # Choose to which dictionary the trajectory will be added - if failed: - traj_dict = self.failed_trajectories - - else: - traj_dict = self.trajectories - - - # Add the trajectory to the list (key is the reference JD) - if traj_reduced.jdt_ref not in traj_dict: - traj_dict[traj_reduced.jdt_ref] = traj_reduced - else: - traj_dict[traj_reduced.jdt_ref].traj_id = traj_reduced.traj_id - - - - def removeTrajectory(self, traj_reduced, keepFolder=False): - """ Remove the trajectory from the data base and disk. """ - - # Remove the trajectory data base entry - if traj_reduced.jdt_ref in self.trajectories: - del self.trajectories[traj_reduced.jdt_ref] - - # Remove the trajectory folder on the disk - if not keepFolder and os.path.isfile(traj_reduced.traj_file_path): - traj_dir = os.path.dirname(traj_reduced.traj_file_path) - shutil.rmtree(traj_dir, ignore_errors=True) - if os.path.isfile(traj_reduced.traj_file_path): - log.info(f'unable to remove {traj_dir}') - - - class MeteorPointRMS(object): def __init__(self, frame, time_rel, x, y, ra, dec, azim, alt, mag): """ Container for individual meteor picks. 
""" @@ -399,7 +258,6 @@ def __init__(self, frame, time_rel, x, y, ra, dec, azim, alt, mag): self.mag = mag - class MeteorObsRMS(object): def __init__(self, station_code, reference_dt, platepar, data, rel_proc_path, ff_name=None): """ Container for meteor observations with the interface compatible with the trajectory correlator @@ -505,6 +363,7 @@ def __init__(self, station_code, reference_dt, platepar, data, rel_proc_path, ff checksum = int(np.sum([entry.x for entry in self.data]) % 10000) self.id = "{:s}_{:s}_{:04d}".format(self.station_code, self.mean_dt.strftime("%Y%m%d-%H%M%S.%f"), checksum) + self.obs_id = self.id @@ -517,7 +376,8 @@ def __init__(self, **entries): class RMSDataHandle(object): - def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode=0, max_trajs=1000, remotehost=None, verbose=False): + def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode=MCMODE_ALL, max_trajs=1000, + verbose=False, archivemonths=3, auto=False, max_toffset=10): """ Handles data interfacing between the trajectory correlator and RMS data files on disk. Arguments: @@ -530,12 +390,22 @@ def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode database file will be loaded from the dir_path. output_dir: [str] Path to the directory where the output files will be saved. None by default, in which case the output files will be saved in the dir_path. + mcmode: [int] the operation mode, candidates, phase1 simple solns, mc phase or a combination max_trajs: [int] maximum number of phase1 trajectories to load at a time when adding uncertainties. Improves throughput. """ self.mc_mode = mcmode + self.auto_mode = auto + + # max diff between observations - used when loading observations to make sure we don't miss any + # towards the end of the time bucket + self.max_toffset = max_toffset + self.dir_path = dir_path + # create the data directory. 
Of course, if the folder doesnt exist there is nothing to process + # but by creating it we avoid an Exception later. And we can always copy data in. + mkdirP(dir_path) self.dt_range = dt_range @@ -559,15 +429,29 @@ def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode # Create the output directory if it doesn't exist mkdirP(self.output_dir) - # Phase 1 trajectory pickle directory needed to reload previous results. + if dt_range is None or dt_range[0] == datetime.datetime(2000,1,1,0,0,0).replace(tzinfo=datetime.timezone.utc): + daysback = 14 + else: + daysback = (datetime.datetime.now().replace(tzinfo=datetime.timezone.utc) - dt_range[0]).days + 1 + + # Candidate directory, if running in create or load cands modes + self.candidate_dir = os.path.join(self.output_dir, 'candidates') + mkdirP(os.path.join(self.candidate_dir, 'processed')) + + # Phase 1 trajectory pickle directory needed to reload previous results when running phase2. self.phase1_dir = os.path.join(self.output_dir, 'phase1') + mkdirP(os.path.join(self.phase1_dir, 'processed')) - # create the directory for phase1 simple trajectories, if needed - if self.mc_mode > 0: - mkdirP(os.path.join(self.phase1_dir, 'processed')) - self.purgePhase1ProcessedData(os.path.join(self.phase1_dir, 'processed')) + # Clear down candidates older than daysback days to save space + num_removed_cands = self.purgeProcessedData(os.path.join(self.candidate_dir, 'processed'), days_back=daysback, verbose=verbose) + log.info(f'removed {num_removed_cands} processed candidates') - self.remotehost = remotehost + # Clear down phase1 older than 2x daysback days to save space + num_removed_ph1 = self.purgeProcessedData(os.path.join(self.phase1_dir, 'processed'), days_back=daysback*2, verbose=verbose) + log.info(f'removed {num_removed_ph1} processed phase1') + + # In a previous incarnation, if the solver crashed it could leave some `.pickle_processing files`. 
+ self.cleanupPartialProcessing() self.verbose = verbose @@ -575,40 +459,74 @@ def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode # Load database of processed folders database_path = os.path.join(self.db_dir, JSON_DB_NAME) + + # create an empty processing list + self.processing_list = [] + + # maximum number of candidates or trajectories to load in one go. Should improve performance + self.max_trajs = max_trajs + log.info("") - # move any remotely calculated pickles to their target locations - if os.path.isdir(os.path.join(self.output_dir, 'remoteuploads')): - moveRemoteTrajectories(self.output_dir) - - if mcmode != 2: - log.info("Loading database: {:s}".format(database_path)) - self.db = DatabaseJSON(database_path, verbose=self.verbose) - log.info('Archiving older entries....') - try: - self.archiveOldRecords(older_than=3) - except: - pass - log.info(" ... done!") - # Load the list of stations - station_list = self.loadStations() + if mcmode != MCMODE_PHASE2: - # Find unprocessed meteor files - log.info("") - log.info("Finding unprocessed data...") - self.processing_list = self.findUnprocessedFolders(station_list) - log.info(" ... 
done!") + # no need to load the legacy JSON file if we already have the sqlite databases + if not os.path.isfile(os.path.join(db_dir, 'observations.db')) and \ + not os.path.isfile(os.path.join(db_dir, 'trajectories.db')) and \ + os.path.isfile(database_path): + log.info("Loading old JSON database: {:s}".format(database_path)) + self.old_db = DatabaseJSON(database_path, verbose=self.verbose) + else: + self.old_db = None + + self.observations_db = ObservationsDatabase(db_dir) + if hasattr(self.old_db, 'paired_obs'): + # copy any legacy paired obs data into sqlite + self.observations_db.copyObsJsonRecords(self.old_db.paired_obs, dt_range) + + self.trajectory_db = TrajectoryDatabase(db_dir) + if hasattr(self.old_db, 'failed_trajectories'): + # copy any legacy failed traj data into sqlite, so we avoid recomputing them + self.trajectory_db.copyTrajJsonRecords(self.old_db.failed_trajectories, dt_range, failed=True) + + if self.old_db: + del self.old_db + + if archivemonths != 0: + log.info('Archiving older entries....') + try: + self.archiveOldRecords(older_than=archivemonths) + except: + pass + log.info(" ... done!") + + if mcmode & MCMODE_CANDS: + # Load the list of stations + station_list = self.loadStations() + + # Find unprocessed meteor files + log.info("") + log.info("Finding unprocessed data...") + self.processing_list = self.findUnprocessedFolders(station_list) + log.info(" ... 
done!") + + # in phase 1, initialise and collect data second as we load candidates dynamically + self.initialiseRemoteDataHandling() else: - # retrieve pickles from a remote host, if configured - if self.remotehost is not None: - collectRemoteTrajectories(remotehost, max_trajs, self.phase1_dir) + # in phase 2, initialise and collect data first as we need the phase1 traj on disk already + self.trajectory_db = None + self.observations_db = None + self.initialiseRemoteDataHandling() - # reload the phase1 trajectories - dt_beg, dt_end = self.loadPhase1Trajectories(max_trajs=max_trajs) + dt_beg, dt_end = self.loadPhase1Trajectories() self.processing_list = None self.dt_range=[dt_beg, dt_end] - self.db = None + + self.candidate_db = None + if mcmode == MCMODE_CANDS: + self.candidate_db = CandidateDatabase(db_dir) + ### Define country groups to speed up the proceessing ### @@ -632,41 +550,65 @@ def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode ### ### + def checkRemoteDataMode(self): + remote_cfg = os.path.join(self.db_dir, 'wmpl_remote.cfg') + if os.path.isfile(remote_cfg): + self.RemoteDatahandler = RemoteDataHandler(remote_cfg) + return self.RemoteDatahandler.mode + else: + return 'none' + + + def initialiseRemoteDataHandling(self): + # Initialise remote data handling, if the config file is present + remote_cfg = os.path.join(self.db_dir, 'wmpl_remote.cfg') + if os.path.isfile(remote_cfg): + log.info('remote data management requested, initialising') + self.RemoteDatahandler = RemoteDataHandler(remote_cfg) + if self.RemoteDatahandler.mode == 'child': + self.RemoteDatahandler.clearStopFlag() + status = self.getRemoteData(verbose=True) + else: + status = self.moveUploadedData(verbose=False) + if not status: + log.info('no remote data yet') + else: + self.RemoteDatahandler = None - def purgePhase1ProcessedData(self, dir_path): - """ Purge old phase1 processed data if it is older than 90 days. 
""" - - refdt = time.time() - 90*86400 - result = [] - for path, _, files in os.walk(dir_path): - - for file in files: - - file_path = os.path.join(path, file) - - # Check if the file is older than the reference date - try: - file_dt = os.stat(file_path).st_mtime - except FileNotFoundError: - log.warning(f"File not found: {file_path}") - continue - - if ( - os.path.exists(file_path) and (file_dt < refdt) and os.path.isfile(file_path) - ): - - try: - os.remove(file_path) - result.append(file_path) - - except FileNotFoundError: - log.warning(f"File not found: {file_path}") + def purgeProcessedData(self, dir_path, days_back=14, verbose=False): + """ Purge processed candidate or phase1 data if it is older than a default of 14 days. """ - except Exception as e: - log.error(f"Error removing file {file_path}: {e}") - - return result + refdt = time.time() - days_back*86400 + num_removed = 0 + log.info(f'purging processed data from {dir_path} thats older than {days_back} days') + for file_name in glob.glob(os.path.join(dir_path,'*.pickle')): + try: + file_dt = os.stat(file_name).st_mtime + if file_dt < refdt: + if verbose: + log.info(f'removing {file_name}') + os.remove(file_name) + num_removed += 1 + except FileNotFoundError: + log.warning(f"File disappeared: {file_name}") + continue + except Exception as e: + log.error(f"Error removing file {file_name}: {e}") + + return num_removed + + def cleanupPartialProcessing(self): + log.info('checking for partially-processed phase1 files') + i=0 + for i, file_name in enumerate(glob.glob(os.path.join(self.phase1_dir, '*.pickle_processing'))): + new_name = file_name.replace('_processing','') + if os.path.isfile(new_name): + os.remove(file_name) + else: + os.rename(file_name, new_name) + log.info(f'updated {i} partially-processed files') + return def archiveOldRecords(self, older_than=3): """ @@ -682,43 +624,32 @@ def __init__(self, station, obs_id): archdate = datetime.datetime.now(datetime.timezone.utc) - 
relativedelta(months=older_than) archdate_jd = datetime2JD(archdate) + arch_prefix = archdate.strftime("%Y%m") + + # TODO check if this works + self.observations_db.archiveObsDatabase(self.db_dir, arch_prefix, archdate_jd) + self.trajectory_db.archiveTrajDatabase(self.db_dir, arch_prefix, archdate_jd) + + return + + def closeObservationsDatabase(self): + if self.observations_db: + self.observations_db.closeObsDatabase() + return - arch_db_path = os.path.join(self.db_dir, f'{archdate.strftime("%Y%m")}_{JSON_DB_NAME}') - archdb = DatabaseJSON(arch_db_path, verbose=self.verbose) - log.info(f'Archiving db records to {arch_db_path}...') - - for traj in [t for t in self.db.trajectories if t < archdate_jd]: - if traj < archdate_jd: - archdb.addTrajectory(None, self.db.trajectories[traj], False) - self.db.removeTrajectory(self.db.trajectories[traj], keepFolder=True) - - for traj in [t for t in self.db.failed_trajectories if t < archdate_jd]: - if traj < archdate_jd: - archdb.addTrajectory(None, self.db.failed_trajectories[traj], True) - self.db.removeTrajectory(self.db.failed_trajectories[traj], keepFolder=True) - - for station in self.db.processed_dirs: - arch_processed = [dirname for dirname in self.db.processed_dirs[station] if - datetime.datetime.strptime(dirname[14:22], '%Y%m%d').replace(tzinfo=datetime.timezone.utc) < archdate] - for dirname in arch_processed: - archdb.addProcessedDir(station, dirname) - self.db.processed_dirs[station].remove(dirname) - - for station in self.db.paired_obs: - arch_processed = [obs_id for obs_id in self.db.paired_obs[station] if - datetime.datetime.strptime(obs_id[7:15], '%Y%m%d').replace(tzinfo=datetime.timezone.utc) < archdate] - for obs_id in arch_processed: - archdb.addPairedObservation(DummyMetObs(station, obs_id)) - self.db.paired_obs[station].remove(obs_id) - - archdb.save() - self.db.save() + def closeCandidatesDatabase(self): + if self.candidate_db: + self.candidate_db.closeCandDatabase() + + def closeTrajectoryDatabase(self): 
+ if self.trajectory_db: + self.trajectory_db.closeTrajDatabase() return def loadStations(self): """ Load the station names in the processing folder. """ - station_list = [] + avail_station_list = [] for dir_name in sorted(os.listdir(self.dir_path)): @@ -726,31 +657,23 @@ def loadStations(self): if os.path.isdir(os.path.join(self.dir_path, dir_name)): if re.match("^[A-Z]{2}[A-Z0-9]{4}$", dir_name): log.info("Using station: " + dir_name) - station_list.append(dir_name) + avail_station_list.append(dir_name) else: log.info("Skipping directory: " + dir_name) - return station_list - - + return avail_station_list def findUnprocessedFolders(self, station_list): """ Go through directories and find folders with unprocessed data. """ processing_list = [] - # skipped_dirs = 0 - # Go through all station directories for station_name in station_list: station_path = os.path.join(self.dir_path, station_name) - # Add the station name to the database if it doesn't exist - if station_name not in self.db.processed_dirs: - self.db.processed_dirs[station_name] = [] - # Go through all directories in stations for night_name in os.listdir(station_path): @@ -770,23 +693,10 @@ def findUnprocessedFolders(self, station_list): night_path = os.path.join(station_path, night_name) night_path_rel = os.path.join(station_name, night_name) - # # If the night path is not in the processed list, add it to the processing list - # if night_path_rel not in self.db.processed_dirs[station_name]: - # processing_list.append([station_name, night_path_rel, night_path, night_dt]) - processing_list.append([station_name, night_path_rel, night_path, night_dt]) - # else: - # skipped_dirs += 1 - - - # if skipped_dirs: - # log.info("Skipped {:d} processed directories".format(skipped_dirs)) - return processing_list - - def initMeteorObs(self, station_code, ftpdetectinfo_path, platepars_recalibrated_dict): """ Init meteor observations from the FTPdetectinfo file and recalibrated platepars. 
""" @@ -806,8 +716,6 @@ def initMeteorObs(self, station_code, ftpdetectinfo_path, platepars_recalibrated return meteor_list - - def loadUnpairedObservations(self, processing_list, dt_range=None): """ Load unpaired meteor observations, i.e. observations that are not a part of any trajectory. """ @@ -815,17 +723,20 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): unpaired_met_obs_list = [] prev_station = None station_count = 1 + for station_code, rel_proc_path, proc_path, night_dt in processing_list: # Check that the night datetime is within the given range of times, if the range is given if (dt_range is not None) and (night_dt is not None): dt_beg, dt_end = dt_range - # Skip all folders which are outside the limits - if (night_dt < dt_beg) or (night_dt > dt_end): + # Skip all folders which are outside the limits + # allow a day before dt_beg to capture data overlapping from an earlier timezone + if (night_dt < dt_beg + datetime.timedelta(days=-1)) or (night_dt > dt_end): continue - + log.info("") + log.info(f"Processing station: {station_code} {rel_proc_path}") ftpdetectinfo_name = None platepar_recalibrated_name = None @@ -834,11 +745,15 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): if os.path.isfile(proc_path): continue - log.info("") - log.info("Processing station: " + station_code) + # Find FTPdetectinfo and platepar files and skip if they're not both present + file_list = os.listdir(proc_path) + joined_file_list = ' '.join(file_list) + + if 'FTPdetectinfo' not in joined_file_list or 'platepars_all_recalibrated.json' not in joined_file_list: + continue - # Find FTPdetectinfo and platepar files - for name in os.listdir(proc_path): + # okay, we at least have the required files, lets try loading them + for name in file_list: # Find FTPdetectinfo if name.startswith("FTPdetectinfo") and name.endswith('.txt') and \ @@ -858,25 +773,37 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): except: pass - + # 
Skip these observations if no data files were found inside if (ftpdetectinfo_name is None) or (platepar_recalibrated_name is None): - log.info(" Skipping {:s} due to missing data files...".format(rel_proc_path)) - - # Add the folder to the list of processed folders - self.db.addProcessedDir(station_code, rel_proc_path) + log.info(f" Skipping {rel_proc_path} due to missing data files...") + continue + if len(platepars_recalibrated_dict) == 0: + #log.info(f" Skipping {rel_proc_path} due to no observations...") continue + # More accurate check that the night datetime is within the given range of times, if the range is given + if (dt_range is not None) and (night_dt is not None): + dt_beg, dt_end = dt_range + + # Find the time of the latest detection in this dataset. + # We add on 10s to the time of the latest detection to allow for the duration of an RMS FF block + + latest_obs = datetime.datetime.strptime(list(platepars_recalibrated_dict.keys())[-1][10:25], '%Y%m%d_%H%M%S') + latest_obs = latest_obs.replace(tzinfo=datetime.timezone.utc) + datetime.timedelta(seconds=10) + + # Filter out any folders whose start-date is after the bucket end date, or whose + # latest observation is before the bucket start date. 
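The folder-level bucket test described in the comments above keeps a night only if it can overlap the requested time bucket: the last detection (already padded by the FF block duration) must not precede the bucket start, and the night must not begin after the bucket end. A minimal sketch, with illustrative names not taken from the patch:

```python
import datetime

def obs_overlaps_bucket(night_dt, latest_obs, dt_beg, dt_end):
    """Return True if a night folder can contain observations in the bucket.

    night_dt   -- start datetime of the night folder
    latest_obs -- time of the last detection, already padded by the FF block
    """
    # Reject folders whose last detection precedes the bucket start,
    # or whose start date is after the bucket end.
    return not (latest_obs < dt_beg or night_dt > dt_end)

beg = datetime.datetime(2026, 4, 1, tzinfo=datetime.timezone.utc)
end = datetime.datetime(2026, 4, 2, tzinfo=datetime.timezone.utc)
night = datetime.datetime(2026, 3, 31, 20, tzinfo=datetime.timezone.utc)
latest = datetime.datetime(2026, 4, 1, 4, tzinfo=datetime.timezone.utc)
print(obs_overlaps_bucket(night, latest, beg, end))  # True: the night runs into the bucket
```

This is why the earlier, coarser filter allows an extra day before `dt_beg`: a night that starts before the bucket can still contribute detections inside it.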
+ + if (latest_obs < dt_beg) or (night_dt > dt_end): + # log.info(f'skipping {rel_proc_path} as no relevant observations') + continue + if station_code != prev_station: station_count += 1 prev_station = station_code - # Save database to mark those with missing data files (only every 250th station, to speed things up) - if (station_count % 250 == 0) and (station_code != prev_station): - self.saveDatabase() - - # Load platepars with open(os.path.join(proc_path, platepar_recalibrated_name)) as f: platepars_recalibrated_dict = json.load(f) @@ -889,6 +816,12 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): added_count = 0 for cams_met_obs in cams_met_obs_list: + obs_dt = jd2Date(cams_met_obs.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc) + + if dt_range and (obs_dt < dt_beg or obs_dt > dt_end): + #log.info(f'skipping {cams_met_obs.ff_name} as outside current bucket') + continue + # Get the platepar if cams_met_obs.ff_name in platepars_recalibrated_dict: pp_dict = platepars_recalibrated_dict[cams_met_obs.ff_name] @@ -923,7 +856,7 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): # Init the new meteor observation object met_obs = MeteorObsRMS( station_code, - jd2Date(cams_met_obs.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc), + obs_dt, pp, meteor_data, rel_proc_path, @@ -934,11 +867,9 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): continue # Add only unpaired observations - if not self.db.checkObsIfPaired(met_obs): - + if not self.checkIfObsPaired(met_obs.id, verbose=verbose): # print(" ", station_code, met_obs.reference_dt, rel_proc_path) added_count += 1 - unpaired_met_obs_list.append(met_obs) log.info(" Added {:d} observations!".format(added_count)) @@ -946,10 +877,8 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): log.info("") log.info(" Finished loading unpaired observations!") - self.saveDatabase() return unpaired_met_obs_list - def yearMonthDayDirInDtRange(self, 
dir_name): """ Given a directory name which is either YYYY, YYYYMM or YYYYMMDD, check if it is in the given @@ -971,21 +900,21 @@ def yearMonthDayDirInDtRange(self, dir_name): date_fmt = "%Y" # Check if the directory name starts with a year - if not re.match("^\d{4}", dir_name): # noqa: W605 + if not re.match(r"^\d{4}", dir_name): return False elif len(dir_name) == 6: date_fmt = "%Y%m" # Check if the directory name starts with a year and month - if not re.match("^\d{6}", dir_name): # noqa: W605 + if not re.match(r"^\d{6}", dir_name): return False elif len(dir_name) == 8: date_fmt = "%Y%m%d" # Check if the directory name starts with a year, month and day - if not re.match("^\d{8}", dir_name): # noqa: W605 + if not re.match(r"^\d{8}", dir_name): return False else: @@ -1039,8 +968,7 @@ def yearMonthDayDirInDtRange(self, dir_name): return True else: - return False - + return False def trajectoryFileInDtRange(self, file_name, dt_range=None): """ Check if the trajectory file is in the given datetime range. """ @@ -1069,108 +997,128 @@ def trajectoryFileInDtRange(self, file_name, dt_range=None): else: return False + def updateTrajectoryDatabase(self, dt_range=None): + """ + Update the trajectory database to make sure its in line with whats on disk, + at the same time checking for and removing any duplicate trajectories. - def removeDeletedTrajectories(self): - """ Purge the database of any trajectories that no longer exist on disk. - These can arise because the monte-carlo stage may update the data. 
+ Arguments: + dt_range: [datetime, datetime] range of dates to load data for """ - if not os.path.isdir(self.output_dir): return - if self.db is None: + if self.trajectory_db is None: return - log.info(" Removing deleted trajectories from: " + self.output_dir) + if dt_range is None: + dt_beg, dt_end = self.dt_range + else: + dt_beg, dt_end = dt_range + + log.info("Updating trajectory database...") if self.dt_range is not None: - log.info(" Datetime range: {:s} - {:s}".format( - self.dt_range[0].strftime("%Y-%m-%d %H:%M:%S"), - self.dt_range[1].strftime("%Y-%m-%d %H:%M:%S"))) + log.info(f" Datetime range: {dt_beg.strftime('%Y-%m-%d %H:%M:%S')} - {dt_end.strftime('%Y-%m-%d %H:%M:%S')}") + log.info(" Removing deleted trajectories from: " + self.output_dir) - jdt_start = datetime2JD(self.dt_range[0]) - jdt_end = datetime2JD(self.dt_range[1]) - - trajs_to_remove = [] + jdt_range = [datetime2JD(dt_beg), datetime2JD(dt_end)] + + log.info(" Removing deleted trajectories...") + traj_list = self.trajectory_db.getTrajBasics(self.output_dir, jdt_range) + i = 0 + for traj in traj_list: + if not os.path.isfile(os.path.join(self.output_dir, traj['traj_file_path'])): + if self.verbose: + log.info(f' removing traj {jd2Date(traj["jdt_ref"],dt_obj=True).strftime("%Y%m%d_%H%M%S.%f")} {traj["traj_file_path"]} from database') + self.removeTrajectory(TrajectoryReduced(None, json_dict=traj)) + i += 1 + log.info(f' removed {i} deleted trajectories') + + # + # Now look for duplicate trajectories and ones with shared observations. In theory these should not exist + # but it's possible for them to arise because during distributed calculations, candidates may be found before + # the last solver run has completed. 
+ # + # helper function to find trajectories with shared observations + def atleastOneObs(obs_ids,next_obs_ids): + if obs_ids is None or next_obs_ids is None: + return False + if len(obs_ids)==0 or len(next_obs_ids)==0: + return False + if isinstance(obs_ids[0], int) or isinstance(next_obs_ids[0], int): + return False + return any(i in next_obs_ids for i in obs_ids) - keys = [k for k in self.db.trajectories.keys() if k >= jdt_start and k <= jdt_end] - for trajkey in keys: - traj_reduced = self.db.trajectories[trajkey] - # Update the trajectory path to make sure we're working with the correct filesystem - traj_path = self.generateTrajOutputDirectoryPath(traj_reduced) - traj_file_name = os.path.split(traj_reduced.traj_file_path)[1] - traj_path = os.path.join(traj_path, traj_file_name) + log.info(" Looking for duplicate trajectories...") + # create a dataframe and sort it by date. Duplicates will almost always have very similar dates + traj_df = pd.DataFrame(traj_list) + # remove legacy trajs without obs_ids + if 'obs_ids' in traj_df.columns: + traj_df = traj_df[traj_df.obs_ids != "None"] + if len(traj_df) > 0: - if self.verbose: - log.info(f' testing {traj_path}') + # sort by date + traj_df.sort_values(by='jdt_ref', inplace=True, ignore_index=True) - if not os.path.isfile(traj_path): - traj_reduced.traj_file_path = traj_path - trajs_to_remove.append(traj_reduced) + # add a column containing the next trajectory's observations + traj_df['obs_ids_next'] = traj_df.obs_ids.shift(-1) - for traj in trajs_to_remove: - log.info(f' removing deleted {traj.traj_file_path}') + # get a list of any trajectories with exactly the same observations. + # Then iterate over the list, removing whichever one isn't on disk. 
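The `atleastOneObs` helper above is a guarded membership test: it rejects empty or missing id lists (and legacy integer ids) and otherwise checks whether any observation id is shared. For hashable string ids this is equivalent to a set intersection; a standalone sketch with illustrative ids:

```python
def at_least_one_shared(obs_ids, next_obs_ids):
    # Mirrors the patch's atleastOneObs: None, empty lists and legacy
    # integer ids never count as an overlap.
    if not obs_ids or not next_obs_ids:
        return False
    if isinstance(obs_ids[0], int) or isinstance(next_obs_ids[0], int):
        return False
    return bool(set(obs_ids) & set(next_obs_ids))

print(at_least_one_shared(['US0001_a', 'US0002_b'], ['US0002_b', 'US0003_c']))  # True
print(at_least_one_shared([], ['US0003_c']))  # False
```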
+ + same_obs = traj_df.query('obs_ids == obs_ids_next') + for idx,rw in same_obs.iterrows(): + traj1 = loadPickle(*os.path.split(traj_df.iloc[idx].traj_file_path)) + + if traj1.traj_id == rw.traj_id: + log.info(f'removing duplicate trajectory {traj_df.iloc[idx+1].traj_id}') + self.trajectory_db.removeTrajectoryById(traj_df.iloc[idx+1].traj_id) + else: + log.info(f'removing duplicate trajectory {traj_df.iloc[idx].traj_id}') + self.trajectory_db.removeTrajectoryById(traj_df.iloc[idx].traj_id) + + # get a list of trajectories which share at least one observation. + # These are candidates for being merged, so in auto-mode, we can unpair all the obs and + # delete the traj, then let the candidate finder reanalyse on its next pass. + # in non-Auto mode, where there's only one pass, we'll have to leave both traj and let the user decide - # remove from the database but not from the disk: they're already not on the disk and this avoids - # accidentally deleting a different traj with a timestamp which is within a millisecond - self.db.removeTrajectory(traj, keepFolder=True) + traj_df['overlapstats'] = traj_df.apply(lambda row: atleastOneObs(row.obs_ids, row.obs_ids_next), axis=1) + common_obs = traj_df[traj_df.overlapstats] - return + for idx,rw in common_obs.iterrows(): + log.info(f'unpairing obs linked to mergeable events {traj_df.iloc[idx].traj_id} and {traj_df.iloc[idx+1].traj_id}') - def loadComputedTrajectories(self, traj_dir_path, dt_range=None): - """ Load already estimated trajectories from disk within a date range. + traj1 = traj_df.iloc[idx] + traj2 = traj_df.iloc[idx+1] + combined_obs_ids = list(set(traj1.obs_ids + traj2.obs_ids + traj1.ign_obs_ids + traj2.ign_obs_ids)) + self.observations_db.unpairObs(combined_obs_ids) + if self.auto_mode: + self.trajectory_db.removeTrajectoryById(traj_df.iloc[idx].traj_id) + self.trajectory_db.removeTrajectoryById(traj_df.iloc[idx+1].traj_id) - Arguments: - traj_dir_path: [str] Full path to a directory with trajectory pickles. 
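The duplicate detection above relies on sorting by `jdt_ref` and then comparing each row's observation list with the next row's via `shift(-1)`, since duplicates almost always carry near-identical dates. The trick in isolation, on synthetic data (a boolean mask is used here instead of `query`, which is a simplification of the patch's approach):

```python
import pandas as pd

df = pd.DataFrame({
    'jdt_ref': [2461000.5, 2461000.5000001, 2461001.5],
    # tuples of observation ids; the first two rows are duplicates
    'obs_ids': [('A', 'B'), ('A', 'B'), ('C', 'D')],
})

# sort by date, then line each row up against its successor
df = df.sort_values(by='jdt_ref', ignore_index=True)
df['obs_ids_next'] = df['obs_ids'].shift(-1)

# rows whose observations exactly match the following row's
dupes = df[df['obs_ids'] == df['obs_ids_next']]
print(len(dupes))  # 1: the first pair of rows is an exact duplicate
```

Because only adjacent rows are compared, this catches exact duplicates cheaply but depends on the sort placing them next to each other, which the near-identical `jdt_ref` values guarantee in practice.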
- """ + # Finally, scan the disk for trajectories that need to be added to the database. + # These can arise during distributed processing or phase2 analysis if the jdt_ref changes significantly. + traj_dir_path = os.path.join(self.output_dir, OUTPUT_TRAJ_DIR) # defend against the case where there are no existing trajectories and traj_dir_path doesn't exist - if not os.path.isdir(traj_dir_path): - return - - if self.db is None: - return - if dt_range is None: - dt_beg, dt_end = self.dt_range - else: - dt_beg, dt_end = dt_range - - log.info(" Loading trajectories from: " + traj_dir_path) - if self.dt_range is not None: - log.info(" Datetime range: {:s} - {:s}".format( - dt_beg.strftime("%Y-%m-%d %H:%M:%S"), - dt_end.strftime("%Y-%m-%d %H:%M:%S"))) - + log.info(" Adding found trajectories from: " + traj_dir_path) counter = 0 # Construct a list of all ddirectory paths to visit. The trajectory directories are sorted in # YYYY/YYYYMM/YYYYMMDD, so visit them in that order to check if they are in the datetime range dir_paths = [] - #iterate over the days in the range - jdt_beg = int(np.floor(datetime2JD(dt_beg))) - jdt_end = int(np.ceil(datetime2JD(dt_end))) - - yyyy = 0 - mm = 0 - dd = 0 start_time = datetime.datetime.now() - for jdt in range(jdt_beg, jdt_end + 1): - curr_dt = jd2Date(jdt, dt_obj=True) - if curr_dt.year != yyyy: - yyyy = curr_dt.year - log.info("- year " + str(yyyy)) - - if curr_dt.month != mm: - mm = curr_dt.month - yyyymm = f'{yyyy}{mm:02d}' - log.info(" - month " + str(yyyymm)) + #iterate over the days in the range + dt_diff = max((dt_end - dt_beg).days, 1) + 2 - if curr_dt.day != dd: - dd = curr_dt.day - yyyymmdd = f'{yyyy}{mm:02d}{dd:02d}' - log.info(" - day " + str(yyyymmdd)) + for d in range(dt_diff): + curr_dt = dt_beg + datetime.timedelta(days=d) + yyyy = curr_dt.year + yyyymm = f'{yyyy}{curr_dt.month:02d}' + yyyymmdd = f'{yyyy}{curr_dt.month:02d}{curr_dt.day:02d}' yyyymmdd_dir_path = os.path.join(traj_dir_path, f'{yyyy}', f'{yyyymm}', 
f'{yyyymmdd}') @@ -1187,105 +1135,36 @@ def loadComputedTrajectories(self, traj_dir_path, dt_range=None): if self.trajectoryFileInDtRange(file_name, dt_range=dt_range): - self.db.addTrajectory(os.path.join(full_traj_dir, file_name)) + self.trajectory_db.addTrajectory(TrajectoryReduced(os.path.join(full_traj_dir, file_name)), force_add=False) # Print every 1000th trajectory if counter % 1000 == 0: - log.info(f" Loaded {counter:6d} trajectories, currently on {file_name}") + log.info(f" Loaded {counter:6d} trajectories") counter += 1 dir_paths.append(full_traj_dir) dur = (datetime.datetime.now() - start_time).total_seconds() - log.info(f" Loaded {counter:6d} trajectories in {dur:.0f} seconds") - - + log.info(f" Added {counter:6d} trajectories in {dur:.0f} seconds") def getComputedTrajectories(self, jd_beg, jd_end): """ Returns a list of computed trajectories between the Julian dates. """ - - return [self.db.trajectories[key] for key in self.db.trajectories - if (self.db.trajectories[key].jdt_ref >= jd_beg) - and (self.db.trajectories[key].jdt_ref <= jd_end)] - - - def removeDuplicateTrajectories(self, dt_range): - """ Remove trajectories with duplicate IDs - keeping the one with the most station observations - """ - - log.info('removing duplicate trajectories') - - tr_in_scope = self.getComputedTrajectories(datetime2JD(dt_range[0]), datetime2JD(dt_range[1])) - tr_to_check = [{'jdt_ref':traj.jdt_ref,'traj_id':traj.traj_id, 'traj': traj} for traj in tr_in_scope if hasattr(traj,'traj_id')] - - if len(tr_to_check) == 0: - log.info('no trajectories in range') - return - - tr_df = pd.DataFrame(tr_to_check) - tr_df['dupe']=tr_df.duplicated(subset=['traj_id']) - dupeids = tr_df[tr_df.dupe].sort_values(by=['traj_id']).traj_id - duperows = tr_df[tr_df.traj_id.isin(dupeids)] - - log.info(f'there are {len(duperows.traj_id.unique())} duplicate trajectories') - - # iterate over the duplicates, finding the best and removing the others - for traj_id in duperows.traj_id.unique(): - 
num_stats = 0 - best_traj_dt = None - best_traj_path = None - # find duplicate with largest number of observations - for testdt in duperows[duperows.traj_id==traj_id].jdt_ref.values: - - if len(dh.db.trajectories[testdt].participating_stations) > num_stats: - - best_traj_dt = testdt - num_stats = len(dh.db.trajectories[testdt].participating_stations) - # sometimes the database contains duplicates that differ by microseconds in jdt. These - # will have overwritten each other in the folder so make a note of the location. - best_traj_path = dh.db.trajectories[testdt].traj_file_path - - # now remove all except the best - for testdt in duperows[duperows.traj_id==traj_id].jdt_ref.values: - - traj = dh.db.trajectories[testdt] - if testdt != best_traj_dt: - - # get the current trajectory's location. If its the same as that of the best trajectory - # don't try to delete the solution from disk even if there's a small difference in jdt_ref - keepFolder = False - if traj.traj_file_path == best_traj_path: - keepFolder = True - # Update the trajectory path to make sure we're working with the correct filesystem - traj_path = self.generateTrajOutputDirectoryPath(traj) - traj_file_name = os.path.split(traj.traj_file_path)[1] - traj.traj_file_path = os.path.join(traj_path, traj_file_name) - log.info(f'removing duplicate {traj.traj_id} keep {traj_file_name} {keepFolder}') - - self.db.removeTrajectory(traj, keepFolder=keepFolder) - - else: - if self.verbose: - log.info(f'keeping {traj.traj_id} {traj.traj_file_path}') - - return - + jd_range = [jd_beg, jd_end] + json_dicts = self.trajectory_db.getTrajectories(self.output_dir, jd_range) + trajs = [TrajectoryReduced(None, json_dict=j) for j in json_dicts] + return trajs def getPlatepar(self, met_obs): """ Return the platepar of the meteor observation. """ return met_obs.platepar - - def getUnpairedObservations(self): """ Returns a list of unpaired meteor observations. 
""" return self.unpaired_observations - def countryFilter(self, station_code1, station_code2): """ Only pair observations if they are in proximity to a given country. """ @@ -1300,9 +1179,30 @@ def countryFilter(self, station_code1, station_code2): # If a given country is not in any of the groups, allow it to be paired return True + + def checkIfObsPaired(self, obs_id, verbose=False): + return self.observations_db.checkObsPaired(obs_id, verbose) + + def addPairedObs(self, matched_obs, jdt_ref, verbose=False): + """ + mark a list of observations as paired + + parameters: + matched_obs : a tuple containing the observations. + jdt_ref : the julian date of the Trajectory they are paired with. + + """ + if len(matched_obs[0])==3: + obs_ids = [met_obs.id for _, met_obs, _ in matched_obs] + else: + obs_ids = [met_obs.id for _, met_obs in matched_obs] + jdt_refs = [jdt_ref] * len(obs_ids) + self.observations_db.addPairedObservations(obs_ids, jdt_refs, verbose=verbose) - def findTimePairs(self, met_obs, unpaired_observations, max_toffset): + return + + def findTimePairs(self, met_obs, unpaired_observations, max_toffset, verbose=False): """ Finds pairs in time between the given meteor observations and all other observations from different stations. @@ -1322,6 +1222,9 @@ def findTimePairs(self, met_obs, unpaired_observations, max_toffset): # Go through all meteors from other stations for met_obs2 in unpaired_observations: + if self.checkIfObsPaired(met_obs2.id, verbose=verbose): + continue + # Take only observations from different stations if met_obs.station_code == met_obs2.station_code: continue @@ -1337,7 +1240,6 @@ def findTimePairs(self, met_obs, unpaired_observations, max_toffset): return found_pairs - def getTrajTimePairs(self, traj_reduced, unpaired_observations, max_toffset): """ Find unpaired observations which are close in time to the given trajectory. 
""" @@ -1366,9 +1268,9 @@ def getTrajTimePairs(self, traj_reduced, unpaired_observations, max_toffset): return found_traj_obs_pairs - def generateTrajOutputDirectoryPath(self, traj, make_dirs=False): - """ Generate a path to the trajectory output directory. + """ + Generate a path to the trajectory output directory. Keyword arguments: make_dirs: [bool] Make the tree of output directories. False by default. @@ -1377,11 +1279,11 @@ def generateTrajOutputDirectoryPath(self, traj, make_dirs=False): # Generate a list of station codes if isinstance(traj, TrajectoryReduced): # If the reducted trajectory object is given - station_list = traj.participating_stations + traj_station_list = traj.participating_stations else: # If the full trajectory object is given - station_list = [obs.station_id for obs in traj.observations if obs.ignore_station is False] + traj_station_list = [obs.station_id for obs in traj.observations if obs.ignore_station is False] # Datetime of the reference trajectory time @@ -1399,7 +1301,7 @@ def generateTrajOutputDirectoryPath(self, traj, make_dirs=False): # Name of the trajectory directory # sort the list of country codes otherwise we can end up with duplicate trajectories - ctry_list = list(set([stat_id[:2] for stat_id in station_list])) + ctry_list = list(set([stat_id[:2] for stat_id in traj_station_list])) ctry_list.sort() traj_dir = dt.strftime("%Y%m%d_%H%M%S.%f")[:-3] + "_" + "_".join(ctry_list) @@ -1411,9 +1313,15 @@ def generateTrajOutputDirectoryPath(self, traj, make_dirs=False): return out_path + def saveTrajectoryResults(self, traj, save_plots, save_phase1=False, verbose=False): + """ + Save trajectory results to the disk. - def saveTrajectoryResults(self, traj, save_plots): - """ Save trajectory results to the disk. 
""" + Parameters: + traj: [traj] the trajectory to save + save_plots: [bool] true if we also want to generate plots of the data + save_phase1:[bool] true if we also want to save a phase1 copy of the traj + """ # Generate the name for the output directory (add list of country codes at the end) @@ -1427,7 +1335,7 @@ def saveTrajectoryResults(self, traj, save_plots): # if additional observations are found then the refdt or country list may change quite a bit traj.longname = os.path.split(output_dir)[-1] - if self.mc_mode == 1: + if self.mc_mode & MCMODE_PHASE1: # The MC phase may change the refdt so save a copy of the the original name. traj.pre_mc_longname = traj.longname @@ -1438,17 +1346,13 @@ def saveTrajectoryResults(self, traj, save_plots): savePickle(traj, output_dir, traj.file_name + '_trajectory.pickle') log.info(f'saved {traj.traj_id} to {output_dir}') - if self.mc_mode == 1: - savePickle(traj, self.phase1_dir, traj.pre_mc_longname + '_trajectory.pickle') - elif self.mc_mode == 2: - # we save this in MC mode the MC phase may alter the trajectory details and if later on + if (self.mc_mode & MCMODE_PHASE1 and not self.mc_mode & MCMODE_PHASE2) or save_phase1: + self.saveCandOrTraj(traj, f'{traj.longname}_trajectory.pickle', verbose=verbose) + + elif self.mc_mode & MCMODE_PHASE2: + # the MC phase may alter the trajectory details and if later on # we're including additional observations we need to use the most recent version of the trajectory - savePickle(traj, os.path.join(self.phase1_dir, 'processed'), traj.pre_mc_longname + '_trajectory.pickle') - - if self.remotehost is not None: - log.info('saving to remote host') - uploadTrajToRemote(remotehost, traj.file_name + '_trajectory.pickle', output_dir) - log.info(' ...done') + savePickle(traj, os.path.join(self.phase1_dir, 'processed'), f'{traj.pre_mc_longname}_trajectory.pickle') # Save the plots if save_plots: @@ -1459,35 +1363,16 @@ def saveTrajectoryResults(self, traj, save_plots): pass traj.save_results = 
False - - - def markObservationAsProcessed(self, met_obs): - """ Mark the given meteor observation as processed. """ - - if self.db is None: - return - self.db.addProcessedDir(met_obs.station_code, met_obs.rel_proc_path) - - - - def markObservationAsPaired(self, met_obs): - """ Mark the given meteor observation as paired in a trajectory. """ - - if self.db is None: - return - self.db.addPairedObservation(met_obs) - - - - def addTrajectory(self, traj, failed_jdt_ref=None): - """ Add the resulting trajectory to the database. + def addTrajectory(self, traj, failed_jdt_ref=None, verbose=False): + """ + Add the resulting trajectory to the database. Arguments: traj: [Trajectory object] failed_jdt_ref: [float] Reference Julian date of the failed trajectory. None by default. """ - if self.db is None: + if self.trajectory_db is None: return # Set the correct output path traj.output_dir = self.generateTrajOutputDirectoryPath(traj) @@ -1500,15 +1385,15 @@ def addTrajectory(self, traj, failed_jdt_ref=None): if failed_jdt_ref is not None: traj_reduced.jdt_ref = failed_jdt_ref - self.db.addTrajectory(None, traj_obj=traj_reduced, failed=(failed_jdt_ref is not None)) + self.trajectory_db.addTrajectory(traj_reduced, failed=(failed_jdt_ref is not None), verbose=verbose) - - - def removeTrajectory(self, traj_reduced): - """ Remove the trajectory from the data base and disk. """ + def removeTrajectory(self, traj_reduced, remove_phase1=False): + """ + Remove the trajectory from the data base and disk. 
+ """ # in mcmode 2 the database isn't loaded but we still need to delete updated trajectories - if self.mc_mode == 2: + if self.mc_mode & MCMODE_PHASE2: if os.path.isfile(traj_reduced.traj_file_path): traj_dir = os.path.dirname(traj_reduced.traj_file_path) shutil.rmtree(traj_dir, ignore_errors=True) @@ -1518,53 +1403,53 @@ def removeTrajectory(self, traj_reduced): traj_dir = os.path.join(base_dir, traj_reduced.pre_mc_longname) if os.path.isdir(traj_dir): shutil.rmtree(traj_dir, ignore_errors=True) - else: - log.warning(f'unable to find {traj_dir}') - else: - log.warning(f'unable to find {traj_reduced.traj_file_path}') + return - # remove the processed pickle now we're done with it - self.cleanupPhase2TempPickle(traj_reduced, True) + if (self.mc_mode & MCMODE_PHASE1 or self.mc_mode & MCMODE_CANDS) and remove_phase1: + # remove any solution from the phase1 folder + phase1_traj = os.path.join(self.phase1_dir, traj_reduced.pre_mc_longname + '_trajectory.pickle') + if os.path.isfile(phase1_traj): + try: + os.remove(phase1_traj) + log.info(f'removed {phase1_traj}') + except Exception: + pass - return - self.db.removeTrajectory(traj_reduced) + # Remove the trajectory folder from the disk + if os.path.isfile(traj_reduced.traj_file_path): + traj_dir = os.path.dirname(traj_reduced.traj_file_path) + shutil.rmtree(traj_dir, ignore_errors=True) + if os.path.isfile(traj_reduced.traj_file_path): + log.warning(f'unable to remove {traj_dir}') + self.trajectory_db.removeTrajectory(traj_reduced) - def cleanupPhase2TempPickle(self, traj, success=False): - """ - At the start of phase 2 monte-carlo sim calculation, the phase1 pickles are renamed to indicate they're being processed. - Once each one is processed (fail or succeed) we need to clean up the file. If the MC step failed, we still want to keep - the pickle, because we might later on get new data and it might become solvable. Otherwise, we can just delete the file - since the MC solver will have saved an updated one already. 
+ def checkCandIfFailed(self, candidate): + """ + Check if the given candidate has been processed with the same observations and has failed to be + computed before. """ - if self.mc_mode != 2: - return - fldr_name = os.path.split(self.generateTrajOutputDirectoryPath(traj, make_dirs=False))[-1] - pick = os.path.join(self.phase1_dir, fldr_name + '_trajectory.pickle_processing') - if os.path.isfile(pick): - os.remove(pick) - else: - log.warning(f'unable to find _processing file {pick}') - if not success: - # save the pickle in case we get new data later and can solve it - savePickle(traj, os.path.join(self.phase1_dir, 'processed'), fldr_name + '_trajectory.pickle') - return - + jdt_ref = min([obs.jdt_ref for obs, _, _ in candidate]) + stations = [obs.station_id for obs, _, _ in candidate] + return self.trajectory_db.checkCandIfFailed(jdt_ref, stations) def checkTrajIfFailed(self, traj): - """ Check if the given trajectory has been computed with the same observations and has failed to be - computed before. - - """ + """ + Check if the given trajectory has been computed with the same observations and has failed to be + computed before. - if self.db is None: - return - return self.db.checkTrajIfFailed(traj) + Parameters: + traj: full trajectory object + """ + if self.trajectory_db is None: + return + traj_reduced = TrajectoryReduced(None, traj_obj=traj) + return self.trajectory_db.checkTrajIfFailed(traj_reduced) def loadFullTraj(self, traj_reduced): - """ Load the full trajectory object. + """ Load the full trajectory object corresponding to a traj_reduced object. Arguments: traj_reduced: [TrajectoryReduced object] @@ -1604,15 +1489,11 @@ def loadFullTraj(self, traj_reduced): return None - def loadPhase1Trajectories(self, max_trajs=1000): + def loadPhase1Trajectories(self): """ Load trajectories calculated by the intersecting-planes phase 1. 
These trajectories do not include uncertainties which are calculated in the Monte-Carlo phase 2 - keyword arguments: - maxtrajs: [int] maximum number of trajectories to load in each pass, to avoid taking too long per pass. - - returns: dt_beg, dt_end: [datetime] The earliest and latest date/time of the loaded trajectories. Used later to set the number of time buckets to process data in. @@ -1620,7 +1501,7 @@ def loadPhase1Trajectories(self, max_trajs=1000): """ pickles = glob.glob1(self.phase1_dir, "*_trajectory.pickle") pickles.sort() - pickles = pickles[:max_trajs] + pickles = pickles[:self.max_trajs] self.phase1Trajectories = [] if len(pickles) == 0: return None, None @@ -1645,12 +1526,12 @@ def loadPhase1Trajectories(self, max_trajs=1000): if not hasattr(traj, 'pre_mc_longname'): traj.pre_mc_longname = os.path.split(traj_dir)[-1] - # Check if the traj object as fixed time offsets + # Check if the traj object has fixed time offsets if not hasattr(traj, 'fixed_time_offsets'): traj.fixed_time_offsets = {} - # now we've loaded the phase 1 solution, move it to prevent accidental reprocessing - procfile = os.path.join(self.phase1_dir, pick + '_processing') + # now we've loaded the phase 1 solution, move it to prevent reprocessing + procfile = os.path.join(self.phase1_dir, 'processed', pick) if os.path.isfile(procfile): os.remove(procfile) os.rename(os.path.join(self.phase1_dir, pick), procfile) @@ -1662,29 +1543,318 @@ def loadPhase1Trajectories(self, max_trajs=1000): log.info(f'File {pick} skipped for now') return dt_beg, dt_end + def loadCandidates(self, verbose=False): + """ + Load candidates from the 'candidates' folder and then move the file to the 'candidates/processed' folder + Used only in phase1 solving mode + """ + candidate_trajectories = [] + save_path = self.candidate_dir + procpath = os.path.join(save_path, 'processed') + os.makedirs(procpath, exist_ok=True) + + for fil in os.listdir(save_path)[:self.max_trajs]: + if '.pickle' not in fil: + continue + 
try: + loadedpickle = loadPickle(save_path, fil) + candidate_trajectories.append(loadedpickle) + + # now move the loaded file so we don't try to reprocess it + full_name = os.path.join(save_path, fil) + procfile = os.path.join(procpath, fil) + shutil.copy(full_name, procfile) + os.remove(full_name) + + except Exception: + log.info(f'Candidate {fil} went away, probably picked up by another process') + log.info("-----------------------") + log.info('LOADED {} CANDIDATES'.format(len(candidate_trajectories))) + log.info("-----------------------") + + return candidate_trajectories + + def moveUploadedData(self, verbose=False): + """ + Used in 'master' mode: this moves uploaded data to the target locations on the server + and merges in any uploaded sqlite databases - def saveDatabase(self): - """ Save the data base. """ + """ + log.info('merging in any remotely processed data') + for node in self.RemoteDatahandler.nodes: + if node.nodename == 'localhost' or self.observations_db is None or self.trajectory_db is None: + continue - def _breakHandler(signum, frame): - """ Do nothing if CTRL + C is pressed. 
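`loadCandidates` above claims each candidate pickle by moving it into `processed/` and treats any failure as another worker having grabbed the file first. That claim-by-move pattern, which also underlies the phase1 rename, can be sketched as follows (paths and names are illustrative):

```python
import os
import shutil
import tempfile

def claim_file(src_dir, name, processed_dir):
    """Move a file into processed_dir; return True if this process won the race.

    When a competing worker has already moved the file away, the move
    raises FileNotFoundError, which we treat as 'already claimed'.
    """
    os.makedirs(processed_dir, exist_ok=True)
    try:
        shutil.move(os.path.join(src_dir, name), os.path.join(processed_dir, name))
        return True
    except FileNotFoundError:
        return False

tmp = tempfile.mkdtemp()
open(os.path.join(tmp, 'cand.pickle'), 'w').close()
proc = os.path.join(tmp, 'processed')
print(claim_file(tmp, 'cand.pickle', proc))  # True: first claim succeeds
print(claim_file(tmp, 'cand.pickle', proc))  # False: file already moved away
```

The move must happen before processing starts, so a crash mid-solve leaves the file in `processed/` rather than being picked up again; the patch's copy-then-remove achieves the same effect.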
""" - log.info("The data base is being saved, the program cannot be exited right now!") - pass + # if the remote node upload path doesn't exist skip it + if not os.path.isdir(os.path.join(node.dirpath,'files')): + continue - if self.db is None: - return - # Prevent quitting while a data base is being saved - original_signal = signal.getsignal(signal.SIGINT) - signal.signal(signal.SIGINT, _breakHandler) + # merge the databases + for obsdb_path in glob.glob(os.path.join(node.dirpath,'files','observations*.db')): + if self.observations_db.mergeObsDatabase(obsdb_path): + os.remove(obsdb_path) + try: + os.remove(f'{obsdb_path}-wal') + os.remove(f'{obsdb_path}-shm') + except Exception: + log.warning(f'unable to fully merge the remote obs database {obsdb_path}') + pass + + + for trajdb_path in glob.glob(os.path.join(node.dirpath,'files','trajectories*.db')): + if self.trajectory_db.mergeTrajDatabase(trajdb_path): + os.remove(trajdb_path) + else: + log.warning(f'unable to fully merge the remote traj database {trajdb_path}') + + i = 0 + remote_trajdir = os.path.join(node.dirpath, 'files', 'trajectories') + if os.path.isdir(remote_trajdir): + for i,traj in enumerate(os.listdir(remote_trajdir)): + if os.path.isdir(os.path.join(remote_trajdir, traj)): + targ_path = os.path.join(self.output_dir, 'trajectories', traj[:4], traj[:6], traj[:8], traj) + src_path = os.path.join(node.dirpath,'files', 'trajectories', traj) + for src_name in os.listdir(src_path): + src_name = os.path.join(src_path, src_name) + if not os.path.isfile(src_name): + log.warning(f'{src_name} missing') + else: + os.makedirs(targ_path, exist_ok=True) + shutil.copy(src_name, targ_path) + shutil.rmtree(src_path,ignore_errors=True) + if i > 0: + log.info(f'moved {i+1} trajectories') + + # if the node was in mode 1 then move any uploaded phase1 solutions + remote_ph1dir = os.path.join(node.dirpath, 'files', 'phase1') + if os.path.isdir(remote_ph1dir) and node.mode==1: + os.makedirs(self.phase1_dir, exist_ok=True) + 
i = 0 + for i, fil in enumerate([x for x in os.listdir(remote_ph1dir) if '.pickle' in x]): + full_name = os.path.join(remote_ph1dir, fil) + shutil.copy(full_name, self.phase1_dir) + os.remove(full_name) + + if i > 0: + log.info(f'moved {i+1} phase 1 solutions from {node.nodename}') + + # if the node was in mode 1 then move any uploaded processed candidates + remote_canddir = os.path.join(node.dirpath, 'files', 'candidates', 'processed') + if os.path.isdir(remote_canddir) and node.mode==1: + i = 0 + targ_dir = os.path.join(self.candidate_dir, 'processed') + for i, fil in enumerate([x for x in os.listdir(remote_canddir) if '.pickle' in x]): + full_name = os.path.join(remote_canddir, fil) + shutil.copy(full_name, targ_dir) + os.remove(full_name) + + if i > 0: + log.info(f'moved {i+1} processed candidates from {node.nodename}') + + return True + + def checkAndRedistribCands(self, wait_time=6, verbose=False): + """ + Check child nodes and + 1) if the stop flag has appeared, move any pending data to prevent it getting stuck + 2) move data if it has been waiting more than wait_time hours, default six + 3) if the node is idle, assign it extra data + + Parameters: + wait_time : time in hours to wait before data is considered stale + + """ + for node in self.RemoteDatahandler.nodes: + if node.nodename == 'localhost' or self.observations_db is None or self.trajectory_db is None: + continue + # if the remote node upload path doesn't exist skip it + if not os.path.isdir(os.path.join(node.dirpath,'files')): + continue - # Save the data base - log.info("Saving data base to disk...") - self.db.save() + # if the stop file has appeared, then move any pending candidates or phase1 files + + if os.path.isfile(os.path.join(node.dirpath, 'files','stop')): + files_to_move = glob.glob(os.path.join(node.dirpath, 'files', 'candidates', '*.pickle')) + if len(files_to_move) > 0: + log.info(f'{node.nodename} stopfile has appeared, moving candidates') + for full_name in files_to_move: + 
shutil.copy(full_name, self.candidate_dir) + os.remove(full_name) + files_to_move = glob.glob(os.path.join(node.dirpath, 'files', 'phase1', '*.pickle')) + if len(files_to_move) > 0: + log.info(f'{node.nodename} stopfile has appeared, moving phase1 files') + for full_name in files_to_move: + shutil.copy(full_name, self.phase1_dir) + os.remove(full_name) + else: + # if the stop file isn't present and the nodes are idle, give them something to do + + targ_dir = os.path.join(node.dirpath, 'files', 'candidates') + if len(glob.glob(os.path.join(targ_dir, '*.pickle'))) == 0 and node.mode == MCMODE_PHASE1: + # the node is waiting for data + log.info(f'{node.nodename} idle, giving it extra candidates') + i = 0 + for i, full_name in enumerate(glob.glob(os.path.join(self.candidate_dir, '*.pickle'))): + shutil.copy(full_name, targ_dir) + os.remove(full_name) + i +=1 + if i == node.capacity: + break + + targ_dir = os.path.join(node.dirpath, 'files', 'phase1') + if len(glob.glob(os.path.join(targ_dir, '*.pickle'))) == 0 and node.mode == MCMODE_PHASE2: + # the node is waiting for data + log.info(f'{node.nodename} idle, giving it extra phase1 data') + i = 0 + for i, full_name in enumerate(glob.glob(os.path.join(self.phase1_dir, '*.pickle'))): + shutil.copy(full_name, targ_dir) + os.remove(full_name) + i +=1 + if i == node.capacity: + break + + # if the files have been in the node folder for more than wait_time hours, move them + # + refdt = time.time() - wait_time*3600 + log.info(f'moving any stale data assigned to {node.nodename}') + for full_name in glob.glob(os.path.join(node.dirpath, 'files', 'candidates', '*.pickle')): + if os.stat(full_name).st_mtime < refdt: + shutil.copy(full_name, self.candidate_dir) + os.remove(full_name) + for full_name in glob.glob(os.path.join(node.dirpath, 'files', 'phase1', '*.pickle')): + if os.stat(full_name).st_mtime < refdt: + shutil.copy(full_name, self.phase1_dir) + os.remove(full_name) - # Restore the signal functionality - 
signal.signal(signal.SIGINT, original_signal) + return + def getRemoteData(self, verbose=False): + """ + Used in 'child' mode: Wrapper around the remote data handling function to + download data from the master for local processing. + """ + if not self.RemoteDatahandler: + log.info('remote data handler not initialised') + return False + # collect candidates or phase1 solutions from the master node + if self.mc_mode == MCMODE_PHASE1 or self.mc_mode == MCMODE_BOTH: + status = self.RemoteDatahandler.collectRemoteData('candidates', self.output_dir, verbose=verbose) + elif self.mc_mode == MCMODE_PHASE2: + status = self.RemoteDatahandler.collectRemoteData('phase1', self.output_dir, verbose=verbose) + else: + status = False + return status + + def getCandidateId(self, matched_observations, verbose=False): + """ + Given a set of observations, create a candidate ID + + Parameters: + matched_observations: list of observations + + Returns: [string] candidate id + """ + + ref_dt = jd2Date(min([obs.jdt_ref for obs, _, _ in matched_observations]), dt_obj=True, tzinfo=datetime.timezone.utc) + ctry_list = list(set([met_obs.station_code[:2] for _, met_obs, _ in matched_observations])) + ctry_list.sort() + ctries = '_'.join(ctry_list) + cand_id = f'{ref_dt.timestamp():.6f}_{ctries}' + return cand_id + + + def saveCandidates(self, candidate_trajectories, verbose=False): + """ + Save candidates to file by constructing a name, checking if we already processed it and then + calling saveCandOrTraj if needed. The function checkAndAddCand adds to candidates.db so that we can + avoid reprocessing the same candidate on a future pass.
+ + Parameters: + candidate_trajectories : list of candidates + + """ + num_saved = 0 + for matched_observations in candidate_trajectories: + cand_id = self.getCandidateId(matched_observations) + ref_dt = jd2Date(min([obs.jdt_ref for obs, _, _ in matched_observations]), dt_obj=True, tzinfo=datetime.timezone.utc) + obs_ids = [met_obs.id for _, met_obs, _ in matched_observations] + + if self.candidate_db.checkAndAddCand(cand_id, ref_dt.timestamp(), obs_ids): + picklename = f'{cand_id}.pickle' + + if verbose: + log.info(f'Candidate {picklename} contains {len(matched_observations)} observations') + + if self.saveCandOrTraj(matched_observations, picklename, 'candidates', verbose=True): + num_saved += 1 + log.info(f'skipped {len(candidate_trajectories)-num_saved} as marked already-processed') + + log.info("-----------------------") + log.info(f'Saved {num_saved} candidates') + log.info("-----------------------") + + def saveCandOrTraj(self, traj, file_name, savetype='phase1', verbose=False): + """ + Save the candidates (if in candidate-finding mode) or phase 1 trajectories. + If remote data processing is enabled, this function distributes candidates amongst + any nodes that are in the relevant mode. + + Parameters: + traj : The trajectory or candidate to save + file_name : The filename to use + savetype : The type of object we're saving, 'phase1' or 'candidates'. + + """ + if savetype == 'phase1': + save_dir = self.phase1_dir + required_mode = MCMODE_PHASE2 + else: + save_dir = self.candidate_dir + required_mode = MCMODE_PHASE1 + + if self.RemoteDatahandler and self.RemoteDatahandler.mode == 'master': + + # Select a random bucket, check it's not already full, and then save the pickle there. + # Make sure to break out once all buckets have been tested + # Fallback/default is to use the local dir.
+ tested_buckets = [] + bucket_list = self.RemoteDatahandler.nodes + bucket_list[-1].dirpath = save_dir + + # keep drawing random buckets until one is usable or every bucket has been tested + while len(tested_buckets) < len(bucket_list): + bucket_num = secrets.randbelow(len(bucket_list)) + if bucket_num in tested_buckets: + continue + bucket = bucket_list[bucket_num] + + # if the child isn't the right mode, or the stop-flag exists, skip it + stop_sts = os.path.isfile(os.path.join(bucket.dirpath, 'files', 'stop')) + + if (bucket.mode != required_mode and bucket.mode != -1) or stop_sts: + tested_buckets.append(bucket_num) + continue + + # set a temporary save-dir name so we can check capacity + if bucket.nodename != 'localhost': + tmp_save_dir = os.path.join(bucket.dirpath, 'files', savetype) + else: + tmp_save_dir = save_dir + + os.makedirs(tmp_save_dir, exist_ok=True) + + current_workload = len(glob.glob(os.path.join(tmp_save_dir, '*.pickle'))) + if bucket.capacity < 0 or current_workload < bucket.capacity: + + # set the save dir if the bucket is usable + save_dir = tmp_save_dir + break + + tested_buckets.append(bucket_num) + + if verbose: + log.info(f'saving {file_name} to {save_dir}') + savePickle(traj, save_dir, file_name) + return True @@ -1717,7 +1887,7 @@ def _breakHandler(signum, frame): arg_parser.add_argument('dir_path', type=str, help='Path to the root data directory. Trajectory helper files will be stored here as well.') arg_parser.add_argument('-t', '--maxtoffset', metavar='MAX_TOFFSET', - help='Maximum time offset between the stations. Default is 5 seconds.', type=float, default=10.0) + help='Maximum time offset between the stations. Default is 10 seconds.', type=float, default=10.0) arg_parser.add_argument('-s', '--maxstationdist', metavar='MAX_STATION_DIST', help='Maximum distance (km) between stations of paired meteors. 
Default is 600 km.', type=float, @@ -1776,7 +1946,10 @@ def _breakHandler(signum, frame): help="Use best N stations in the solution (default is use 15 stations).") arg_parser.add_argument('--mcmode', '--mcmode', type=int, default=0, - help="Run just simple soln (1), just monte-carlos (2) or both (0, default).") + help="Operation mode - see readme. For standalone solving either don't set this or set it to 0") + + arg_parser.add_argument('--archivemonths', '--archivemonths', type=int, default=3, + help="Months back to archive old data. Default 3. Zero means don't archive (useful in testing).") arg_parser.add_argument('--maxtrajs', '--maxtrajs', type=int, default=None, help="Max number of trajectories to reload in each pass when doing the Monte-Carlo phase") @@ -1784,17 +1957,57 @@ def _breakHandler(signum, frame): arg_parser.add_argument('--autofreq', '--autofreq', type=int, default=360, help="Minutes to wait between runs in auto-mode") - arg_parser.add_argument('--remotehost', '--remotehost', type=str, default=None, - help="Remote host to collect and return MC phase solutions to. 
Supports internet-distributed processing.") - arg_parser.add_argument('--verbose', '--verbose', help='Verbose logging.', default=False, action="store_true") + arg_parser.add_argument('--addlogsuffix', '--addlogsuffix', help='add a suffix to the log to show what stage it is.', default=False, action="store_true") + # Parse the command line arguments cml_args = arg_parser.parse_args() ############################ - + db_dir = cml_args.dbdir + if db_dir is None: + db_dir = cml_args.dir_path + os.makedirs(db_dir, exist_ok=True) + + # mcmode values + # mcmode = 1 -> load candidates and do simple solutions + # mcmode = 2 -> load simple solns and do MC solutions + # mcmode = 4 -> find candidates only + # mcmode = 7 -> do everything + # mcmode = 0 -> same as mode 7 + # bitwise combinations are permissible so: + # 4+1 will find candidates and then run simple solutions to populate "phase1" + # 1+2 will load candidates from "candidates" and solve them completely + + mcmode = MCMODE_ALL if cml_args.mcmode == 0 else cml_args.mcmode + + + mcmodestr = getMcModeStr(mcmode, 1) + pid_file = None + if mcmodestr: + pid_file = os.path.join(db_dir, f'.{mcmodestr}.pid') + with open(pid_file, 'w') as f: + f.write(f'{os.getpid()}') + + # signal handler created inline here as it needs access to db_dir + def signal_handler(sig, frame): + signal.signal(sig, signal.SIG_IGN) # ignore additional signals + log.info('======================================') + log.info('CTRL-C pressed, exiting gracefully....') + log.info('======================================') + remote_cfg = os.path.join(db_dir, 'wmpl_remote.cfg') + if os.path.isfile(remote_cfg): + rdh = RemoteDataHandler(remote_cfg) + if rdh and rdh.mode == 'child': + rdh.setStopFlag() + if pid_file and os.path.isfile(pid_file): + os.remove(pid_file) + log.info('DONE') + log.info('======================================') + sys.exit(0) + + signal.signal(signal.SIGINT, signal_handler) ### Init logging - roll over every day ### @@ -1806,8 +2019,7 @@ def _breakHandler(signum, 
frame): log_dir = cml_args.dir_path # Create a log dir if it doesn't exist - if not os.path.isdir(log_dir): - os.makedirs(log_dir) + os.makedirs(log_dir, exist_ok=True) # Init the logger #log = logging.getLogger("traj_correlator") @@ -1821,6 +2033,11 @@ def _breakHandler(signum, frame): # Init the file handler timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S") log_file = os.path.join(log_dir, f"correlate_rms_{timestamp}.log") + if cml_args.addlogsuffix: + modestr = getMcModeStr(mcmode, 1) + if modestr: + log_file = os.path.join(log_dir, f"correlate_rms_{timestamp}_{modestr}.log") + file_handler = logging.handlers.TimedRotatingFileHandler(log_file, when="midnight", backupCount=7) file_handler.setFormatter(log_formatter) log.addHandler(file_handler) @@ -1869,21 +2086,14 @@ def _breakHandler(signum, frame): if cml_args.maxerr is not None: trajectory_constraints.max_arcsec_err = cml_args.maxerr - remotehost = cml_args.remotehost - if cml_args.mcmode !=2 and remotehost is not None: - log.info('remotehost only applicable in mcmode 2') - remotehost = None - + # set the maximum number of trajectories to reprocess when doing the MC uncertainties - # set a default of 10 for remote processing and 1000 for local processing - if cml_args.remotehost is not None: - max_trajs = 10 - else: - max_trajs = 1000 + # default is 1000 unless overridden with --maxtrajs + max_trajs = 1000 if cml_args.maxtrajs is not None: max_trajs = int(cml_args.maxtrajs) - if cml_args.mcmode == 2: + if mcmode == MCMODE_PHASE2: log.info(f'Reloading at most {max_trajs} phase1 trajectories.') # Set the number of CPU cores @@ -1893,8 +2103,22 @@ def _breakHandler(signum, frame): trajectory_constraints.mc_cores = cpu_cores log.info("Running using {:d} CPU cores.".format(cpu_cores)) + if mcmode == MCMODE_CANDS: + log.info('Saving Candidates only') + elif mcmode == MCMODE_PHASE1: + log.info('Loading Candidates if needed') + elif mcmode == MCMODE_ALL: + log.info('Full processing mode') + + if cml_args.verbose: + log.info('verbose flag set') + 
verbose = True + else: + verbose = False + # Run processing. If the auto run mode is not on, the loop will break after one run previous_start_time = None + while True: # Clock for measuring script time @@ -1947,12 +2171,12 @@ def _breakHandler(signum, frame): # Init the data handle dh = RMSDataHandle( - cml_args.dir_path, dt_range=event_time_range, - db_dir=cml_args.dbdir, output_dir=cml_args.outdir, - mcmode=cml_args.mcmode, max_trajs=max_trajs, remotehost=remotehost, verbose=cml_args.verbose) + cml_args.dir_path, dt_range=event_time_range, db_dir=cml_args.dbdir, output_dir=cml_args.outdir, + mcmode=mcmode, max_trajs=max_trajs, verbose=verbose, archivemonths=cml_args.archivemonths, auto=cml_args.auto, + max_toffset=cml_args.maxtoffset) - # If there is nothing to process, stop, unless we're in mcmode 2 (processing_list is not used in this case) - if not dh.processing_list and cml_args.mcmode < 2: + # If there is nothing to process and we're in Candidate mode, stop + if not dh.processing_list and (mcmode & MCMODE_CANDS): log.info("") log.info("Nothing to process!") log.info("Probably everything is already processed.") @@ -1962,7 +2186,7 @@ def _breakHandler(signum, frame): ### GENERATE DAILY TIME BINS ### - if cml_args.mcmode != 2: + if mcmode != MCMODE_PHASE2: # Find the range of datetimes of all folders (take only those after the year 2000) proc_dir_dts = [entry[3] for entry in dh.processing_list if entry[3] is not None] proc_dir_dts = [dt for dt in proc_dir_dts if dt > datetime.datetime(2000, 1, 1, 0, 0, 0, @@ -1980,14 +2204,19 @@ def _breakHandler(signum, frame): if proc_dir_dts == []: proc_dir_dts=[dt_beg - datetime.timedelta(days=1), dt_end + datetime.timedelta(days=1)] - # Determine the limits of data + # Determine the limits of data - add one day to proc_dir_dt_end because each proc dir holds + # up to one day's worth of detections proc_dir_dt_beg = min(proc_dir_dts) - proc_dir_dt_end = max(proc_dir_dts) + proc_dir_dt_end = max(proc_dir_dts) + 
datetime.timedelta(days=1) + + # in candidate-only mode, we want to write data out frequently so that the solvers can get to work. + # Hence set the bin-size to 6 hours. + bin_length = 0.25 if mcmode == MCMODE_CANDS else 1.0 # Split the processing into daily chunks dt_bins = generateDatetimeBins( proc_dir_dt_beg, proc_dir_dt_end, - bin_days=1, tzinfo=datetime.timezone.utc, reverse=False) + bin_days=bin_length, tzinfo=datetime.timezone.utc, reverse=False) # check if we've created an extra bucket (might happen if requested timeperiod is less than 24h) if event_time_range is not None: @@ -1998,12 +2227,13 @@ def _breakHandler(signum, frame): dt_bins = [(dh.dt_range[0], dh.dt_range[1])] if dh.dt_range is not None: - # there's some data to process - log.info("") - log.info("ALL TIME BINS:") - log.info("----------") - for bin_beg, bin_end in dt_bins: - log.info("{:s}, {:s}".format(str(bin_beg), str(bin_end))) + # there's some data to process and we're in candidate mode + if mcmode & MCMODE_CANDS: + log.info("") + log.info("ALL TIME BINS:") + log.info("----------") + for bin_beg, bin_end in dt_bins: + log.info("{:s}, {:s}".format(str(bin_beg), str(bin_end))) ### ### @@ -2012,27 +2242,62 @@ def _breakHandler(signum, frame): # Go through all chunks in time for bin_beg, bin_end in dt_bins: - log.info("") - log.info("PROCESSING TIME BIN:") - log.info("{:s}, {:s}".format(str(bin_beg), str(bin_end))) - log.info("-----------------------------") - log.info("") + if mcmode & MCMODE_CANDS: + log.info("") + log.info("PROCESSING TIME BIN:") + log.info("{:s}, {:s}".format(str(bin_beg), str(bin_end))) + log.info("-----------------------------") + log.info("") - # Load data of unprocessed observations - if cml_args.mcmode != 2: dh.unpaired_observations = dh.loadUnpairedObservations(dh.processing_list, dt_range=(bin_beg, bin_end)) + log.info(f'loaded {len(dh.unpaired_observations)} observations') + + if mcmode != MCMODE_PHASE2: + + # Update the trajectory database, removing any that no 
longer exist on disk, + # adding any that exist on disk but are missing in the database, and + # removing any duplicates from both disk and database + + dh.updateTrajectoryDatabase(dt_range=(bin_beg, bin_end)) - # refresh list of calculated trajectories from disk - dh.removeDeletedTrajectories() - dh.loadComputedTrajectories(os.path.join(dh.output_dir, OUTPUT_TRAJ_DIR), dt_range=[bin_beg, bin_end]) - if cml_args.mcmode != 2: - dh.removeDuplicateTrajectories(dt_range=[bin_beg, bin_end]) # Run the trajectory correlator tc = TrajectoryCorrelator(dh, trajectory_constraints, cml_args.velpart, data_in_j2000=True, enableOSM=cml_args.enableOSM) bin_time_range = [bin_beg, bin_end] - tc.run(event_time_range=event_time_range, mcmode=cml_args.mcmode, bin_time_range=bin_time_range) + num_done = tc.run(event_time_range=event_time_range, mcmode=mcmode, bin_time_range=bin_time_range, verbose=verbose) + + if dh.RemoteDatahandler and dh.RemoteDatahandler.mode == 'child' and num_done > 0: + log.info('uploading to master node') + # close the databases and upload the data to the master node + if mcmode != MCMODE_PHASE2: + dh.closeTrajectoryDatabase() + dh.closeObservationsDatabase() + + + if dh.RemoteDatahandler.uploadToMaster(dh.output_dir, verbose=verbose): + + # if we successfully uploaded data, truncate the tables here so they are clean for the next run + # otherwise do not truncate them, so we push them next time instead + if mcmode != MCMODE_PHASE2: + dh.trajectory_db = TrajectoryDatabase(dh.db_dir, purge_records=True) + dh.observations_db = ObservationsDatabase(dh.db_dir, purge_records=True) + + if dh.RemoteDatahandler and dh.RemoteDatahandler.mode == 'master': + # move any uploaded data and then check and rebalance any pending cands or phase1s + dh.moveUploadedData(verbose=verbose) + dh.checkAndRedistribCands(wait_time=6, verbose=verbose) + + # If we're in any of these modes, the correlator will have scooped up available data + # from candidates or phase1 folders so no need to 
keep looping. + if mcmode == MCMODE_PHASE1 or mcmode == MCMODE_PHASE2 or mcmode == MCMODE_BOTH: + break + + if mcmode & MCMODE_CANDS: + dh.closeObservationsDatabase() + dh.closeCandidatesDatabase() + dh.closeTrajectoryDatabase() + else: # there were no datasets to process log.info('no data to process yet') @@ -2042,16 +2307,30 @@ def _breakHandler(signum, frame): # Store the previous start time previous_start_time = copy.deepcopy(t1) + + # Break after one loop if auto mode is not on if cml_args.auto is None: + # clear the remote data ready flag to indicate we're shutting down + if dh.RemoteDatahandler and dh.RemoteDatahandler.mode == 'child': + dh.RemoteDatahandler.setStopFlag() + if pid_file and os.path.isfile(pid_file): + os.remove(pid_file) break else: - + if dh.observations_db: + dh.closeObservationsDatabase() + if dh.trajectory_db: + dh.closeTrajectoryDatabase() # Otherwise wait to run AUTO_RUN_FREQUENCY hours after the beginning wait_time = (datetime.timedelta(hours=AUTO_RUN_FREQUENCY) - (datetime.datetime.now(datetime.timezone.utc) - t1)).total_seconds() + # remove the remote data stop flag to indicate we're open for business + if dh.RemoteDatahandler and dh.RemoteDatahandler.mode == 'child': + dh.RemoteDatahandler.clearStopFlag() + # Run immediately if the wait time has elapsed if wait_time < 0: continue @@ -2070,4 +2349,4 @@ def _breakHandler(signum, frame): while next_run_time > datetime.datetime.now(datetime.timezone.utc): print("Waiting {:s} to run the trajectory solver... 
".format(str(next_run_time - datetime.datetime.now(datetime.timezone.utc)))) - time.sleep(2) + time.sleep(10) diff --git a/wmpl/Utils/Math.py b/wmpl/Utils/Math.py index bb6069b5..d916bc28 100644 --- a/wmpl/Utils/Math.py +++ b/wmpl/Utils/Math.py @@ -1113,11 +1113,13 @@ def generateDatetimeBins(dt_beg, dt_end, bin_days=7, utc_hour_break=12, tzinfo=N else: bin_beg = dt_beg + datetime.timedelta(days=i * bin_days) - bin_beg = bin_beg.replace(hour=int(utc_hour_break), minute=0, second=0, microsecond=0) + if bin_days > 0.999: + bin_beg = bin_beg.replace(hour=int(utc_hour_break), minute=0, second=0, microsecond=0) # Generate the bin ending edge bin_end = bin_beg + datetime.timedelta(days=bin_days) - bin_end = bin_end.replace(hour=int(utc_hour_break), minute=0, second=0, microsecond=0) + if bin_days > 0.999: + bin_end = bin_end.replace(hour=int(utc_hour_break), minute=0, second=0, microsecond=0) # Check that the ending bin is not beyond the end dt end_reached = False diff --git a/wmpl/Utils/remoteDataHandling.py b/wmpl/Utils/remoteDataHandling.py index 59f59a19..8f04a276 100644 --- a/wmpl/Utils/remoteDataHandling.py +++ b/wmpl/Utils/remoteDataHandling.py @@ -23,176 +23,349 @@ import os import paramiko import logging -import glob import shutil +import uuid +import time -from wmpl.Utils.OSTools import mkdirP -from wmpl.Utils.Pickling import loadPickle +from configparser import ConfigParser log = logging.getLogger("traj_correlator") -def collectRemoteTrajectories(remotehost, max_trajs, output_dir): - """ - Collect trajectory pickles from a remote server for local phase2 (monte-carlo) processing - NB: do NOT use os.path.join here, as it will break on Windows - """ +class RemoteNode(): + def __init__(self, nodename, dirpath, capacity, mode, active=False): + self.nodename = nodename + self.dirpath = dirpath + self.capacity = int(capacity) + self.mode = int(mode) + self.active = active - ftpcli, remote_dir, sshcli = getSFTPConnection(remotehost) - if ftpcli is None: - return - - 
remote_phase1_dir = os.path.join(remote_dir, 'phase1').replace('\\','/') - - log.info(f'Looking in {remote_phase1_dir} on remote host for up to {max_trajs} trajectories') - try: - files = ftpcli.listdir(remote_phase1_dir) - files = [f for f in files if '.pickle' in f and 'processing' not in f] - files = files[:max_trajs] - - if len(files) == 0: - log.info('no data available at this time') - ftpcli.close() - sshcli.close() - return +class RemoteDataHandler(): + def __init__(self, cfg_file): + self.initialised = False + if not os.path.isfile(cfg_file): + log.warning(f'unable to find {cfg_file}, not enabling remote processing') + return - for trajfile in files: - fullname = os.path.join(remote_phase1_dir, trajfile).replace('\\','/') - localname = os.path.join(output_dir, trajfile) - ftpcli.get(fullname, localname) - ftpcli.rename(fullname, f'{fullname}_processing') - - log.info(f'Obtained {len(files)} trajectories') - - - except Exception as e: - log.warning('Problem with download') - log.info(e) - - ftpcli.close() - sshcli.close() - - return + self.nodenames = None + self.nodes = None + self.capacity = None + self.host = None + self.user = None + self.key = None -def uploadTrajToRemote(remotehost, trajfile, output_dir): - """ - At the end of MC phase, upload the trajectory pickle and report to a remote host for integration - into the solved dataset - """ - - ftpcli, remote_dir, sshcli = getSFTPConnection(remotehost) - if ftpcli is None: + self.ssh_client = None + self.sftp_client = None + + cfg = ConfigParser() + cfg.read(cfg_file) + self.mode = cfg['mode']['mode'].lower() + if self.mode not in ['master', 'child']: + log.warning('remote cfg: mode must be master or child, not enabling remote processing') + return + if self.mode == 'master': + if 'children' not in cfg.sections(): + log.warning('remote cfg: children section missing, not enabling remote processing') + return + + # create a list of available nodes, disabling any that are malformed in the config file + 
self.nodenames = [k for k in cfg['children'].keys()] + self.nodes = [k.split(',') for k in cfg['children'].values()] + self.nodes = [RemoteNode(nn,x[0],x[1],x[2]) for nn,x in zip(self.nodenames,self.nodes) if len(x)==3] + self.nodes.append(RemoteNode('localhost', None, -1, -1)) + activenodes = [n.nodename for n in self.nodes if n.capacity!=0] + log.info(f' using nodes {activenodes}') + else: + # 'child' mode + if 'sftp' not in cfg.sections() or 'key' not in cfg['sftp'] or 'host' not in cfg['sftp'] or 'user' not in cfg['sftp']: + log.warning('remote cfg: sftp user, key or host missing, not enabling remote processing') + return + + self.host = cfg['sftp']['host'] + self.user = cfg['sftp']['user'] + self.key = os.path.normpath(os.path.expanduser(cfg['sftp']['key'])) + if 'port' not in cfg['sftp']: + self.port = 22 + else: + self.port = int(cfg['sftp']['port']) + + self.initialised = True return - - remote_phase2_dir = os.path.join(remote_dir, 'remoteuploads').replace('\\','/') - try: - ftpcli.mkdir(remote_phase2_dir) - except Exception: - pass - - localname = os.path.join(output_dir, trajfile) - remotename = os.path.join(remote_phase2_dir, trajfile).replace('\\','/') - ftpcli.put(localname, remotename) - localname = localname.replace('_trajectory.pickle', '_report.txt') - remotename = remotename.replace('_trajectory.pickle', '_report.txt') - if os.path.isfile(localname): - ftpcli.put(localname, remotename) - - ftpcli.close() - sshcli.close() - return - - -def moveRemoteTrajectories(output_dir): - """ - Move remotely processed pickle files to their target location in the trajectories area, - making sure we clean up any previously-calculated trajectory and temporary files - """ - - phase2_dir = os.path.join(output_dir, 'remoteuploads') - - if os.path.isdir(phase2_dir): - log.info('Checking for remotely calculated trajectories...') - pickles = glob.glob1(phase2_dir, '*.pickle') - - for pick in pickles: - traj = loadPickle(phase2_dir, pick) - phase1_name = 
traj.pre_mc_longname - traj_dir = f'{output_dir}/trajectories/{phase1_name[:4]}/{phase1_name[:6]}/{phase1_name[:8]}/{phase1_name}' - if os.path.isdir(traj_dir): - shutil.rmtree(traj_dir) - processed_traj_file = os.path.join(output_dir, 'phase1', phase1_name + '_trajectory.pickle_processing') - - if os.path.isfile(processed_traj_file): - log.info(f' Moving {phase1_name} to processed folder...') - dst = os.path.join(output_dir, 'phase1', 'processed', phase1_name + '_trajectory.pickle') - shutil.copyfile(processed_traj_file, dst) - os.remove(processed_traj_file) - - phase2_name = traj.longname - traj_dir = f'{output_dir}/trajectories/{phase2_name[:4]}/{phase2_name[:6]}/{phase2_name[:8]}/{phase2_name}' - mkdirP(traj_dir) - log.info(f' Moving {phase2_name} to {traj_dir}...') - src = os.path.join(phase2_dir, pick) - dst = os.path.join(traj_dir, pick[:15]+'_trajectory.pickle') - - shutil.copyfile(src, dst) - os.remove(src) - - report_file = src.replace('_trajectory.pickle','_report.txt') - if os.path.isfile(report_file): - dst = dst.replace('_trajectory.pickle','_report.txt') - shutil.copyfile(report_file, dst) - os.remove(report_file) - - log.info(f'Moved {len(pickles)} trajectories.') + def getSFTPConnection(self, verbose=False): + if not self.initialised: + return False + + if self.sftp_client: + return True + + log.info(f'Connecting to {self.host}:{self.port} as {self.user}....') - return + if not os.path.isfile(os.path.expanduser(self.key)): + log.warning(f'ssh keyfile {self.key} missing') + return False + + self.ssh_client = paramiko.SSHClient() + if verbose: + log.info('created paramiko ssh client....') + self.ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy()) + pkey = paramiko.RSAKey.from_private_key_file(self.key) + try: + if verbose: + log.info('connecting....') + self.ssh_client.connect(hostname=self.host, username=self.user, port=self.port, + pkey=pkey, look_for_keys=False, timeout=10) + if verbose: + log.info('connected....') + self.sftp_client 
= self.ssh_client.open_sftp() + if verbose: + log.info('created client') + return True + + except Exception as e: + log.warning('sftp connection to remote host failed') + log.warning(e) + self.closeSFTPConnection() + return False + + def closeSFTPConnection(self): + if self.sftp_client: + self.sftp_client.close() + self.sftp_client = None + if self.ssh_client: + self.ssh_client.close() + self.ssh_client = None + return + + def putWithRetry(self, local_name, remname): + for _ in range(10): + try: + self.sftp_client.put(local_name, remname) + return True + except Exception: + time.sleep(1) + log.warning(f'upload of {local_name} failed after 10 retries') + return False + + ######################################################## + # functions used by the client nodes + + def collectRemoteData(self, datatype, output_dir, verbose=False): + """ + Collect trajectory or candidate pickles from a remote server for local processing + + parameters: + datatype = 'candidates' or 'phase1' + output_dir = folder to put the pickles into, generally dh.output_dir + """ + + if not self.initialised or not self.getSFTPConnection(verbose=verbose): + return False + + for pth in ['files', 'files/candidates', 'files/phase1', 'files/trajectories', + 'files/candidates/processed','files/phase1/processed']: + try: + self.sftp_client.mkdir(pth) + except Exception: + pass + self.sftp_client.chmod(pth, 0o777) + + try: + rem_dir = f'files/{datatype}' + files = self.sftp_client.listdir(rem_dir) + files = [f for f in files if '.pickle' in f and 'processing' not in f] + if len(files) == 0: + log.info('no data available at this time') + self.closeSFTPConnection() + return False + + local_dir = os.path.join(output_dir, datatype) + if not os.path.isdir(local_dir): + os.makedirs(local_dir, exist_ok=True) + for trajfile in files: + fullname = f'{rem_dir}/{trajfile}' + localname = os.path.join(local_dir, trajfile) + if verbose: + log.info(f'downloading {fullname} to {localname}') + for i 
in range(10): + try: + self.sftp_client.get(fullname, localname) + break + except Exception: + time.sleep(1) + try: + self.sftp_client.rename(fullname, f'{rem_dir}/processed/{trajfile}') + except: + try: + self.sftp_client.remove(fullname) + except: + log.info(f'unable to rename or remove {fullname}') + + log.info(f'Obtained {len(files)} {"trajectories" if datatype=="phase1" else "candidates"}') + + except Exception as e: + log.warning('Problem with download') + log.info(e) + + self.closeSFTPConnection() + return True + + def uploadToMaster(self, source_dir, verbose=False): + """ + upload the trajectory pickle and report to a remote host for integration + into the solved dataset + + parameters: + source_dir = root folder containing data, generally dh.output_dir + """ + + if not self.initialised or not self.getSFTPConnection(verbose=verbose): + return + + # flag to indicate success. Any upload failures will set this to False + success_flag = True + + for pth in ['files', 'files/candidates', 'files/phase1', 'files/trajectories', + 'files/candidates/processed','files/phase1/processed']: + try: + self.sftp_client.mkdir(pth) + self.sftp_client.chmod(pth, 0o777) + except Exception: + pass + + phase1_dir = os.path.join(source_dir, 'phase1') + if os.path.isdir(phase1_dir): + + # upload any phase1 trajectories + i=0 + proc_dir = os.path.join(phase1_dir, 'processed') + os.makedirs(proc_dir, exist_ok=True) + + for fil in os.listdir(phase1_dir): + local_name = os.path.join(phase1_dir, fil) + if os.path.isdir(local_name): + continue + remname = f'files/phase1/{fil}' + + if verbose: + log.info(f'uploading {local_name} to {remname}') + + # If the upload is successful, move the local file to 'processed' + # Otherwise set the success flag to false + + if self.putWithRetry(local_name, remname): + + if os.path.isfile(os.path.join(proc_dir, fil)): + os.remove(os.path.join(proc_dir, fil)) + shutil.move(local_name, proc_dir) + i += 1 + + else: + success_flag = False + + if i > 0: + 
log.info(f'uploaded {i} phase1 solutions') + + # now upload any data in the 'trajectories' folder, flattening it to make it simpler to handle + i=0 + if os.path.isdir(os.path.join(source_dir, 'trajectories')): + traj_dir = f'{source_dir}/trajectories' + for (dirpath, dirnames, filenames) in os.walk(traj_dir): + if len(filenames) > 0: + + # flag to indicate whether this specific trajectory upload succeeded + traj_success_flag = True + + rem_path = f'files/trajectories/{os.path.basename(dirpath)}' + try: + self.sftp_client.mkdir(rem_path) + self.sftp_client.chmod(rem_path, 0o777) + except Exception: + pass + + # upload all files in the folder. If any upload fails, set the traj sucess flag to false + for fil in filenames: + + local_name = os.path.join(dirpath, fil) + rem_file = f'{rem_path}/{fil}' + + if verbose: + log.info(f'uploading {local_name} to {rem_file}') + + if self.putWithRetry(local_name, rem_file): + i += 1 + else: + traj_success_flag = False -def getSFTPConnection(remotehost): + # if this trajectory uploaded, remove the local files + # Otherwise set the overall status to False + if traj_success_flag: + shutil.rmtree(dirpath, ignore_errors=True) + else: + success_flag = traj_success_flag - hostdets = remotehost.split(':') + + if i > 0: + log.info(f'uploaded {int(i/2)} trajectories') - if len(hostdets) < 2 or '@' not in hostdets[0]: - log.warning(f'{remotehost} malformed, should be user@host:port:/path/to/dataroot') - return None, None, None - - if len(hostdets) == 3: - port = int(hostdets[1]) - remote_data_dir = hostdets[2] + # if everything uploaded we can remove the entire 'trajectories' folder + if success_flag: + shutil.rmtree(traj_dir, ignore_errors=True) - else: - port = 22 - remote_data_dir = hostdets[1] + # finally the databases - upload these with a random name for uniqueness at the server side + # Again, if any upload fails mark the status False + uuid_str = str(uuid.uuid4()) - user,host = hostdets[0].split('@') - log.info(f'Connecting to 
{host}....') + db_success_flag = True + for fname in ['observations', 'trajectories']: + local_name = os.path.join(source_dir, f'{fname}.db') + if os.path.isfile(local_name): + rem_file = f'files/{fname}-{uuid_str}.db' + + if verbose: + log.info(f'uploading {local_name} to {rem_file}') + + if not self.putWithRetry(local_name, rem_file): + db_success_flag = False - ssh_client = paramiko.SSHClient() - ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy()) + if db_success_flag: + log.info('uploaded databases') + else: + log.warning('unable to upload at least one of the databases, will retry in next loop') + success_flag = db_success_flag + self.closeSFTPConnection() - if not os.path.isfile(os.path.expanduser('~/.ssh/trajsolver')): - log.warning('ssh keyfile ~/.ssh/trajsolver missing') - ssh_client.close() - return None, None, None - - pkey = paramiko.RSAKey.from_private_key_file(os.path.expanduser('~/.ssh/trajsolver')) - try: - ssh_client.connect(hostname=host, username=user, port=port, pkey=pkey, look_for_keys=False) - ftp_client = ssh_client.open_sftp() - return ftp_client, remote_data_dir, ssh_client + return success_flag - except Exception as e: - - log.warning('sftp connection to remote host failed') - log.warning(e) - ssh_client.close() - - return None, None, None + def setStopFlag(self, verbose=False): + if not self.initialised or not self.getSFTPConnection(): + return + try: + readyfile = os.path.join(os.getenv('TMP', default='/tmp'),'stop') + open(readyfile,'w').write('stop') + self.sftp_client.put(readyfile, 'files/stop') + except Exception: + log.warning('unable to set stop flag, master will not continue to assign data') + time.sleep(2) + self.closeSFTPConnection() + log.info('set stop flag') + return + + def clearStopFlag(self, verbose=False): + if not self.initialised or not self.getSFTPConnection(): + return + try: + self.sftp_client.remove('files/stop') + log.info('removed stop flag') + except: + pass + self.closeSFTPConnection() + return diff 
--git a/wmpl_remote.cfg.sample b/wmpl_remote.cfg.sample new file mode 100644 index 00000000..da8c6cc4 --- /dev/null +++ b/wmpl_remote.cfg.sample @@ -0,0 +1,31 @@ +# Configuration file for WMPL distributed processing. +# Rename to `wmpl_remote.cfg` and place in the data directory. + +[mode] +# if mode is 'master' then [children] lists the child processing nodes and capacity of each +# if mode is 'child' then the [sftp] section says how the child will connect to the parent. +# Each node must have its own copy of this file and each child must have its own credentials + +mode = child + +# details of the child nodes. Each line must have three values separated by commas +# * folder that each node will use. These must map to each node's sftp user's homedir. +# * capacity - number of candidates or trajectories to solve. Allows loadbalancing +# * operation mode - 0: disabled, 1: solving candidates, 2: monte-carlo phase +# the node names can be used to differentiate children + +[children] +node1 = c:/temp/wmpl/node1,200,1 +node2 = c:/temp/wmpl/node2,400,1 +node3 = c:/temp/wmpl/node3,0,2 +node4 = c:/temp/wmpl/node4,0,2 +node5 = + + +[sftp] +# sftp login details for client to connect to the parent when running in 'child' mode +# if the port is nonstandard (ie not 22) then uncomment and set as required +host = testserver.somedomain.com +user = node1 +key = ~/.ssh/somekey +#port=2222 \ No newline at end of file
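As a sketch of how a master node might read the `[children]` section of `wmpl_remote.cfg` above: each value is a comma-separated triple of folder, capacity, and operation mode, parseable with the standard-library `configparser`. The helper name `parse_children` is my own illustration and is not part of WMPL itself.

```python
import configparser


def parse_children(cfg_text):
    """Parse the [children] section of a wmpl_remote.cfg-style config into a
    dict mapping node name -> (folder, capacity, mode). Entries without three
    comma-separated values (such as empty placeholders) are skipped."""
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)
    nodes = {}
    for name, value in cfg.items('children'):
        parts = [p.strip() for p in value.split(',')]
        if len(parts) != 3:
            # skip empty or malformed entries, e.g. "node5 ="
            continue
        folder, capacity, mode = parts[0], int(parts[1]), int(parts[2])
        nodes[name] = (folder, capacity, mode)
    return nodes


sample = """
[mode]
mode = master

[children]
node1 = c:/temp/wmpl/node1,200,1
node3 = c:/temp/wmpl/node3,0,2
node5 =
"""

nodes = parse_children(sample)
print(nodes['node1'])  # ('c:/temp/wmpl/node1', 200, 1)
```

A master could then hand out at most `capacity` candidates to each node with mode 1, and phase-2 work to nodes with mode 2, skipping disabled (mode 0) or empty entries.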