Description
We should phase out the old Trac wiki, and port its contents to GitHub instead. (I will push the relevant stuff from the existing trac.db SQLite database separately.)
For each page revision, the dump gives us the page name, version number, Unix timestamp, Trac author, page content, and comment. These should be used as follows:
- page name --> file name (and file path for names that start with "Archive/" etc.)
- version number + comment --> commit message (a la "Add version " + VERSION + " of page '" + TITLE + "'\n\n" + COMMENT)
- Unix timestamp --> commit timestamp
- Trac author --> GitHub ID (as far as we know it)
- Page contents --> file contents
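As a rough sketch of the mapping above, the commit message and author lookup could look like the following. The helper names `commit_message` and `github_author` and the `AUTHOR_MAP` table are hypothetical; only the albert --> aaaaalbert entry comes from this issue. Treating a missing comment as an empty string and falling back to the Trac name for unknown authors are assumptions.

```python
# Hypothetical Trac-to-GitHub author table; only the "albert" entry is taken
# from this issue, everything else would need to be filled in by hand.
AUTHOR_MAP = {"albert": "aaaaalbert"}


def commit_message(version, title, comment):
    """Build the commit message for one page revision."""
    # Assumption: a missing (None) comment is treated as an empty string.
    return ("Add version " + str(version) + " of page '" + title + "'\n\n"
            + (comment or ""))


def github_author(trac_author):
    """Map a Trac author to a GitHub ID.

    Assumption: unknown authors fall back to their Trac name.
    """
    return AUTHOR_MAP.get(trac_author, trac_author)


# "Initial import" is a made-up comment, used only for illustration.
print(commit_message(1, "CHinst", "Initial import"))
print(github_author("albert"))
```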
For example, version 1 of a page named CHinst, written by albert at timestamp 1495729526, should become a file of the same name, committed with author aaaaalbert and the appropriate commit message and timestamp.
Version 2 of the same page, at timestamp 1495814460, should be committed to the same file name, again with the appropriate message and timestamp.
Thus, the conversion needs to start with the earliest Trac timestamp, and work its way forward in time.
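One possible way to get that ordering is to sort the dumped rows by timestamp before replaying them. The sketch below assumes each row is a `(name, version, time, author, text, comment)` tuple and that timestamps are interpreted as UTC. The helper names `replay_order` and `git_date_env` are hypothetical. `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` are the environment variables git reads for commit dates. The page texts in the sample rows are placeholders, not real wiki content.

```python
def replay_order(rows):
    """Sort (name, version, time, author, text, comment) rows oldest first."""
    # Ties on the timestamp are broken by page name, then version number.
    return sorted(rows, key=lambda row: (row[2], row[0], row[1]))


def git_date_env(timestamp):
    """Environment variables that make git record the given Unix timestamp."""
    # Assumption: Trac timestamps are seconds since the epoch, taken as UTC.
    stamp = "%d +0000" % timestamp
    return {"GIT_AUTHOR_DATE": stamp, "GIT_COMMITTER_DATE": stamp}


# Names, versions, and timestamps follow the CHinst example; the page texts
# and empty comments are invented placeholders.
rows = [
    ("CHinst", 2, 1495814460, "albert", "second draft", ""),
    ("CHinst", 1, 1495729526, "albert", "first draft", ""),
]

for name, version, time, author, text, comment in replay_order(rows):
    print(name, version, git_date_env(time)["GIT_AUTHOR_DATE"])
```

The returned dictionary could be merged into the environment of whatever process runs `git commit` for that revision.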
Here is how I generate the wiki dump (generally following the ideas in https://gist.github.com/sgk/1286682 ):
import sqlite3
import pickle

# One row per page revision.
SQL = '''
select
    name, version, time, author, text, comment
from
    wiki w
'''

conn = sqlite3.connect('trac.db')
result = conn.execute(SQL)

# Collect all revisions and pickle them as a single list of tuples.
outfile = open("trac_wiki_dump.pickle", "wb")
outlist = []
for line in result:
    outlist.append(line)
pickle.dump(outlist, outfile)
outfile.close()
conn.close()
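For whoever picks up the conversion, reading the dump back could look roughly like this. The function name `load_dump` is hypothetical. The sketch assumes the dump is read with the same major Python version that wrote it. Since unpickling can execute arbitrary code, it should only be run on a dump from a trusted source.

```python
import pickle


def load_dump(path="trac_wiki_dump.pickle"):
    """Return the list of (name, version, time, author, text, comment) tuples."""
    with open(path, "rb") as infile:
        return pickle.load(infile)
```

Each element of the returned list is one page revision, with fields in the same order as the columns in the select statement.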