This repository was archived by the owner on Feb 7, 2019. It is now read-only.

unknown tags when importing, and missing the CREATE TABLE statement #7

@dportabella

I am trying to run the example from the README, but this file does not exist: http://download.wikimedia.org/enwiki/pages-meta-current.xml.bz2

Instead, I downloaded this one: https://dumps.wikimedia.org/enwiki/20160305/enwiki-20160305-pages-meta-current.xml.bz2, but xml2sql fails to import it. Is this file a valid input for your program?

$ bunzip2 -c enwiki-20160305-pages-meta-current.xml.bz2 | xml2sql -m
unexpected element <dbname>
xml2sql: parsing aborted at line 4 pos 12.

It works (it creates page.sql, revision.sql and text.sql) if I remove some of the tags, as follows:
$ cat enwiki-20160305-pages-meta-current1.xml-p000000010p000030303 | egrep -v "<dbname>|<ns>|<redirect|<parentid>|<model>|<format>|<sha1>" | xml2sql -m
Does this mean that the Wikipedia dump format has evolved and mediawiki-xml2sql needs to be updated?
Or is there an alternative tool that achieves the same thing?
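
For anyone who wants the same workaround as a reusable script, here is a minimal Python sketch of the egrep filter above. It assumes, as the egrep command does, that each of these elements sits on its own line in the dump; the script name `strip_tags.py` and the function name `strip_unsupported` are made up for illustration and are not part of xml2sql.

```python
import sys

# Tags that xml2sql rejected in the 20160305 dump (copied from the egrep pattern above).
UNSUPPORTED = ("<dbname>", "<ns>", "<redirect", "<parentid>", "<model>", "<format>", "<sha1>")


def strip_unsupported(lines):
    """Yield only the lines that contain none of the unsupported tags."""
    for line in lines:
        if not any(tag in line for tag in UNSUPPORTED):
            yield line


if __name__ == "__main__":
    # Read the decompressed dump on stdin and write the filtered XML to stdout.
    sys.stdout.writelines(strip_unsupported(sys.stdin))
```

It would be used in the same place as egrep, for example `bunzip2 -c enwiki-20160305-pages-meta-current.xml.bz2 | python3 strip_tags.py | xml2sql -m`. Like the egrep version, this is a line-based filter and not an XML-aware one, so it silently drops the data carried by those elements.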

Also, the three generated SQL files contain INSERT INTO statements, but the CREATE TABLE statements are missing. Can you please tell me the required CREATE TABLE statements?