Gzipped sitemap does not work #25

@jeroenvermeulen

Shopware 6.5 creates a sitemap index with gzipped URLs, for example:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://domain.com/sitemap/salesChannel-hash-hash/hash-sitemap-domain-com-1.xml.gz</loc>
    <lastmod>2025-03-31T16:43:34+02:00</lastmod>
  </sitemap>
</sitemapindex>

When I try to crawl it:

$ crowlet https://domain.com/sitemap.xml
INFO[0000] Crawling https://domain.com/sitemap.xml
ERRO[0001] failed to parse https://domain.com/sitemap/salesChannel-hash-hash/hash-sitemap-domain-com-1.xml.gz in sitemapindex.xml: XML syntax error on line 1: illegal character code U+001F
FATA[0001] failed to parse https://domain.com/sitemap/salesChannel-hash-hash/hash-sitemap-domain-com-1.xml.gz in sitemapindex.xml: XML syntax error on line 1: illegal character code U+001F

When I try to crawl the GZ URL directly:

$ crowlet https://domain.com/sitemap/salesChannel-hash-hash/hash-sitemap-domain-com-1.xml.gz
INFO[0000] Crawling https://domain.com/sitemap/salesChannel-hash-hash/hash-sitemap-domain-com-1.xml.gz
ERRO[0000] URL is not a sitemap or sitemapindex: https://domain.com/sitemap/salesChannel-hash-hash/hash-sitemap-domain-com-1.xml.gz
FATA[0000] URL is not a sitemap or sitemapindex: https://domain.com/sitemap/salesChannel-hash-hash/hash-sitemap-domain-com-1.xml.gz
