Export parts of the site #86

@gforcada

Is your feature request related to a problem? Please describe.

If a site is huge (millions of objects), it is not feasible to use plone.exportimport, as the generated folder is on the order of 10 GB or more.

That is impractical both for a developer who needs to download it to work locally and for bundling it into a Docker image to pass to external partners (designers, integrators...).

On top of that, a full export might contain sensitive data that one might not want to export.

Describe the solution you'd like

Provide another export option that allows the user to graft parts of a site. Either with a MANIFEST.in-like syntax, e.g.:

include /some-important-folder
exclude /private-folder
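
A minimal sketch of how such rules could be evaluated, under two assumptions that are not part of this proposal: a rule matches a path and everything beneath it, and the last matching rule wins (with "skip" as the default). The names `parse_rules` and `is_exported` are hypothetical and do not exist in plone.exportimport.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (action, path prefix)


def parse_rules(text: str) -> List[Rule]:
    """Parse 'include <path>' / 'exclude <path>' lines, skipping blanks and # comments."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        action, _, path = line.partition(" ")
        path = path.strip()
        if action not in ("include", "exclude") or not path.startswith("/"):
            raise ValueError(f"Invalid rule: {raw!r}")
        # Normalize away a trailing slash, but keep the site root as "/".
        rules.append((action, path.rstrip("/") or "/"))
    return rules


def is_exported(path: str, rules: List[Rule]) -> bool:
    """Return True if the last rule matching `path` is an include.

    Assumption: paths matched by no rule are not exported.
    """
    exported = False
    for action, prefix in rules:
        if prefix == "/" or path == prefix or path.startswith(prefix + "/"):
            exported = action == "include"
    return exported
```

With the two rules above, `/some-important-folder/doc` would be exported, while `/private-folder/doc` and any path not mentioned would be skipped.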

Or simply a plain list of paths:

/path/to/a/deep/nested/document
/path/to/another/document
/list/goes/on

How each of these files is generated would be up to the users: they know their content, so they probably know best how to generate it.

Another option would be to add an IExportMe marker interface. Then, via some actions/forms, you can mark the objects that you want to be exported, and plone.exportimport looks for that content via a catalog query.
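
For the plain list, reading the file could be as simple as the sketch below. `load_paths` is a hypothetical helper, not an existing plone.exportimport function, and the handling of comments, trailing slashes, and duplicates is an assumption on my side.

```python
from typing import Iterable, List


def load_paths(lines: Iterable[str]) -> List[str]:
    """Return unique absolute paths in file order, ignoring blank lines and # comments."""
    seen = set()
    paths = []
    for raw in lines:
        path = raw.strip()
        if not path or path.startswith("#"):
            continue
        if not path.startswith("/"):
            raise ValueError(f"Expected an absolute path, got: {raw!r}")
        # Normalize away a trailing slash, but keep the site root as "/".
        path = path.rstrip("/") or "/"
        if path not in seen:
            seen.add(path)
            paths.append(path)
    return paths
```

Since an open file is an iterable of lines, it can be passed in directly, e.g. `load_paths(open("export-paths.txt"))` (the file name is just an example).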

This one is probably not that flexible and appealing (IMHO, but YMMV 😅 ). The second option (a plain list) looks much more feasible to me.

Describe alternatives you've considered

So far we have hacked together a few views built on top of collective.exportimport functionality, but it is brittle. plone.exportimport's export folder structure and command-line approach look much more suitable for this task.

Additional context

Our use case is twofold:

  • give internal devs fresh data on a daily basis (assuming the export described above runs daily against a copy of the production database), so they can fix bugs and reproduce problems more easily 🚀
  • give external partners a safe data subset (a real use case): an external provider needs to interact with our Plone site, but we don't want to put them through the hassle of getting all our code, building a Plone instance, and learning the (quite complex) workflows needed to build the front page. The solution? Take this partial export and bundle it with a Docker image of our fresh code from specific in-development branches, so they can test our new code (which has not reached production yet) on real data without any configuration hassle ✨
