
cwb_makeall() may ignore registry provided #31

@ablaette

For cwb_makeall() I offer the following example:

library(RcppCWB)

# Locate the registry and the data directory of the UNGA sample corpus shipped with RcppCWB
registry <- if (!check_pkg_registry_files()) use_tmp_registry() else get_pkg_registry()
home_dir <- system.file(package = "RcppCWB", "extdata", "cwb", "indexed_corpora", "unga")

# Set up a temporary registry directory and a temporary data directory
tmpdir <- normalizePath(tempdir(), winslash = "/")
tmp_regdir <- file.path(tmpdir, "registry_tmp", fsep = "/")
tmp_data_dir <- file.path(tmpdir, "indexed_corpora", fsep = "/")
tmp_unga_dir <- file.path(tmp_data_dir, "unga", fsep = "/")
if (!file.exists(tmp_regdir)) dir.create(tmp_regdir)
if (!file.exists(tmp_data_dir)) dir.create(tmp_data_dir)
if (!file.exists(tmp_unga_dir)) {
  dir.create(tmp_unga_dir)
} else {
  file.remove(list.files(tmp_unga_dir, full.names = TRUE))
}

# Write a copy of the registry file whose HOME points to the temporary data directory
regfile <- readLines(file.path(registry, "unga"))
regfile[grep("^HOME", regfile)] <- sprintf('HOME "%s"', tmp_unga_dir)
writeLines(text = regfile, con = file.path(tmp_regdir, "unga"))

# Copy the corpus data files to the temporary data directory
for (x in list.files(home_dir, full.names = TRUE)) {
  file.copy(from = x, to = tmp_unga_dir)
}

# perform cwb_makeall (equivalent to cwb-makeall command line utility)
cwb_makeall(corpus = "UNGA", p_attribute = "word", registry = tmp_regdir)

Surprisingly, the generated files are not written to tmp_unga_dir, which the registry file unga in tmp_regdir declares as the home directory, but to the unga directory within the installed package.
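
One way to check where output actually lands is to compare directory listings taken before and after the call. The following is a minimal base R sketch, not part of RcppCWB: new_files() is a hypothetical helper, and the two scratch directories and the file name example.out merely stand in for tmp_unga_dir, the package's unga directory and whatever cwb_makeall() writes.

```r
# Hypothetical helper (not part of RcppCWB): names of files now present in
# `dir` that were absent from the earlier listing `files_before`.
new_files <- function(files_before, dir) {
  setdiff(list.files(dir), files_before)
}

# Self-contained demonstration with two scratch directories
dir_a <- tempfile("dir_a")  # stands in for the intended target directory
dir_b <- tempfile("dir_b")  # stands in for the unintended directory
dir.create(dir_a)
dir.create(dir_b)

# Take listings before the operation under test
before_a <- list.files(dir_a)
before_b <- list.files(dir_b)

# Simulate an operation that writes to the unintended directory
invisible(file.create(file.path(dir_b, "example.out")))

# Compare listings afterwards
created_a <- new_files(before_a, dir_a)
created_b <- new_files(before_b, dir_b)
```

In the example above, the listings would be taken for tmp_unga_dir and home_dir right before cwb_makeall(), and compared right after it. Since the files are identified by name only, a file that is overwritten in place would not show up; comparing file.mtime() values would be needed for that case.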

My hypothesis is that the registry directory passed to cwb_makeall() is ignored if the corpus is already loaded, so calling cl_delete_corpus("UNGA") first is necessary to trigger reloading the corpus.
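
If this hypothesis holds, the workaround might look like the sketch below. It is untested and assumes that the setup code above has been run (so tmp_regdir exists) and that cl_delete_corpus() can be called with the corpus ID alone, as written above; whether it also needs a registry argument to find the loaded corpus is not confirmed here. The guard and the ran_workaround flag are only there to make the snippet a no-op outside that setup.

```r
# Sketch of the suspected workaround (an assumption, not a confirmed fix).
ran_workaround <- FALSE

if (requireNamespace("RcppCWB", quietly = TRUE) && exists("tmp_regdir")) {
  library(RcppCWB)

  # Unload the corpus so that the next call has to read a registry file again
  cl_delete_corpus("UNGA")

  # Same call as in the example; the expectation is that HOME is now taken
  # from the registry file in tmp_regdir
  cwb_makeall(corpus = "UNGA", p_attribute = "word", registry = tmp_regdir)

  ran_workaround <- TRUE
}
```

Listing the contents of tmp_unga_dir afterwards would show whether the index files were written there this time.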
