cqp_get_registry(): Initialization of CQP required #14

@ablaette

While implementing a function in the cwbtools package that performs a test load of a corpus,
I noticed some really odd behavior of the cqp_get_registry() function.

cqp_get_registry() works as expected if cqp_initialize() has been
called before the first call of cqp_get_registry(). Provided that the two
alternative registry directories used in the following example exist,
the following code works.

library(RcppCWB)
cqp_initialize()
cqp_reset_registry(path.expand("~/cwb2/registry"))
cqp_get_registry()

Sys.setenv(CORPUS_REGISTRY = path.expand("~/Data/cwb/registry"))
cqp_get_registry()

But if cqp_get_registry() is called before cqp_initialize(),
cqp_get_registry() will grab strings from memory in an unexpected and
definitely unintended manner.

library(RcppCWB)
cqp_get_registry() # Problems when called before cqp_initialize
cqp_initialize()
cqp_get_registry()

Looking at the C code of the cl_standard_registry() C function that is
the worker behind the R function cqp_get_registry(), I find it hard
(impossible at this stage) to understand what happens.

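/* Return the registry directory: the cached value if one is set,
   else the REGISTRY_ENVVAR environment variable, else the default. */
char *
cl_standard_registry()
{
  if (regdir == NULL)
    regdir = getenv(REGISTRY_ENVVAR);
  if (regdir == NULL)
    regdir = REGISTRY_DEFAULT_PATH;
  return regdir;
}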
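
To make the lookup order easier to reason about, here is a minimal, self-contained sketch of the same pattern: use the cached value if there is one, otherwise the environment variable, otherwise a default. It is not the CWB implementation. All `sketch_` names are hypothetical, the environment variable name is taken from the R example above, and the default path is a placeholder assumption. Unlike the function quoted above, the sketch stores its own copy of the string, so the cached pointer does not depend on storage owned by the environment. That is a design choice for the sketch, not a diagnosis of the behavior reported here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the CWB macros; both values are assumptions. */
#define SKETCH_REGISTRY_ENVVAR "CORPUS_REGISTRY"
#define SKETCH_REGISTRY_DEFAULT "/usr/local/share/cwb/registry"

/* Cached registry path; NULL until the first lookup or an explicit reset. */
static char *sketch_regdir = NULL;

/* Return a heap-allocated copy of s; abort if allocation fails. */
static char *sketch_copy(const char *s) {
  size_t n = strlen(s) + 1;
  char *copy = malloc(n);
  if (copy == NULL) abort();
  memcpy(copy, s, n);
  return copy;
}

/* Lookup order: cached value, then environment variable, then default. */
const char *sketch_standard_registry(void) {
  if (sketch_regdir == NULL) {
    const char *env = getenv(SKETCH_REGISTRY_ENVVAR);
    sketch_regdir = sketch_copy(env != NULL ? env : SKETCH_REGISTRY_DEFAULT);
  }
  return sketch_regdir;
}

/* Replace the cached path; passing NULL clears the cache so that the
   next lookup consults the environment again. */
void sketch_reset_registry(const char *path) {
  free(sketch_regdir);
  sketch_regdir = (path != NULL) ? sketch_copy(path) : NULL;
}
```

In this sketch, once a value is cached, later changes to the environment variable are not picked up until `sketch_reset_registry(NULL)` clears the cache.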

A workaround might be to call cqp_initialize() by default upon loading
RcppCWB. But that would only be a workaround, and I am deeply dissatisfied with it, as I do not
yet understand the odd behavior reported.
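
An alternative to initializing on load would be to guard the getter itself, so that a call made before initialization returns a well-defined value instead of reading state that has not been set up. The following is only a sketch of that idea under assumed names. None of the `guard_` identifiers exist in RcppCWB or CWB, and whether such a guard fits the actual code depends on the cause of the behavior, which is still unknown.

```c
#include <stddef.h>

/* Hypothetical module state: a flag recording whether initialization
   has run, and the registry path supplied at initialization. */
static int guard_initialized = 0;
static const char *guard_registry = NULL;

/* Hypothetical initializer: records the registry path and marks the
   module as ready. The caller keeps ownership of the string. */
void guard_initialize(const char *registry) {
  guard_registry = registry;
  guard_initialized = 1;
}

/* Guarded getter: returns NULL when called before guard_initialize(),
   so the caller can report an error instead of using stale state. */
const char *guard_get_registry(void) {
  if (!guard_initialized)
    return NULL;
  return guard_registry;
}
```

On the R side, a wrapper could then turn a NULL result into an informative error that asks the user to call cqp_initialize() first.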
