Merged
5 changes: 2 additions & 3 deletions doc/sphinx/source/data/plotting_format.md
Original file line number Diff line number Diff line change
@@ -85,9 +85,8 @@ a substring). Currently they are:
'SIA': ('$z$', '$Q^2 (GeV^2)$', '$y$')
```

This mapping is declared as `CommonData.kinLabel_latex` in the C++
code (and accessible as `validphys.plotoptions.core.kinlabels_latex`
in the Python code).
This mapping is declared as `validphys.commondataparser.KINLABEL_LATEX`
in the Python code.

The three kinematic variables are referred to as `k1`, `k2` and `k3`
in the plotting files. For example, for DIS processes, `k1` refers to `x`,
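For illustration, a hypothetical PLOTTING fragment could put `k1` on the x axis and split figures over `k2`. The key names below are assumptions for the sketch, not a complete specification:

```yaml
# Hypothetical PLOTTING fragment: axis and grouping choices expressed
# through the kinematic placeholders k1, k2, k3 described above.
x: k1
figure_by:
  - k2
line_by:
  - k3
```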
5 changes: 0 additions & 5 deletions doc/sphinx/source/tutorials/addspecialgrouping.rst
@@ -82,11 +82,6 @@ which tells the code to look for an ``nnpdf40_process`` key within the metadata
and to parse it as a string. We do not assign a default value to this new key,
which implies that it must be provided within the metadata file.

.. note::
The reason the group name should be a string is because it is sometimes
passed to the C++ code through the SWIG interface, which is very strict
about the typing you use.

In addition to this, you must add the new grouping to
:py:class:`validphys.plotoptions.core.PlotInfo` as a keyword arguments of
the ``__init__`` function and subsequently as an attribute of the class
43 changes: 18 additions & 25 deletions doc/sphinx/source/vp/custom_pipelines.rst
@@ -138,9 +138,7 @@ a missing resource. The functions of type `check_<resource>` should
take the information processed by the Config class and verify that
a given resource is correct. If so, they should return a "Resource
specification" (something typically containing metadata information
such as paths, and a `load()` method to get the C++ object from
`libnnpdf`). We also define a `get` method that returns the C++ object
directly.
such as paths, which are necessary to load the final commondata or fktable).

In the case of the positivity set, this is entirely given in terms of
existing check functions:
@@ -160,31 +158,28 @@ existing check functions:
A more complicated example should raise the appropriate loader
errors (see the other examples in the class).

The `PositivitySetSpec` could be defined roughly like:
In the code, `PositivitySetSpec` inherits from `DataSetSpec`,
but one could roughly define it as:

.. code:: python

class PositivitySetSpec():
def __init__(self, commondataspec, fkspec, poslambda, thspec):
self.commondataspec = commondataspec
self.fkspec = fkspec
self.poslambda = poslambda
self.thspec = thspec
class PositivitySetSpec():
def __init__(self, commondataspec, fkspec, poslambda, thspec):
self.commondataspec = commondataspec
self.fkspec = fkspec
self.poslambda = poslambda
self.thspec = thspec

@property
def name(self):
return self.commondataspec.name
@property
def name(self):
return self.commondataspec.name

def __str__(self):
return self.name
def __str__(self):
            return self.name

@functools.lru_cache()
def load(self):
cd = self.commondataspec.load()
fk = self.fkspec.load()
return PositivitySet(cd, fk, self.poslambda)

Here `PositivitySet` is the `libnnpdf` object. It is generally better
This contains all necessary information for `validphys` to be able to load
the relevant `fktable`. It is generally better
to pass around the spec objects because they are lighter and have more
information (e.g. the theory in the above example).
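The caching pattern used for the spec's ``load()`` method can be sketched as follows (a toy spec assuming ``functools.lru_cache`` semantics, not the real validphys class):

```python
import functools

class DataSpecSketch:
    """Toy stand-in for a validphys spec: light metadata plus a cached
    load(), so the heavy object is built at most once per spec."""

    n_loads = 0  # counts how many real loads happened, for demonstration

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    @functools.lru_cache()
    def load(self):
        # In validphys this would construct the heavy data object;
        # here we just record that a load happened.
        DataSpecSketch.n_loads += 1
        return {"name": self.name, "values": [1.0, 2.0, 3.0]}

spec = DataSpecSketch("NMC")
a = spec.load()
b = spec.load()              # cached: no second load
print(DataSpecSketch.n_loads)  # -> 1
```

This is why passing specs around is cheap: consumers hold only metadata, and the expensive load happens lazily and at most once.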

@@ -227,9 +222,7 @@ Computing PDF-dependent quantities
----------------------------------

Now that we can receive positivity sets as input, let's do something
with them. The SWIG wrappers allow us to call the C++ methods of
`libnnpdf` from Python. These things go in the `validphys.results`
module. We can start by defining a class to produce and hold the
with them. We can start by defining a class to produce and hold the
results:

.. code:: python
@@ -255,7 +248,7 @@ way it allows to abstract away the different error types. One
constructs an object inheriting from `validphys.core.Stats` that is
appropriate for a given error type by calling `pdf.stats_class(data)`,
where data is an array where the entries along the first dimension are
the results from each member computed from `libnnpdf` (and the other
the results from each member (and the other
dimensions are arbitrary). `Stats` has methods that appropriately
collapse along the first axis. For example, `central_value` computes
the mean along the first axis for Monte Carlo PDFs and yields the
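The ``Stats`` idea described above can be sketched for the Monte Carlo case like this (a toy stand-in, not the real ``validphys.core.Stats`` API):

```python
import numpy as np

class MCStatsSketch:
    """Toy Monte Carlo statistics container: the first axis of `data`
    runs over PDF members, and methods collapse that axis."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def central_value(self):
        # For Monte Carlo PDFs the central value is the member mean
        return self.data.mean(axis=0)

    def std_error(self):
        # Toy choice: plain standard deviation over members
        return self.data.std(axis=0)

stats = MCStatsSketch([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(stats.central_value())  # -> [3. 4.]
```

Other error types (e.g. Hessian) would subclass and override how the first axis is collapsed, which is what ``pdf.stats_class(data)`` abstracts away.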
13 changes: 6 additions & 7 deletions doc/sphinx/source/vp/dataspecification.rst
@@ -68,8 +68,8 @@ are ``dataset_input``, ``cuts`` and ``theoryid``.

It seems odd to require theory settings such as a ``theoryid`` in the
``dataset_input`` in order to load data. However, this is a relic of the
underlying C++ code that performs the loading of data, which intrinsically
groups together the commondata (CSVs containing data central values and
legacy C++ code that performs the loading of data, which intrinsically
grouped together the commondata (CSVs containing data central values and
uncertainties) and :ref:`fktables`.

Clearly there is a big margin for error when manually entering
Expand All @@ -86,15 +86,14 @@ The ``DataSetSpec`` contains all of the information used to construct it, e.g.
>>> ds_spec.name
'CMSZDIFF12'

but also importantly has a ``load`` method, which returns an instance of the
``DataSet`` that is generated from the C++ code using SWIG. This new object
contains numpy arrays of data central values and experimental covariance
but also importantly has a ``load_commondata`` method, which returns an instance of the
``CommonData``. This new object contains numpy arrays of data central values and experimental covariance
matrices, e.g:

.. code:: python

>>> ds_libnnpdf = ds_spec.load()
>>> ds_libnnpdf.get_cv() # get central values of dataset
>>> cd = ds_spec.load_commondata()
>>> cd.get_cv() # get central values of dataset
array([2917. , 1074. , 460.5 , 222.6 , 109.8 , 61.84, 30.19,
2863. , 1047. , 446.1 , 214.5 , 110. , 58.13, 29.85,
2588. , 935.5 , 416.3 , 199. , 103.1 , 54.06, 28.45,
6 changes: 3 additions & 3 deletions doc/sphinx/source/vp/datthcomp.md
@@ -27,8 +27,8 @@ such they are assumed to be correct, so in principle they have no
guarantee of failing early with a good error message. However, you can
set `check_plotting: True` in the input configurations to cause the
PLOTTING files to be processed as soon as the dataset is loaded. This
can be useful while debugging the plotting files, but will cause
a noticeable delay to the startup (because the C++ DataSet objects
need to be loaded in memory). This will warn the user of missing plotting files
can be useful while debugging the plotting files, but might cause
a noticeable delay to the startup (due to loading datasets and fktables).
This will warn the user of missing plotting files
and produce nice early error messages if the configuration is not
processed correctly.
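For reference, a minimal runcard fragment enabling this check might look like the following (the dataset name is purely illustrative):

```yaml
# Process PLOTTING files as soon as each dataset is loaded
check_plotting: True

dataset_inputs:
  - {dataset: CMSZDIFF12}
```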
8 changes: 3 additions & 5 deletions doc/sphinx/source/vp/developer.rst
@@ -22,9 +22,7 @@ Some of the most important modules are

- `validphys.core`
Core data structures that represent objects such as PDFs and data
sets. Several of them map to `libnnpdf` objects. In that case they
have a `.load()` method that produces the corresponding `C++`
object.
sets.

- `validphys.loader`
Tools to obtain NNPDF resources locally or remotely. See :ref:`upload`
@@ -40,8 +38,8 @@ theory predictions.

- `validphys.gridvalues`, `validphys.bases`, `validphys.pdfgrids`
These contain tools to evaluate PDFs over grids of points.
`validphys.gridvalues` contains low level functionality that uses
`libnnpdf`, `validphys.pdfbases` contain several different bases
`validphys.gridvalues` contains low level functionality that might use
`lhapdf`; `validphys.pdfbases` contains several different bases
over PDF flavour space and functionality to manipulate them, and
`validphys.pdfgrids` contains high level providers suitable for
using for plotting and as an input to other computations.
3 changes: 0 additions & 3 deletions doc/sphinx/source/vp/index.rst
@@ -27,9 +27,6 @@ Introduction to ``validphys 2``
``validphys`` can be found in the
:ref:`Design <design>` section.

* Some parts of ``validphys`` use the ``libnnpdf`` library in C++, through SWIG
wrappers.

* The ideas behind the design of the code are explained in the
:ref:`Design <design>` section.

13 changes: 4 additions & 9 deletions doc/sphinx/source/vp/pydataobjs.rst
@@ -3,14 +3,9 @@
Python based data objects
=========================

Internal data formats such as PDF sets, CommonData, or :ref:`FKTables
<fktables>` files are currently accessed through the `libnnpdf` C++ code
(interfaced trough the SWIG wrappers). However there is a :ref:`project
<https://github.com/NNPDF/nnpdf/issues?q=label%3Adestroyingc%2B%2B+>` underway
to make these resources available in terms of standard Python containers
(particularly numpy arrays and pandas dataframes). The objectives include
simplifying the codebase, increasing the ease of use and enabling more advanced
computation and storage strategies.
Internal data formats such as CommonData or :ref:`FKTables
<fktables>` are represented internally by numpy arrays and pandas dataframes.
PDF sets are a bit more involved, since they are accessed through ``lhapdf``.
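A toy illustration of what being backed by standard Python containers means in practice (a stand-in, not the real ``validphys.coredata.CommonData`` API):

```python
import numpy as np

class CommonDataSketch:
    """Toy CommonData-like object: plain numpy storage means it pickles
    cleanly and works directly with standard scientific tooling."""

    def __init__(self, central_values):
        self.central_values = np.asarray(central_values, dtype=float)

    @property
    def ndata(self):
        return len(self.central_values)

    def get_cv(self):
        # mirrors the get_cv() call shown elsewhere in these docs
        return self.central_values

cd = CommonDataSketch([2917.0, 1074.0, 460.5])
print(cd.ndata)        # -> 3
print(cd.get_cv()[0])  # -> 2917.0
```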

Loading FKTables
----------------
@@ -271,4 +266,4 @@ the gluon and the d-quark, at three values of ``x`` at ``Q=91.2``.
pdf = API.pdf(pdf="NNPDF40_nnlo_as_01180")
l_pdf = pdf.load()
alpha_s = l_pdf.central_member.alphasQ(91.2)
results = l_pdf.grid_values([21,1], [0.1, 0.2, 0.3], [91.2])
6 changes: 3 additions & 3 deletions validphys2/examples/mc_gen_report.md
@@ -1,4 +1,7 @@
%CHORUSNB 100 replicas
Mean table
----------
{@art_data_mean_table@}
Data replica histograms
-----------------------
{@art_data_comparison@}
@@ -9,6 +12,3 @@ Residuals
---------
{@art_data_residuals@}
{@one_art_data_residuals@}
Mean table
----------
{@art_data_mean_table@}
9 changes: 3 additions & 6 deletions validphys2/src/validphys/app.py
@@ -131,9 +131,6 @@ def init(self):
if self.args["loglevel"] <= logging.DEBUG:
cout = True
if not cout:
import NNPDF

NNPDF.SetVerbosity(0)
lhapdf.setVerbosity(0)

@staticmethod
@@ -147,11 +144,11 @@ def upload_context(do_upload, output):
return contextlib.ExitStack()

def run(self):
if sys.version_info < (3, 6):
if sys.version_info < (3, 9):
Contributor: Did we officially drop 3.8 as well?

Member Author: Yes. We are no longer building conda packages for 3.8.

Contributor (@alecandido, Mar 8, 2023): Ok, I wonder if we should also drop it from the other projects...

Point is that I'm fully happy dropping 3.7, which goes EOL in June (now that I'd like to support 3.11, I don't feel guilty dropping an almost-EOL version, even if it's the one running on galileo), but I'm not as sure about 3.8.

In particular, I believe @cschwan for example is still using it (if I remember correctly). Most likely he doesn't care too much about installing vp, but it makes me think that some still very relevant versions of major distributions are still on it.

Member Author (@scarlehoff, Mar 8, 2023): Yes, some are. I still have 3.8 in a few places. And if you really want to use it you can just install n3fit/validphys manually and it will work. But since we are not supporting it anymore (or testing for it at all) in the official conda package we should warn people.

log.warning(
"validphys 2 is discontinued on Python<3.6 and will "
"validphys 2 is discontinued on Python<3.9 and will "
"not be longer updated. Please run\n"
"conda install python=3.6\n\n"
"conda install python=3.9\n\n"
"If you have any problems, please open an issue "
"on https://github.com/NNPDF/nnpdf/issues."
)
3 changes: 1 addition & 2 deletions validphys2/src/validphys/calcutils.py
@@ -59,8 +59,7 @@ def calc_chi2(sqrtcov, diffs):
"""
#Note la.cho_solve doesn't really improve things here
#NOTE: Do not enable check_finite. The upper triangular part is not
#guaranteed to make any sense. If this causes a problem, it is a bug in
#libnnpdf.
#guaranteed to make any sense.
vec = la.solve_triangular(sqrtcov, diffs, lower=True, check_finite=False)
#This sums up the result for the chi² for any input shape.
#Sum the squares over the first dimension and leave the others alone
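The computation in ``calc_chi2`` can be sketched as follows, using ``numpy.linalg.solve`` in place of the real code's ``scipy.linalg.solve_triangular`` and with purely illustrative numbers:

```python
import numpy as np

def calc_chi2_sketch(sqrtcov, diffs):
    # With covariance C = L L^T (L the lower Cholesky factor), the chi²
    # d^T C^{-1} d is obtained by solving L v = d and summing v².
    vec = np.linalg.solve(sqrtcov, diffs)
    return float(np.sum(vec * vec))

cov = np.array([[4.0, 2.0], [2.0, 3.0]])
L = np.linalg.cholesky(cov)
d = np.array([1.0, 1.0])
chi2 = calc_chi2_sketch(L, d)
# cross-check against the direct formula d^T C^{-1} d (≈ 0.375 here)
direct = float(d @ np.linalg.solve(cov, d))
```

Solving the triangular system instead of inverting the covariance is both cheaper and numerically better behaved, which is why the real code works with the Cholesky factor.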
@@ -22,10 +22,8 @@
@check_use_fitcommondata
def fits_dataset_cvs(fits_dataset):
"""Internal function for loading the level one data for all fits
for a single dataset. This function avoids using the c++ loading of
commondata which is very slow and also avoids the stringent metadata
for a single dataset. This function avoids the stringent metadata
checks of the newer python commondata parser.

"""
fits_cv = []
for ds in fits_dataset:
6 changes: 2 additions & 4 deletions validphys2/src/validphys/commondataparser.py
@@ -1,8 +1,6 @@
"""
This module implements parsers for commondata and systype files into useful
datastructures, contained in the :py:mod:`validphys.coredata` module, which are
not backed by C++ managed memory, and so they can be easily pickled and
interfaced with common Python libraries.
datastructures, contained in the :py:mod:`validphys.coredata` module.

The validphys commondata structure is an instance of :py:class:`validphys.coredata.CommonData`
"""
@@ -158,7 +156,7 @@ def get_plot_kinlabels(commondata):

def get_kinlabel_key(process_label):
"""
Since there is no 1:1 correspondence between latex keys and GetProc,
Since there is no 1:1 correspondence between latex keys and the old libNNPDF names
we match the longest key such that the proc label starts with it.
"""
l = process_label
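The longest-key matching described in the docstring can be sketched like this (the dictionary entries are illustrative, not the real ``KINLABEL_LATEX`` contents):

```python
# Illustrative mapping; the real one is validphys.commondataparser.KINLABEL_LATEX
KINLABEL_LATEX = {
    "DIS": ("$x$", "$Q^2 (GeV^2)$", "$y$"),
    "DIS_NC": ("$x$", "$Q^2 (GeV^2)$", "$y$"),
    "SIA": ("$z$", "$Q^2 (GeV^2)$", "$y$"),
}

def get_kinlabel_key_sketch(process_label):
    # Match the longest key such that the process label starts with it
    matches = [k for k in KINLABEL_LATEX if process_label.startswith(k)]
    if not matches:
        raise KeyError(f"no kinematic labels known for {process_label}")
    return max(matches, key=len)

print(get_kinlabel_key_sketch("DIS_NC_CHARM"))  # -> DIS_NC
```

Taking the longest matching prefix ensures that a more specific process label wins over a generic one when both are present.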
1 change: 0 additions & 1 deletion validphys2/src/validphys/config.py
@@ -433,7 +433,6 @@ def parse_dataset_input(self, dataset: Mapping):
raise ConfigError(f"'weight' must be a number, not '{weight}'")
if weight < 0:
raise ConfigError(f"'weight' must be greater than zero not '{weight}'")
# Value needs to be string to not break libnnpdf Experiment
custom_group = str(dataset.get("custom_group", "unset"))
kdiff = dataset.keys() - known_keys
for k in kdiff: