From a09a35fa9e9e7505b74f5dfbf189547faf40e033 Mon Sep 17 00:00:00 2001
From: Dennis Priskorn
Date: Fri, 23 Jun 2023 17:40:56 +0200
Subject: [PATCH 1/3] docs: Improve the README.md

---
 README.md | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6ec5943..abd0753 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,68 @@
 # Entityshape
 A python library to compare a wikidata item with an entityschema
 
-Based on https://github.com/Teester/entityshape by Mark Tully
+Based on https://github.com/Teester/entityshape by Mark Tully
+and https://github.com/dpriskorn/PyEntityshape by Dennis Priskorn
+
+# Features
+* compare a given wikidata item with an entityschema and dig into missing properties, too many statements, etc.
+* determine whether an item is valid according to a certain schema or not
+
+# Installation
+Get it from pypi
+
+`$ pip install pyentityshape`
+
+# Usage
+Example:
+```
+e = EntityShape(eid="E1", lang="en", qid="Q1")
+result = e.get_result()
+result.is_valid
+False|True
+result.required_properties_that_are_missing
+{"P1", "P2"}
+```
+
+## Validation
+The is_valid method on the Result object mimics all red warnings displayed by https://www.wikidata.org/wiki/User:Teester/EntityShape.js
+
+It currently checks these five conditions that all have to be false for the item to be valid:
+1. properties with too many statements found
+2. incorrect statements found
+3. some required properties are missing
+4. properties without enough correct statements found
+5. statements with properties that are not allowed found
+
+# Background
+This library is the glue between libraries like Wikibase
+Integrator and entityschemas.
+
+It makes it easy to batch check a whole subset of Wikidata
+items against a schema. Nice!
+
+# TODO
+The CompareShape and Shape classes should be rewritten using OOP
+and enums to avoid passing strings around because that is not
+nice to debug or maintain.
+
+What do we want to know from the CompareShape class?
+
+On the property level:
+* whether the property is mandatory and present/missing
+
+On the statement level
+* whether the cardinality of values is allowed (min/max)
+* whether the value(s) are correct/incorrect
+
+Cases:
+* mandatory property is missing
+* optional property is missing (this is not invalidating)
+* a property has an incorrect value
+* a property has a correct value
+* a property has too many values
+* a property has not enough values
+* ?
+
+# License
+GPLv3+
\ No newline at end of file

From 1ba1e20a355ef068bbd7b74f20675c5a3c44ab92 Mon Sep 17 00:00:00 2001
From: Dennis Priskorn
Date: Fri, 23 Jun 2023 17:43:51 +0200
Subject: [PATCH 2/3] docs: Improve the README.md

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index abd0753..4893c47 100644
--- a/README.md
+++ b/README.md
@@ -64,5 +64,9 @@ Cases:
 * a property has not enough values
 * ?
 
+# ShEx Tip
+When working on your Entity Schemas the constraints here are nice to know/remember
+https://shex.io/shex-primer/#tripleConstraints
+
 # License
 GPLv3+
\ No newline at end of file

From 024a6db25c26fc081f21a3a645095728ad231064 Mon Sep 17 00:00:00 2001
From: Dennis Priskorn
Date: Fri, 23 Jun 2023 17:50:04 +0200
Subject: [PATCH 3/3] docs: Improve the README.md and remove unused files

---
 .pylintrc                |  2 --
 README.md                | 16 +++++++++++++---
 sonar-project.properties | 11 -----------
 3 files changed, 13 insertions(+), 16 deletions(-)
 delete mode 100644 .pylintrc
 delete mode 100644 sonar-project.properties

diff --git a/.pylintrc b/.pylintrc
deleted file mode 100644
index c6d73d8..0000000
--- a/.pylintrc
+++ /dev/null
@@ -1,2 +0,0 @@
-[MASTER]
-init-hook="from pylint.config import find_pylintrc; import os, sys; sys.path.append(os.path.dirname(find_pylintrc()))"
diff --git a/README.md b/README.md
index 4893c47..a33701b 100644
--- a/README.md
+++ b/README.md
@@ -35,8 +35,8 @@ It currently checks these five conditions that all have to be false for the item
 5. statements with properties that are not allowed found
 
 # Background
-This library is the glue between libraries like Wikibase
-Integrator and entityschemas.
+This library is the glue between libraries like [Wikibase
+Integrator](https://github.com/LeMyst/WikibaseIntegrator/) and entityschemas.
 
 It makes it easy to batch check a whole subset of Wikidata
 items against a schema. Nice!
@@ -68,5 +68,15 @@ Cases:
 When working on your Entity Schemas the constraints here are nice to know/remember
 https://shex.io/shex-primer/#tripleConstraints
 
+# Thanks
+Big thanks to [Myst](https://github.com/LeMyst) and
+[Christian Clauss](https://github.com/cclauss) for
+advice and help with Ruff to make this better.
+
 # License
-GPLv3+
\ No newline at end of file
+GPLv3+
+
+# What I learned
+* Forking other people's undocumented spaghetti code is not much fun.
+* Pydantic is wonderful yet again; it makes working with OOP easy peasy :)
+* Ruff is crazy fast and very nice!
\ No newline at end of file
diff --git a/sonar-project.properties b/sonar-project.properties
deleted file mode 100644
index 1586f08..0000000
--- a/sonar-project.properties
+++ /dev/null
@@ -1,11 +0,0 @@
-sonar.organization=teester
-sonar.projectKey=Teester_entityshape
-
-# relative paths to source directories. More details and properties are described
-# in https://sonarcloud.io/documentation/project-administration/narrowing-the-focus/
-sonar.python.version=3
-sonar.sources=.
-sonar.dynamicAnalysis=reuseReports
-sonar.core.codeCoveragePlugin=cobertura
-sonar.python.coverage.reportPaths=*coverage*.xml
-sonar.python.xunit.reportPath=xunit-result*.xml
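
Note on the Validation section of the README: the rule that all five conditions must be false for an item to be valid can be sketched as below. This is a minimal illustration, not the library's actual Result class. Only `is_valid` and `required_properties_that_are_missing` appear in the README; the class name `ValidationSketch` and the other four field names are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ValidationSketch:
    """Hypothetical stand-in for the Result object described in the README."""

    # One set of property ids per condition. Apart from
    # required_properties_that_are_missing, these names are invented here.
    properties_with_too_many_statements: Set[str] = field(default_factory=set)
    incorrect_statements: Set[str] = field(default_factory=set)
    required_properties_that_are_missing: Set[str] = field(default_factory=set)
    properties_without_enough_correct_statements: Set[str] = field(default_factory=set)
    properties_that_are_not_allowed: Set[str] = field(default_factory=set)

    @property
    def is_valid(self) -> bool:
        # The item is valid only when every one of the five findings is empty.
        findings = (
            self.properties_with_too_many_statements,
            self.incorrect_statements,
            self.required_properties_that_are_missing,
            self.properties_without_enough_correct_statements,
            self.properties_that_are_not_allowed,
        )
        return not any(findings)


clean = ValidationSketch()
flawed = ValidationSketch(required_properties_that_are_missing={"P1", "P2"})
print(clean.is_valid, flawed.is_valid)  # prints: True False
```

Keeping one collection per condition means a caller can still inspect which check failed and for which properties, while `is_valid` stays a single boolean.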