Tna#7

Merged
omgoo merged 22 commits into feature/Query from tna on Apr 28, 2021
omgoo commented Apr 28, 2021

No description provided.

omerijaz27 and others added 22 commits April 21, 2021 17:16
update to latest wombat (3.1.4)
* Pass collection name to ACL checker to load ACL lists
for automatic collections

* Typo: file suffix must be `.aclj`
…ixes webrecorder#628 (webrecorder#629)

* Add unit test to verify whether ACL exact-match rules in a single-line
*.aclj file are found

* Fix AccessChecker to match exact rules in a single-line rule file
…brecorder#623)

- add unit test to verify unknown output formats are handled
  if output fields param is in request
* FrontendApp: forward HTTP status of CDX backend to allow clients
to handle errors more easily

* WarcServer: keep the HTTP status lines short
- append the exception message only if the status isn't a string
  (WbException and inherited classes already have nice status string)
- avoid overlong status lines, e.g.
   HTTP/1.1 404 Not Found No Captures found for: https://very-long.url/...
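
The status-line rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual WarcServer code: `build_status_line` is a hypothetical helper, and the way the message is joined to a numeric status is an assumption.

```python
def build_status_line(status, exc):
    """Return the status string for an error response.

    Hypothetical helper for illustration only, not pywb's implementation.
    """
    if isinstance(status, str):
        # A string status (e.g. from WbException and subclasses) is already
        # complete, such as "404 Not Found", so nothing is appended.
        return status
    # Assumption: a non-string status is a bare numeric code, so the
    # exception message is appended to make the line informative.
    return "{} {}".format(status, exc)
```

With this rule a long "No Captures found for: ..." message no longer ends up in the status line when the exception already carries a readable status string.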
…r#626)

* FrontendApp: forward HTTP status of CDX backend to allow clients
to handle errors more easily

* Handle CDXExceptions properly, returning the exception status code
- make sure that CDXException is raised early so that it can be handled
  in the IndexHandler
The 'dedup_index_url' configuration option should be inside the
'recorder' section.
…er#631)

- do not apply any filters (param filter, from, to, closest)
  if counting pages (param showNumPages=true)
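
A rough sketch of the page-counting behaviour described above, assuming the query parameters arrive as a plain dict. `effective_query` and `FILTER_PARAMS` are hypothetical names used only for illustration; the real index code is organised differently.

```python
# Parameters that are ignored while counting pages (taken from the commit message).
FILTER_PARAMS = ("filter", "from", "to", "closest")


def effective_query(params):
    """Return a copy of params, dropping filter params when counting pages."""
    counting = str(params.get("showNumPages", "")).lower() == "true"
    if not counting:
        # Normal query: keep every parameter.
        return dict(params)
    # Page count requested: skip all filtering parameters.
    return {k: v for k, v in params.items() if k not in FILTER_PARAMS}
```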
The field is unfortunately misnamed compressedendoffset in XML but OWB
actually uses this for the compressed length 'S' CDX field.

Without this field, when WARC files are accessed over HTTP, pywb makes
open-ended byte range requests, which results in far more data being read
from disk than necessary.
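
To illustrate why the compressed length matters, here is a small sketch of how a Range header value could be built with and without it. `range_header` is a hypothetical helper, not a pywb function; only the standard HTTP byte-range syntax is assumed.

```python
def range_header(offset, length=None):
    """Build an HTTP Range header value for one WARC record.

    Hypothetical helper for illustration only.
    """
    if length is None:
        # Length unknown: open-ended range, the server may stream the rest of the file.
        return "bytes={}-".format(offset)
    # Length known: bounded range. HTTP byte ranges are inclusive at both ends.
    return "bytes={}-{}".format(offset, offset + length - 1)
```

A record at offset 100 with compressed length 50 yields `bytes=100-149`, while the same record without a known length yields `bytes=100-`.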
…order#634)

This advertises the Python support that is already in place.
* post append improvements:
- parse json primitives for post query
- for text/plain, attempt to parse as json, then as binary
- standardize post append indexing
- include '__wb_method' in urlkey
- add 'requestBody' and 'method' to cdxj
- support unique dupe params for json-to-query conversion
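
The JSON-to-query conversion with unique names for duplicate params might look roughly like the sketch below. The `.N_` suffix used for repeated names and the helper name `json_to_query` are assumptions made for illustration, not a description of the actual pywb implementation.

```python
import json
from urllib.parse import urlencode


def json_to_query(text):
    """Flatten a JSON document into a query string with unique param names."""
    pairs = []
    seen = {}

    def unique(name):
        # First occurrence keeps its name; repeats get a numbered suffix
        # (assumed scheme: "name.2_", "name.3_", ...).
        seen[name] = seen.get(name, 0) + 1
        if seen[name] == 1:
            return name
        return "{}.{}_".format(name, seen[name])

    def walk(value, name=""):
        if isinstance(value, dict):
            for key, item in value.items():
                walk(item, key)
        elif isinstance(value, list):
            for item in value:
                walk(item, name)
        else:
            # JSON primitives: render null and booleans in their JSON spelling.
            if value is None:
                value = "null"
            elif isinstance(value, bool):
                value = "true" if value else "false"
            pairs.append((unique(name), str(value)))

    walk(json.loads(text))
    return urlencode(pairs)
```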

* test fixes:
- update tests for test_inputreq,
- update post-test.cdxj and post-test.cdx

* ci: fixes
- tox: run full test suite!
- disable appveyor

* inputrequest buffering fix:
- never truncate reading POST request, must read entire POST data to avoid hung request in live mode
- truncate final query string to 4096
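
The buffering fix above can be sketched as: always consume the whole POST body, and only shorten the derived query string afterwards. The helper `read_post_query`, the chunk size, and the UTF-8 decoding are assumptions for illustration; the 4096 limit comes from the commit message.

```python
import io

MAX_QUERY_LENGTH = 4096  # limit stated in the commit message
CHUNK_SIZE = 16384       # assumed read size, for illustration only


def read_post_query(stream, content_length):
    """Read the full POST body, then return a query string capped at 4096 chars."""
    chunks = []
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(min(remaining, CHUNK_SIZE))
        if not chunk:
            break  # stream ended early
        chunks.append(chunk)
        remaining -= len(chunk)
    # The entire body has been consumed at this point, so the request cannot
    # hang on unread data; truncation applies to the final string only.
    query = b"".join(chunks).decode("utf-8", errors="replace")
    return query[:MAX_QUERY_LENGTH]
```

For example, a 10000-byte body is read completely from the stream, but the returned query string is 4096 characters long.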
APP-93 Added nobanner support.
APP-92 Fixed the access_checker.py, warcserver.py
omgoo requested a review from omerijaz27 April 28, 2021 13:45
omgoo merged commit 93c3d16 into feature/Query Apr 28, 2021

6 participants