Meltano Extractor for the WordPress.org API, using the Singer SDK

Automattic/tap-wordpress-org

tap-wordpress-org

Singer tap for WordPress.org, built with the Meltano SDK for Singer Taps.

Installation

Install from PyPI:

pipx install tap-wordpress-org

Install from source:

git clone https://github.com/Automattic/tap-wordpress-org.git
cd tap-wordpress-org
pip install .

Configuration

Accepted Config Options

A full list of supported settings and capabilities for this tap is available by running:

tap-wordpress-org --about

Configure using environment variables

When --config=ENV is provided, this Singer tap automatically imports environment variables from the .env file in the working directory. Config keys are namespaced using the format TAP_WORDPRESS_ORG_{CONFIG_KEY}. For example:

export TAP_WORDPRESS_ORG_API_URL=https://api.wordpress.org
export TAP_WORDPRESS_ORG_USER_AGENT=my-app/1.0
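
The naming convention above can be illustrated with a minimal Python sketch. It is not the Singer SDK's own loader; the helper name config_from_env is hypothetical, and the sketch only shows how a prefixed variable name is assumed to map to a config key:

```python
import os

PREFIX = "TAP_WORDPRESS_ORG_"


def config_from_env(environ=None, prefix=PREFIX):
    """Collect prefixed environment variables into a config dict (illustrative only)."""
    environ = os.environ if environ is None else environ
    config = {}
    for name, value in environ.items():
        if name.startswith(prefix):
            # TAP_WORDPRESS_ORG_API_URL -> api_url
            config[name[len(prefix):].lower()] = value
    return config


# A stand-in for os.environ so the example does not depend on the real environment.
example_env = {
    "TAP_WORDPRESS_ORG_API_URL": "https://api.wordpress.org",
    "TAP_WORDPRESS_ORG_USER_AGENT": "my-app/1.0",
    "HOME": "/home/user",
}
print(config_from_env(example_env))
```

Variables without the prefix (HOME in this example) are ignored.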

Configuration options

| Setting          | Required | Default                   | Description                                                          |
|------------------|----------|---------------------------|----------------------------------------------------------------------|
| api_url          | False    | https://api.wordpress.org | The URL for the WordPress.org API                                    |
| user_agent       | False    | tap-wordpress-org/0.1.0   | User agent for API requests                                          |
| events_location  | False    | None                      | Location for events search (e.g., 'Seattle, WA')                     |
| events_ip        | False    | None                      | IP address for events location detection                             |
| stream_selection | False    | All streams               | List of stream names to sync (e.g., ["plugins", "wordpress_stats"])  |
| start_date       | False    | None                      | Start date for incremental replication (plugins/themes only)         |
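
As a rough illustration of how these settings fit together, the following Python sketch builds a config document and rejects keys that are not in the table. The helper build_config is hypothetical and not part of the tap, and the ISO 8601 format used for start_date is an assumption:

```python
import json

# Setting names taken from the table above.
KNOWN_SETTINGS = {
    "api_url",
    "user_agent",
    "events_location",
    "events_ip",
    "stream_selection",
    "start_date",
}


def build_config(**settings):
    """Return a config dict, rejecting setting names the table does not list."""
    unknown = set(settings) - KNOWN_SETTINGS
    if unknown:
        raise ValueError(f"Unknown settings: {sorted(unknown)}")
    # Omit unset values so the tap falls back to its own defaults.
    return {key: value for key, value in settings.items() if value is not None}


config = build_config(
    user_agent="my-app/1.0",
    events_location="Seattle, WA",
    stream_selection=["plugins", "wordpress_stats"],
    start_date="2024-01-01T00:00:00Z",  # assumed format
)
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON could be saved to a file and passed to the tap with --config.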

Capabilities

  • catalog
  • state
  • discover
  • about
  • stream-maps
  • schema-flattening

Supported Python Versions

  • 3.9
  • 3.10
  • 3.11
  • 3.12

Streams

| Stream          | Primary Key | Replication Method | Notes                                   |
|-----------------|-------------|--------------------|-----------------------------------------|
| plugins         | slug        | INCREMENTAL        | WordPress plugin repository data        |
| themes          | slug        | INCREMENTAL        | WordPress theme repository data         |
| events          | id          | FULL_TABLE         | WordPress events (WordCamps and meetups)|
| patterns        | id          | FULL_TABLE         | Block patterns                          |
| wordpress_stats | version     | FULL_TABLE         | WordPress version usage statistics      |
| php_stats       | version     | FULL_TABLE         | PHP version usage statistics            |
| mysql_stats     | version     | FULL_TABLE         | MySQL version usage statistics          |
| locale_stats    | locale      | FULL_TABLE         | Language/locale usage statistics        |

Features

Incremental Replication

The plugins and themes streams support incremental replication using the last_updated field. Set a start_date in your configuration to sync only records updated after that date.
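
The effect of start_date can be sketched as a simple filter on last_updated. This is an illustration of the idea rather than the tap's actual code: the records are made up, and ISO 8601 timestamps are assumed for both start_date and last_updated:

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    # Treat timestamps without an offset as UTC so comparisons are well defined.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def updated_after(records, start_date):
    """Keep only records whose last_updated is strictly after start_date."""
    cutoff = parse_ts(start_date)
    return [r for r in records if parse_ts(r["last_updated"]) > cutoff]


# Made-up records for illustration.
records = [
    {"slug": "example-a", "last_updated": "2023-12-31T23:59:59Z"},
    {"slug": "example-b", "last_updated": "2024-02-15T08:30:00Z"},
]
recent = updated_after(records, "2024-01-01T00:00:00Z")
print([r["slug"] for r in recent])
```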

Stream Selection

You can select specific streams to sync by setting stream_selection in your configuration:

{
  "stream_selection": ["plugins", "wordpress_stats", "php_stats"]
}
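
A minimal sketch of the selection logic, assuming that an empty or missing stream_selection means all streams (the documented default). The function select_streams is hypothetical, and raising an error on unknown names is a choice made for this sketch, not confirmed behavior of the tap:

```python
# Stream names as listed in the Streams section.
ALL_STREAMS = [
    "plugins",
    "themes",
    "events",
    "patterns",
    "wordpress_stats",
    "php_stats",
    "mysql_stats",
    "locale_stats",
]


def select_streams(config):
    """Return the stream names to sync for a given config dict."""
    selection = config.get("stream_selection")
    if not selection:
        # No selection configured: sync everything.
        return list(ALL_STREAMS)
    unknown = [name for name in selection if name not in ALL_STREAMS]
    if unknown:
        raise ValueError(f"Unknown stream names: {unknown}")
    # Preserve the canonical stream order rather than the order in the config.
    return [name for name in ALL_STREAMS if name in selection]


selected = select_streams(
    {"stream_selection": ["plugins", "wordpress_stats", "php_stats"]}
)
print(selected)
```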

Custom Transformations

The tap includes built-in data transformations:

  • HTML entity decoding (e.g., &ndash; is decoded to –)
  • Boolean field normalization (converts false to null for optional fields)
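
Both transformations can be approximated with the Python standard library. This is a sketch of the general technique, not the tap's implementation; the field names in the sample record and the optional_fields argument are hypothetical:

```python
import html


def decode_entities(value):
    """Decode HTML entities in strings; leave other values untouched."""
    return html.unescape(value) if isinstance(value, str) else value


def normalize_record(record, optional_fields):
    """Apply entity decoding and map False to None for the given optional fields."""
    cleaned = {}
    for key, value in record.items():
        if key in optional_fields and value is False:
            # An optional field reported as False is treated as "no value".
            cleaned[key] = None
        else:
            cleaned[key] = decode_entities(value)
    return cleaned


# Hypothetical record; real field names may differ.
record = {"name": "Forms &#8211; Contact", "donate_link": False, "active": False}
cleaned = normalize_record(record, optional_fields={"donate_link"})
print(cleaned)
```

Fields that are not listed as optional keep a genuine False value, which is why active is unchanged in this example.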

Usage

You can run tap-wordpress-org by itself or in a pipeline using Meltano.

Executing the Tap Directly

tap-wordpress-org --version
tap-wordpress-org --help
tap-wordpress-org --config CONFIG --discover > ./catalog.json
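
Once catalog.json exists, it can be inspected with a few lines of Python. The sketch below relies only on the standard Singer catalog layout (a top-level streams list whose entries carry a tap_stream_id); the sample catalog is abbreviated and made up, and the helper stream_ids is hypothetical:

```python
import json


def stream_ids(catalog):
    """Return the tap_stream_id of every stream in a Singer catalog dict."""
    return [entry["tap_stream_id"] for entry in catalog.get("streams", [])]


# Abbreviated stand-in for the contents of catalog.json.
sample_catalog = json.loads(
    """
    {
      "streams": [
        {"tap_stream_id": "plugins", "schema": {}},
        {"tap_stream_id": "php_stats", "schema": {}}
      ]
    }
    """
)
print(stream_ids(sample_catalog))
```

To inspect a real discovery result, load the file with json.load and pass the resulting dict to stream_ids.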

Developer Resources

Follow these instructions to contribute to this project.

Initialize your Development Environment

pipx install poetry
poetry install

Create and Run Tests

Create tests within the tests subfolder and then run:

poetry run pytest

You can also test the tap-wordpress-org CLI directly using poetry run:

poetry run tap-wordpress-org --help

Testing with Meltano

Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.

Next, install Meltano (if you haven't already) and any needed plugins:

# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-wordpress-org
meltano install

Now you can test and orchestrate using Meltano:

# Test invocation:
meltano invoke tap-wordpress-org --version
# OR run a test `elt` pipeline:
meltano elt tap-wordpress-org target-jsonl

SDK Dev Guide

See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.

Contributing

We welcome contributions! Please see our Contributing Guide for details.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
