A privacy‑focused web, social media, and archive search server exposing tools via the Model Context Protocol (MCP) for controlled access to external search capabilities.
- Features
- Installation & Quick Start
- Command‑Line Options
- Integration with LM Studio
- MCP API Overview
- Available Tools
- Security Features
- Parallel search across multiple privacy‑focused web engines.
- Social media lookup for public content on major platforms.
- Archive retrieval from Wayback Machine, archive.today, Google Cache and others.
- Dynamic listing of supported engines and archive services.
- Result caching with LRU eviction to speed up repeated queries.
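The result cache mentioned above can be pictured with a minimal sketch. This is not the server's actual cache: the class name `LRUCache`, its methods and the `max_size` parameter are assumptions made for illustration, built on Python's `collections.OrderedDict`.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Tiny least-recently-used cache (illustrative, not the server's code)."""

    def __init__(self, max_size: int = 128) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        """Store a value and evict the oldest entry if the cache is full."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_size:
            self._items.popitem(last=False)  # drop the least recently used

    def clear(self) -> None:
        """Forget everything, as a cache-clearing tool would."""
        self._items.clear()

    def __len__(self) -> int:
        return len(self._items)
```

A key such as `(engine, query, max_results)` would let repeated identical searches hit the cache, although the key layout used by the real server is not documented here.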
# Clone the repository (if applicable)
git clone https://github.com/undici77/MCPWebSearch.git
cd MCPWebSearch
# Run the startup script (adjust name if different)
./run.sh -d /path/to/working/directory

The startup script then:
1️⃣ Creates and activates a Python virtual environment (.venv).
2️⃣ Installs all required dependencies from requirements.txt.
3️⃣ Launches the MCP Search Server (main.py), which listens on stdin/stdout for JSON‑RPC messages.
📌 Ensure the startup script is executable:
chmod +x run.sh
| Option | Description |
|---|---|
| -d, --directory | Path to the working directory (default: current process dir). |
The server itself does not require additional CLI flags; all configuration is performed via JSON‑RPC.
Add an entry to your mcp.json so LM Studio can start the server automatically:
{
  "mcpServers": {
    "web-search": {
      "command": "/absolute/path/to/run.sh",
      "args": [
        "-d",
        "/absolute/path/to/working/directory"
      ],
      "env": { "WORKING_DIR": "." }
    }
  }
}

📌 Make the script executable (chmod +x /absolute/path/to/run.sh) and run ./run.sh once to install the virtual environment before launching LM Studio.
All communication follows JSON‑RPC 2.0 over stdin/stdout.
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {}
}

Response: protocol version (2024-11-05), server capabilities (tool enumeration) and basic server info (name, version).
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list",
  "params": {}
}

Response: an array of tool definitions (name, description, input schema).
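To make the shape of a tool definition concrete, here is a hedged sketch of one entry and a required-field check a client could run before calling a tool. The `SAMPLE_TOOL` dictionary is hypothetical: it assumes the input schema is a JSON Schema object with `properties` and `required` keys, and the helper `missing_required` is not part of the server.

```python
from typing import Any, Dict, List

# Hypothetical tool definition, shaped like one entry of a tools/list result.
SAMPLE_TOOL: Dict[str, Any] = {
    "name": "web_search",
    "description": "Search the web using multiple privacy-focused engines.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "engine": {"type": "string"},
            "max_results": {"type": "integer"},
        },
        "required": ["query"],
    },
}


def missing_required(tool: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return the required argument names that are absent from `arguments`."""
    schema = tool.get("inputSchema", {})
    return [name for name in schema.get("required", []) if name not in arguments]
```

An empty list means every required field is present; type and range checks would still be up to the server.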
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "<tool_name>",
    "arguments": { … }
  }
}

Note: The tool identifier key is name, not tool.
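The request shapes above can be produced and exchanged with a few small helpers. This is a sketch under one assumption: each message travels as a single line of JSON terminated by a newline, which is the usual framing for MCP over stdio but is not stated in this README. The function names are illustrative.

```python
import io
import json
from typing import Any, Dict, TextIO


def build_tool_call(request_id: int, name: str,
                    arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a tools/call request; note the key is "name", not "tool"."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def write_message(stream: TextIO, message: Dict[str, Any]) -> None:
    """Write one message as a single line of JSON (assumed framing)."""
    stream.write(json.dumps(message, ensure_ascii=False) + "\n")
    stream.flush()


def read_message(stream: TextIO) -> Dict[str, Any]:
    """Read one newline-terminated JSON message from the stream."""
    line = stream.readline()
    if not line:
        raise EOFError("stream closed before a message arrived")
    return json.loads(line)
```

With a real server, `stream` would be the `stdin` and `stdout` pipes of a process started through `subprocess.Popen(..., stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)`; an `io.StringIO` buffer is enough to try the helpers offline.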
Search the web using multiple privacy‑focused engines in parallel.
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | ✅ | Search query (max 500 characters). |
| engine | string | ❌ (default all) | Engine to use (duckduckgo, brave, startpage, ecosia, mojeek, yandex or all). |
| max_results | integer | ❌ (default 20) | Max results per engine (1‑50). |
Example
{
  "jsonrpc": "2.0",
  "id": 10,
  "method": "tools/call",
  "params": {
    "name": "web_search",
    "arguments": {
      "query": "privacy focused search engines",
      "engine": "duckduckgo",
      "max_results": 15
    }
  }
}

The server returns a formatted text block containing titles, URLs and snippets from each selected engine.
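Querying several engines in parallel while capping concurrency might look roughly like the sketch below. It is not the server's implementation: `search_all`, `fake_engine` and the value given to `MAX_CONCURRENT_SEARCHES` are assumptions, and the engine callables stand in for real HTTP requests.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

MAX_CONCURRENT_SEARCHES = 3  # assumed value; the real limit lives in the server

SearchFn = Callable[[str], Awaitable[List[str]]]


async def search_all(query: str, engines: Dict[str, SearchFn],
                     limit: int = MAX_CONCURRENT_SEARCHES) -> Dict[str, List[str]]:
    """Run every engine concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(search: SearchFn) -> List[str]:
        async with semaphore:  # wait for a free slot
            try:
                return await search(query)
            except Exception:
                return []  # one failing engine should not sink the others

    names = list(engines)
    results = await asyncio.gather(*(run_one(engines[n]) for n in names))
    return dict(zip(names, results))


async def fake_engine(query: str) -> List[str]:
    """Stand-in for a real engine request (hypothetical)."""
    await asyncio.sleep(0.01)
    return [f"result for {query}"]
```

Calling `asyncio.run(search_all("privacy", {"demo": fake_engine}))` exercises the fan-out without touching the network.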
Search public content on major social‑media platforms.
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | ✅ | Search query (max 500 characters). |
| platform | string | ❌ (default all) | Platform to search (twitter, reddit, youtube, github, stackoverflow, medium, pinterest, tiktok, instagram, facebook, linkedin or all). |
Example
{
  "jsonrpc": "2.0",
  "id": 11,
  "method": "tools/call",
  "params": {
    "name": "social_search",
    "arguments": {
      "query": "AI ethics research",
      "platform": "reddit"
    }
  }
}

The response contains direct URLs that can be opened in a browser.
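One plausible way to produce such direct URLs is to fill a per-platform search template with the URL-encoded query. The sketch below is an assumption about the approach, not the server's code: the `SEARCH_URL_TEMPLATES` mapping covers only three platforms for illustration, and the templates the server really uses may differ.

```python
from typing import Dict
from urllib.parse import quote_plus

# Illustrative templates only; the server's own mapping is not documented here.
SEARCH_URL_TEMPLATES: Dict[str, str] = {
    "reddit": "https://www.reddit.com/search/?q={query}",
    "github": "https://github.com/search?q={query}",
    "youtube": "https://www.youtube.com/results?search_query={query}",
}


def build_search_urls(query: str, platform: str = "all") -> Dict[str, str]:
    """Return {platform: search URL} for one platform or for all of them."""
    if platform == "all":
        selected = SEARCH_URL_TEMPLATES
    elif platform in SEARCH_URL_TEMPLATES:
        selected = {platform: SEARCH_URL_TEMPLATES[platform]}
    else:
        raise ValueError(f"unsupported platform: {platform!r}")
    encoded = quote_plus(query)  # spaces become "+", reserved characters are escaped
    return {name: tpl.format(query=encoded) for name, tpl in selected.items()}
```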
Find archived versions of a URL across multiple web‑archive services.
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | ✅ | Complete URL (must include http:// or https://). |
| service | string | ❌ (default all) | Archive service (wayback, archive_today, google_cache, bing_cache, yandex_cache, cachedview, ghostarchive or all). |
| check_availability | boolean | ❌ (default false) | When true, the server queries the Wayback Machine API for snapshot statistics. |
Example
{
  "jsonrpc": "2.0",
  "id": 12,
  "method": "tools/call",
  "params": {
    "name": "archives_search",
    "arguments": {
      "url": "https://example.com",
      "service": "wayback",
      "check_availability": true
    }
  }
}

The response lists archive URLs and, if requested, snapshot counts and timestamps.
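For readers curious about what an availability check can involve, the sketch below builds a request URL for the public Wayback Machine availability endpoint and extracts the closest snapshot from a response. Whether the server uses this endpoint is an assumption; it reports only the closest snapshot, so snapshot counts would need a different API. The helper names are hypothetical and `SAMPLE_PAYLOAD` is made-up data that only mirrors the response shape.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlencode

WAYBACK_AVAILABLE_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(target_url: str) -> str:
    """Build the availability query for one target URL."""
    return f"{WAYBACK_AVAILABLE_ENDPOINT}?{urlencode({'url': target_url})}"


def closest_snapshot(payload: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Return the closest snapshot's URL and timestamp, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return {"url": closest["url"], "timestamp": closest["timestamp"]}


# Made-up sample that mirrors the response shape; not real archive data.
SAMPLE_PAYLOAD: Dict[str, Any] = {
    "url": "https://example.com",
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20240101000000",
            "url": "http://web.archive.org/web/20240101000000/https://example.com/",
        }
    },
}
```

Timestamps use the `YYYYMMDDhhmmss` layout, so `datetime.strptime(ts, "%Y%m%d%H%M%S")` converts them when a real date is needed.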
List all available privacy‑focused search engines.
| Name | Type | Required | Description |
|---|---|---|---|
| (No parameters) | — | — | — |
Example
{
  "jsonrpc": "2.0",
  "id": 13,
  "method": "tools/call",
  "params": {
    "name": "list_engines",
    "arguments": {}
  }
}

The server returns a markdown‑formatted overview of each engine and usage notes.
List all supported web‑archive services.
| Name | Type | Required | Description |
|---|---|---|---|
| (No parameters) | — | — | — |
Example
{
  "jsonrpc": "2.0",
  "id": 14,
  "method": "tools/call",
  "params": {
    "name": "list_archives_services",
    "arguments": {}
  }
}

The response includes a description of each service, its ID and key features.
Clear the internal search‑result cache.
| Name | Type | Required | Description |
|---|---|---|---|
| (No parameters) | — | — | — |
Example
{
  "jsonrpc": "2.0",
  "id": 15,
  "method": "tools/call",
  "params": {
    "name": "clear_cache",
    "arguments": {}
  }
}

The server replies with a confirmation message.
- Query sanitisation – strips control characters, removes HTML tags and enforces MAX_QUERY_LENGTH (500).
- Strict URL validation – accepts only http:// or https:// schemes with a valid domain.
- Blocked patterns – regexes prevent <script> injection, javascript: URIs and event‑handler attributes.
- Input schema enforcement – each tool validates required fields via the JSON‑RPC inputSchema.
- Rate limiting – an asyncio semaphore caps concurrent external requests (MAX_CONCURRENT_SEARCHES).
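The sanitisation and URL checks listed above could be approximated as follows. This is a rough sketch, not the server's code: the regular expressions are deliberately crude, and both rejecting blocked input (rather than stripping it) and truncating long queries (rather than rejecting them) are assumptions. Only `MAX_QUERY_LENGTH` and its value of 500 come from this README.

```python
import re
from urllib.parse import urlparse

MAX_QUERY_LENGTH = 500

_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")
_HTML_TAGS = re.compile(r"<[^>]*>")
# Assumed blocklist: script tags, javascript: URIs, inline event handlers.
_BLOCKED = [
    re.compile(r"<\s*script", re.IGNORECASE),
    re.compile(r"javascript\s*:", re.IGNORECASE),
    re.compile(r"\bon\w+\s*=", re.IGNORECASE),
]


def sanitize_query(raw: str) -> str:
    """Reject blocked patterns, then strip control characters and HTML tags."""
    for pattern in _BLOCKED:
        if pattern.search(raw):
            raise ValueError("query matches a blocked pattern")
    cleaned = _CONTROL_CHARS.sub(" ", raw)
    cleaned = _HTML_TAGS.sub("", cleaned)
    cleaned = " ".join(cleaned.split())  # collapse runs of whitespace
    if not cleaned:
        raise ValueError("query is empty after sanitisation")
    return cleaned[:MAX_QUERY_LENGTH]  # assumption: truncate, not reject


def is_valid_url(url: str) -> bool:
    """Accept only http(s) URLs whose host looks like a dotted domain."""
    try:
        parsed = urlparse(url)
    except ValueError:
        return False
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname or ""
    return "." in host and not host.startswith(".") and not host.endswith(".")
```

Under this reading of "valid domain", `is_valid_url("http://localhost")` returns `False` because the host has no dot; a stricter or looser rule may apply in the real server.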
© 2025 Undici77 – All rights reserved.