Skip to content

cobacious/neus

Repository files navigation

Neus

Live at neus.news

An AI-powered news aggregation platform that cuts through media bias by presenting neutral, clustered news stories.

Neus Desktop

What is Neus?

Neus is a personal prototype exploring AI-powered news clustering. It aims to reduce polarisation and combat misinformation by rethinking how news is presented, summarised, and sourced.

Instead of sensationalized headlines and algorithmic rage-feeds, Neus groups related articles from different publications into single story cards, each featuring:

  • AI-generated neutral headlines and summaries - abstracting away ideological slants
  • Source icons linking to original articles - maintaining transparency and attribution
  • Semantic clustering - grouping coverage of the same story across the political spectrum

How It Works

  1. Ingestion: RSS feeds from multiple UK news sources are crawled regularly
  2. Embedding: Article content is converted to vector embeddings using OpenAI's API
  3. Clustering: Stories are grouped using cosine similarity and graph-based connected components
  4. Summarization: GPT generates neutral headlines and summaries for each cluster
  5. Ranking: Clusters are scored based on recency, source coverage, and source trust

Tech Stack

  • Frontend: React + TypeScript + Vite
  • Backend API: GraphQL (GraphQL Yoga)
  • Engine: Node.js pipeline for ingestion, embedding, and clustering
  • Database: PostgreSQL (hosted on Supabase)
  • AI: OpenAI embeddings (text-embedding-3-small) + GPT-4o-mini for summarization
  • Deployment: Vercel (frontend) + Railway (API)
  • Monorepo: pnpm workspaces

Architecture

News Sources (RSS) 
    ↓
Engine (clustering/summarization) 
    ↓
PostgreSQL Database 
    ↓
GraphQL API 
    ↓
React Web App

Key Features

Semantic Clustering

Uses OpenAI embeddings and cosine similarity (threshold: 0.85) to identify related articles. Implements graph-based clustering via depth-first search to find connected components.

Neutral Summarization

Prompt-engineered GPT queries generate bias-neutral headlines and summaries, with structured JSON outputs and fallback regex parsing for reliability.

Cost Controls

  • Token usage limits (TOKEN_LIMIT env var)
  • Content truncation for embeddings (8192 chars max)
  • Configurable model selection (gpt-4o-mini for summarization)
  • Manual pipeline execution (no runaway costs)

Smart Deduplication

Uses Jaccard similarity to detect and filter near-duplicate clusters, preventing redundant story cards.

Local Development

Prerequisites

  • Node.js 18+
  • pnpm 8+
  • PostgreSQL (or Supabase account)
  • OpenAI API key

Setup

  1. Clone and install dependencies
git clone https://github.com/cobacious/neus.git
cd neus
pnpm install
  1. Configure database
# Set up your DATABASE_URL in packages/db/.env
cp packages/db/.env.example packages/db/.env

# Run migrations
pnpm db:migrate
  1. Configure engine
cp apps/engine/.env.example apps/engine/.env
# Add your OPENAI_API_KEY and other settings
  1. Configure API
cp apps/api/.env.example apps/api/.env
# Add DATABASE_URL
  1. Configure web app
cp apps/web/.env.example apps/web/.env
# Set VITE_API_URL (default: http://localhost:4000/graphql)

Running Locally

# Terminal 1: Start API
pnpm --filter @neus/api dev

# Terminal 2: Start web app  
pnpm --filter @neus/web dev

# Terminal 3: Run pipeline (one-time)
pnpm pipeline

Visit http://localhost:5173 (or whatever port Vite assigns)

Deployment

  • Frontend: Deployed to Vercel from apps/web
  • API: Deployed to Railway from apps/api
  • Database: Hosted on Supabase (free tier)
  • Pipeline: Run manually as needed (cost-controlled)

The pipeline is executed locally or via CI when data refresh is needed, typically 1-2 times per week.

Project Structure

neus/
├── apps/
│   ├── api/          # GraphQL API server
│   ├── web/          # React frontend
│   ├── engine/       # Ingestion & clustering pipeline
│   └── admin/        # Admin UI for feed management
├── packages/
│   └── db/           # Prisma schema & database client
└── screenshots/      # App screenshots

Scoring Algorithm

Clusters are ranked using:

score = recency * 0.4 + coverage * 0.3 + trust * 0.3

Where:

  • Recency: How recent the articles were published
  • Coverage: Number of distinct sources covering the story
  • Trust: Average trust score of sources (manually curated)

No engagement metrics are considered - ranking is purely editorial.

Screenshots

Desktop View

Neus Desktop Interface

Mobile View

Neus Mobile Interface

Current Status

Neus is a working prototype demonstrating AI-powered news clustering and neutral summarization. Data is refreshed manually via the pipeline. Expect limited sources, occasional bugs, and rough edges.

This is a personal project exploring:

  • Practical applications of embeddings and semantic search
  • LLM prompt engineering for bias reduction
  • Production-grade AI pipeline architecture
  • Cost-controlled AI feature development

Future Possibilities

  • Article heatmaps highlighting facts vs. opinions
  • Inline fact-checking and cross-source validation
  • "What's Missing" - perspectives omitted from coverage
  • Browser extension for on-demand article clustering
  • Expanded source coverage across political spectrum

License

Copyright © 2024. All rights reserved.

This code is available for viewing and reference purposes. For any other use, please contact the repository owner.


Note: Neus is not intended as a commercial product. It's an exploration of using AI to combat information polarization and improve news consumption.

About

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors