Proteus

Proteus is an information-theoretic programming language and inference engine built around the concept of infons -- structured information units that can represent numbers, strings, lists, and complex nested data. Proteus aims to bridge programming and natural language by combining a formal data model with natural language understanding capabilities.

Why Proteus?

Most programming languages treat data and natural language as fundamentally separate concerns. Proteus takes a different approach: its core data structure -- the infon -- ties formal types directly to vocabulary, so a data schema is also a semantic definition. A field named color isn't just a label; it carries the linguistic meaning and type constraints of the English word "color."
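The idea of a field name carrying type constraints can be sketched outside Proteus. The Python below is a hypothetical illustration, not Proteus code or its API: the `Word` class, the `VOCAB` table, and the `check` function are all invented here to show a record being validated against a small vocabulary.

```python
# Hypothetical illustration only; none of these names come from Proteus.
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A vocabulary entry: a word, a gloss, and the values it may take."""
    name: str
    gloss: str
    allowed: frozenset


# Assumed toy vocabulary; a real one would be far larger.
VOCAB = {
    "color": Word("color", "the visual hue of a thing",
                  frozenset({"red", "green", "blue"})),
    "wheels": Word("wheels", "number of wheels on a vehicle",
                   frozenset(range(1, 5))),
}


def check(record):
    """Return a list of mismatches between a record and the vocabulary."""
    problems = []
    for field, value in record.items():
        word = VOCAB.get(field)
        if word is None:
            problems.append(f"{field}: not in vocabulary")
        elif value not in word.allowed:
            problems.append(f"{field}: {value!r} is not a valid {word.name}")
    return problems
```

Here `check({"color": 2})` reports a mismatch even though the record is structurally well formed, which is the kind of error a vocabulary-grounded schema is meant to surface.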

This matters when you need:

  • Executable specifications -- Business rules, compliance policies, or domain models written in near-English that a machine can also reason about, so there is no separate "requirements doc" to drift from the implementation.
  • Knowledge bases with built-in inference -- Define facts, constraints, and relationships as infons. The agenda-driven engine normalizes and merges them automatically, resolving ambiguities without hand-coded logic.
  • Semantic data modeling -- Schema definitions where types are grounded in vocabulary, not arbitrary strings. Type-checking catches semantic mismatches, not just structural ones.
  • Natural language interfaces -- Parse English input into typed infon structures using the built-in translator, then run inference over the result.

Proteus is not a general-purpose application language. It is a knowledge representation and inference engine designed for problems where the gap between human meaning and machine processing is the core challenge.

Use Cases

Knowledge Engineering -- Build ontologies where facts are queryable and composable. The Resources/bike.pr example defines a bicycle as a composition of typed parts (frame, seat, chain), each inheriting properties from a base thing type. The engine can infer relationships and validate consistency across the model.

Domain-Specific Language Definition -- The Resources/toyLang.pr example defines a grammar (identifiers, comparison operators, loops, statements) entirely in Proteus syntax. The inference engine handles parsing and type-checking for the defined language, making Proteus a meta-tool for building other languages.

Natural Language Processing Research -- The English translator includes thousands of inflection rules (plurals, verb forms, possessives, irregular cases). Researchers can test theories about how formal semantics and natural language interact using a system that treats both as first-class concerns.
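To give a feel for what rule-based inflection looks like, here is a minimal Python sketch of pluralization with an irregular-form table and ordered suffix rules. It is an assumption-laden toy: the rule tables and the `pluralize` function are invented for illustration and are not taken from xlators/xlator_en.dog, which covers far more cases.

```python
import re

# Hypothetical rule tables for illustration only.
IRREGULAR = {"child": "children", "mouse": "mice", "person": "people"}

# Ordered suffix rules: the first pattern that matches wins.
SUFFIX_RULES = [
    (re.compile(r"(s|x|z|ch|sh)$"), r"\1es"),   # box -> boxes
    (re.compile(r"([^aeiou])y$"), r"\1ies"),    # city -> cities
    (re.compile(r"$"), "s"),                    # default: append "s"
]


def pluralize(noun):
    """Return the plural of a lowercase English noun under the toy rules."""
    if noun in IRREGULAR:
        return IRREGULAR[noun]
    for pattern, replacement in SUFFIX_RULES:
        if pattern.search(noun):
            return pattern.sub(replacement, noun, count=1)
    return noun
```

Checking irregular forms before the general suffix rules is the usual design choice, since exceptions would otherwise be swallowed by the default rule.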

Business Rules and Compliance -- Encode rules in a form that is both human-auditable and machine-executable. Because infons preserve linguistic meaning alongside formal structure, rule bases can be reviewed by domain experts who don't need to read code.

Key Features

  • Infon-based data model -- Programs are expressed as structured information ("infons") that support numbers, strings, lists, typed fields, and nested structures.
  • Natural language integration -- Includes an English language translator (xlators/xlator_en.dog) for parsing and processing natural language constructs.
  • Agenda-based inference engine -- Resolves relationships between infons through an agenda-driven normalization and merging process.
  • Model and vocabulary management -- Define and look up typed words and their meanings via the built-in model manager.
  • Infon viewer -- A standalone viewer application for inspecting infon structures.
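The agenda-based approach mentioned above can be pictured as a generic worklist loop. The Python sketch below is not the Proteus algorithm; it assumes facts are simple (subject, attribute, value) triples and that rules are plain functions, and the names `run_agenda` and `parts_are_things` are hypothetical.

```python
from collections import deque


def run_agenda(facts, rules):
    """Process triples from an agenda until no new work remains.

    facts: iterable of (subject, attribute, value) triples.
    rules: functions mapping one triple to an iterable of derived triples.
    """
    agenda = deque(facts)
    known = set()
    while agenda:
        triple = agenda.popleft()
        if triple in known:
            continue  # duplicate: merging with what is already known is a no-op
        known.add(triple)
        for rule in rules:
            agenda.extend(rule(triple))  # derived facts go back on the agenda
    return known


def parts_are_things(triple):
    """Toy rule: anything listed as a part is also a thing."""
    subject, attribute, value = triple
    if attribute == "part":
        yield (value, "is_a", "thing")
```

Calling `run_agenda` with two `part` facts about a bike and the toy rule yields those facts plus one derived `is_a` fact per part. The loop stops once every item on the agenda is already known.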

Prerequisites

Proteus source files are written in CodeDog (.dog files), which compiles to C++. To build Proteus you will need:

  • CodeDog -- the CodeDog compiler
  • GNU C++ toolchain -- GCC/G++ on Linux (the primary supported platform)
  • Python 3 -- for ruleMgr.py and related tooling

Building

The default build configuration targets Linux with the GNU C++ toolchain. From the project root:

codedog Proteus.Lib.dog

The build line in Proteus.Lib.dog is:

LinuxTestBuild: Platform='Linux' Lang='CPP' LangVersion='GNU' testMode='makeTests';

Note: Proteus.Lib.dog includes WorldManager.dog, which is not currently present in the repository. You may need to obtain this file from the maintainers or check whether it is generated by a companion tool before building.
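Before building, it may help to check which included files are actually present. The Python sketch below is a hypothetical helper, not part of the repository. It assumes include lines look roughly like `#include SomeFile.dog`, optionally quoted; adjust the pattern if the real syntax differs.

```python
import re
from pathlib import Path

# Assumed include syntax: "#include name", optionally quoted or bracketed.
INCLUDE_RE = re.compile(r'^\s*#include\s+["<]?([^\s">]+)', re.MULTILINE)


def missing_includes(source_text, base_dir):
    """Return the included file names that do not exist under base_dir."""
    base = Path(base_dir)
    return [name for name in INCLUDE_RE.findall(source_text)
            if not (base / name).exists()]


if __name__ == "__main__":
    lib = Path("Proteus.Lib.dog")
    if lib.exists():
        for name in missing_includes(lib.read_text(), lib.parent):
            print("missing:", name)
```

Run from the project root, this prints one line per include it cannot find.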

Running Tests

Proteus includes a CodeDog test suite (ProteusTests.dog) and a C++ test harness (infonTest.cpp):

# Build the test executable via the LinuxTestBuild config
codedog Proteus.Lib.dog

# Run the generated test executable
./TestProteus

# Compile and run the C++ test harness separately
g++ -g -std=c++11 infonTest.cpp -o infonTest && ./infonTest

Project Structure

Path                    Description
Proteus.Lib.dog         Main engine library and entry point
infonIO.dog             Infon input/output, parsing, and serialization
infonList.dog           Infon list data structures and operations
ModelManager.dog        Model and vocabulary management
Functions.dog           Built-in functions
debugSystems.dog        Debugging and diagnostic systems
clip.dog                Clipboard and utility operations
timeAccess.dog          Time access utilities
infonViewer.dog         Standalone infon viewer application
DB_workAround.dog       Database v1 workarounds (string utilities)
testInflect.dog         Inflection testing for the English translator
xlators/xlator_en.dog   English language translator
ProteusTests.dog        Test suite
ProteusDBServer.dog     Database server component
ruleMgr.py              Rule management (Python)
infonTest.cpp           Infon C++ test harness
Examples/               Example Proteus programs
Resources/              Sample .pr files and a web interface (web/)
theory/                 Experimental and theoretical work

Example

Here is a Tic-tac-toe game written in Proteus syntax (Examples/Tic-tac-toe.pr):

def tic_tac_toe: {

    def playerSymbol: ['X' | 'O']

    def slot: {T [' ' | 'X' | 'O'] | ...}

    def row: *3+{slot| ...}

    def board: *3+{row| ...}

    def move: {column:1..3, row:1..3}

    def player: {name  playerSymbol  moves:{T move| ...}}

    def turn: {player, move, board}

    def winner: [ 'X'  'O'  'Tie']

    *2 + { player |
            {%.name = userInput<:{prompt: "Player X, enter your name"  %.playerSymbol="X"}}
            {%.name = userInput<:{prompt: "Player O, enter your name"  %.playerSymbol="O"}}
    }
    def play: {
        turns: {T turn|
            {playerSymbol:'X' move:player.0.move.0  board:{ *3+{' '  ' '  ' '}|...}}

            #{  {
                playerSymbol= !playerSymbol
                move=player.playerSymbol.move = userInput<:{prompt: (%.name " please enter your move:")}
                board.(move.column).(move.row) = playerSymbol
                }
             | ...}

            {[ %turns.size==9  |  CheckWinningBoard<: board]}
        }
        winner: [{%turns.size=9 %='Tie'}  | turns.last.player]
    }
}

tic_tac_toe.play
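The example refers to CheckWinningBoard without showing its definition. As a rough guide to what such a check might compute, here is a Python sketch; it is an assumption about the intended behavior, not a translation of any Proteus code. It takes the board as three rows of three cells holding ' ', 'X', or 'O', and `winning_symbol` is a hypothetical name.

```python
def winning_symbol(board):
    """Return 'X' or 'O' if that symbol fills a row, column, or diagonal.

    board: three rows of three cells, each ' ', 'X', or 'O'.
    Returns None when no line is complete.
    """
    lines = [list(row) for row in board]                            # rows
    lines += [[board[r][c] for r in range(3)] for c in range(3)]    # columns
    lines.append([board[i][i] for i in range(3)])                   # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])               # anti-diagonal
    for line in lines:
        if line[0] != ' ' and line.count(line[0]) == 3:
            return line[0]
    return None
```

A game loop would call this after each move and stop either when it returns a symbol or when nine turns have been played, matching the two stopping conditions in the Proteus example.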

Status

Proteus is under active development (version 0.8). Current work includes:

  • Streaming normalization (work in progress)
  • Syntax updates (withEach loop changes)
  • Thread synchronization fixes
  • Agenda ordering improvements

Known issue: Proteus.Lib.dog references WorldManager.dog via #include, but this file is not present in the repository.

Authors

  • Bruce Long
  • KT Lawrence

License

All Rights Reserved.

"This file is part of the "Proteus Language suite" All Rights Reserved."

Copyright (c) 2015-2023 Bruce Long
