Argus LLM Parser
A standalone microservice that uses a local AI model to convert complex HTML snippets into structured, grammar-guaranteed JSON.
What is the LLM Parser?
Argus LLM Parser parses HTML snippets into structured JSON using a local Large Language Model (LLM). A strict GBNF grammar is generated from a JSON schema and applied during generation, so the model's output is constrained to valid JSON that conforms to the desired structure.
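To make the schema-to-grammar step concrete, here is a minimal Python sketch that turns a flat JSON schema into a GBNF grammar string. It is an illustration only: the function name `schema_to_gbnf` and the sample schema are hypothetical, only string-valued properties are handled, and the service's actual generator (run by `make setup`) is not shown in this README and presumably supports nested objects and more types.

```python
import json


def schema_to_gbnf(schema: dict) -> str:
    """Build a GBNF grammar for a flat JSON object whose properties are all strings.

    Hypothetical helper for illustration; not the service's own generator.
    """
    props = schema.get("properties", {})
    for name, spec in props.items():
        if spec.get("type") != "string":
            raise ValueError(f"unsupported type for {name!r}: {spec.get('type')!r}")

    members = []
    for name in props:
        # First dumps: the JSON key with its quotes. Second dumps: that text
        # quoted and escaped again so it becomes a GBNF string literal.
        key_literal = json.dumps(json.dumps(name))
        members.append(f'{key_literal} ws ":" ws string')

    if members:
        body = ' ws "," ws '.join(members)
        root = f'root ::= "{{" ws {body} ws "}}"'
    else:
        root = 'root ::= "{" ws "}"'

    rules = [
        root,
        # Simplified JSON string: plain characters or a short escape sequence.
        r'string ::= "\"" ([^"\\] | "\\" ["\\/bfnrt])* "\""',
        r'ws ::= [ \t\n]*',
    ]
    return "\n".join(rules)


# Hypothetical schema, loosely modeled on the "category" field in this README.
example_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string"},
        "title": {"type": "string"},
    },
}
grammar_text = schema_to_gbnf(example_schema)
```

Because every key is a fixed literal in the `root` rule, the model cannot emit a missing key, an extra key, or keys in a different order. A grammar string like this can be loaded with `LlamaGrammar.from_string` in `llama-cpp-python`, though whether the service does it that way is not stated here.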
Key Features
- Strict Output Guarantee: Employs a GBNF grammar generated from a JSON schema to ensure valid, structured JSON output.
- Local & Private: Uses `llama-cpp-python` for efficient inference on your own hardware. No data is sent to external APIs.
- Secure by Default: All API endpoints (except `/health`) are protected by a mandatory `x-api-key` header.
- Fully Dockerized: Includes a multi-stage Dockerfile and `docker-compose.yml` for fast, reproducible setup.
Quick Start & API Usage
To run the service, execute all commands from the `services/llm_parser/` directory:
- Copy the example configuration file, if `.env.example` exists: `cp .env.example .env`.
- Run the one-time setup: `make setup`. This downloads the model and generates the grammar file.
- Start the development service: `make up-dev`.
The service will be available at `http://localhost:8002` with interactive API docs at `http://localhost:8002/docs`.
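The model has to load before the service can answer requests, so scripts that start the container may want to wait for it. The following is a minimal sketch of a readiness poll against the unauthenticated `/health` endpoint. It assumes that an HTTP 200 response means the service is ready; the response body of `/health` is not documented here, and the function names, retry count, and delay are illustrative.

```python
import time
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_healthy(
    url: str = "http://localhost:8002/health",
    attempts: int = 30,
    delay: float = 1.0,
    check=probe,
) -> bool:
    """Poll `url` until `check` succeeds or the attempts run out."""
    for attempt in range(attempts):
        if check(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` after `make up-dev` returns `True` once the endpoint responds, or `False` after roughly 30 seconds with the defaults above. The `check` parameter exists so the loop can be exercised without a running service.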
API Request Example
The `/api/v1/parse` endpoint requires an API key. The default key for development is `default_dev_key`.
curl -X POST "http://localhost:8002/api/v1/parse" \
  -H "Content-Type: application/json" \
  -H "x-api-key: default_dev_key" \
  -d '{
    "html_snippet": "<div><h1>Specification</h1><div><div>Size</div></div><div>xxl</div></div>"
  }'
Example Output
{
  "category": "Specification",
  "details": {
    "Size": "xxl"
  }
}
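For callers that prefer Python over curl, the same request can be sketched with the standard library alone. The endpoint, header name, payload field, and development key come from the example above; the helper names `build_parse_request` and `parse_html` and the 60-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8002"  # address from the Quick Start section
API_KEY = "default_dev_key"         # development default; override in production


def build_parse_request(
    html_snippet: str, base_url: str = BASE_URL, api_key: str = API_KEY
) -> urllib.request.Request:
    """Build the POST request for /api/v1/parse without sending it."""
    payload = json.dumps({"html_snippet": html_snippet}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/parse",
        data=payload,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def parse_html(html_snippet: str, timeout: float = 60.0) -> dict:
    """Send the snippet to the service and return the decoded JSON response."""
    request = build_parse_request(html_snippet)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    # for example when the API key is missing or wrong.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the service running, `parse_html("<div><h1>Specification</h1>...</div>")` should return a dictionary shaped like the example output. Local inference can be slow on CPU, so a generous timeout is a reasonable starting point.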
Security
All API endpoints (except for `/health`) require a valid API key to be passed in the `x-api-key` header.
The default development key is `default_dev_key`. For production, you must override this by setting the `AUTH__API_KEY` environment variable to a strong, randomly generated key.
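One way to produce a suitable production key is Python's `secrets` module, sketched below. The helper names are hypothetical. The `is_valid_key` function only illustrates the constant-time comparison generally recommended for secret checks; how the service itself validates the header is not described in this README.

```python
import hmac
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random key built from `num_bytes` of OS-level entropy."""
    return secrets.token_urlsafe(num_bytes)


def is_valid_key(presented: str, expected: str) -> bool:
    """Compare two keys in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


def env_line(key: str) -> str:
    """Format the key as a line for the .env file, using the documented variable name."""
    return f"AUTH__API_KEY={key}"
```

Running `print(env_line(generate_api_key()))` prints a line that can be pasted into `.env`. Keep the generated value out of version control.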