ArgusFlow

Argus HTML-to-JSON AI

Stop fighting with brittle regex. Convert complex HTML snippets into structured, grammar-guaranteed JSON instantly.

Response (200 OK):
{
  "dimensions": "146.7 x 71.5 x 7.8 mm",
  "weight": "172 g",
  "display_type": "Super Retina XDR OLED",
  "resolution": "2532 x 1170 pixels"
}
Process completed in 42ms

Order from HTML Chaos

Extracting data from messy, nested HTML is a developer’s nightmare. General-purpose AI models often "hallucinate" fields or return malformed output. Argus HTML-to-JSON AI solves this by running a local Large Language Model (LLM) whose output is constrained by a strict GBNF grammar. It extracts exactly the data you need, and the result is guaranteed to be valid JSON.

Why This Beats Standard AI

  • Grammar-Locked Output: Unconstrained LLMs return inconsistently formatted results. GBNF constraints restrict which tokens the model may emit at each step, so it cannot produce structurally invalid JSON.
  • Private & Local: Built on `llama-cpp-python`. All parsing happens on your own hardware, so no data leaves your server.
  • Schema Driven: Define your target structure using a JSON schema, and the extraction follows that structure.
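
To make the grammar-locked idea concrete, here is a minimal sketch of GBNF-constrained generation with `llama-cpp-python`. The grammar, the prompt wording, the `build_prompt` and `extract` helpers, and the model path are illustrative assumptions, not the grammar or prompt the service actually uses.

```python
# Illustrative GBNF grammar: a non-empty, flat JSON object with string keys and values.
# Assumption: this is NOT the service's real grammar, only a small example.
FLAT_OBJECT_GBNF = r'''
root   ::= "{" ws pair ("," ws pair)* ws "}"
pair   ::= string ws ":" ws string
string ::= "\"" ( [^"\\] | "\\" ["\\/bfnrt] )* "\""
ws     ::= [ \t\n]*
'''


def build_prompt(html_snippet: str) -> str:
    """Hypothetical prompt template asking the model for key/value pairs."""
    return (
        "Extract the product attributes from the HTML below "
        "as a JSON object of string keys and string values.\n\n"
        f"HTML:\n{html_snippet}\n\nJSON:\n"
    )


def extract(model_path: str, html_snippet: str, max_tokens: int = 256) -> str:
    """Run one grammar-constrained completion and return the raw JSON text."""
    # Imported lazily so the grammar and prompt helpers work without the library.
    from llama_cpp import Llama, LlamaGrammar

    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    grammar = LlamaGrammar.from_string(FLAT_OBJECT_GBNF)
    # The sampler may only pick tokens that keep the output inside the grammar.
    result = llm(build_prompt(html_snippet), grammar=grammar, max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

Calling `extract("path/to/model.gguf", html)` (the path is a placeholder) would return text matching the grammar. If generation stops at `max_tokens` before the closing brace, the text can still be cut short, so the limit should be generous. `llama-cpp-python` also offers `LlamaGrammar.from_json_schema`, which may be closer to a schema-driven setup than a hand-written grammar.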

Quick Start & API Usage

Run the service from the `services/llm_parser/` directory. The setup command automatically downloads the required AI model.

  1. Setup environment: `cp .env.example .env`
  2. Initialize AI model: `make setup`
  3. Start the service: `make up-dev`

The service is available at `http://localhost:8002`.

API Request Example


cURL:
curl -X POST "http://localhost:8002/api/v1/parse" \
  -H "Content-Type: application/json" \
  -H "x-api-key: default_dev_key" \
  -d '{
  "html_snippet": "<div><h1>Specification</h1><div><div>Size</div></div><div>xxl</div></div>"
}'

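PHP:
<?php
// Send the HTML snippet to the parser as a JSON POST body.
$ch = curl_init('http://localhost:8002/api/v1/parse');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Content-Type: application/json',
    'x-api-key: default_dev_key'
]);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode([
    'html_snippet' => '<div><h1>Specification</h1><div><div>Size</div></div><div>xxl</div></div>'
]));
// Return the response body as a string instead of printing it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
if ($response === false) {
    // Transport-level failure (connection refused, timeout, and so on).
    throw new RuntimeException('Request failed: ' . curl_error($ch));
}

// Decode the JSON response into an associative array.
$data = json_decode($response, true);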

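Python:
import requests

# `json=` serializes the body and sets the Content-Type header.
response = requests.post(
    "http://localhost:8002/api/v1/parse",
    headers={"x-api-key": "default_dev_key"},
    json={"html_snippet": "<div><h1>Specification</h1><div><div>Size</div></div><div>xxl</div></div>"},
    timeout=30,  # avoid hanging indefinitely if the service is unreachable
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
print(response.json())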

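JavaScript:
fetch('http://localhost:8002/api/v1/parse', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': 'default_dev_key'
  },
  body: JSON.stringify({
    html_snippet: '<div><h1>Specification</h1><div><div>Size</div></div><div>xxl</div></div>'
  })
})
  .then(res => {
    // fetch only rejects on network errors, so check the HTTP status explicitly.
    if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
    return res.json();
  })
  .then(data => console.log(data))
  .catch(err => console.error(err));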

Guaranteed JSON Output

{
  "category": "Specification",
  "details": {
    "Size": "xxl"
  }
}
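
The grammar guarantees syntax, but a client may still want to confirm that the fields it relies on are present. The sketch below checks the shape of the example above, a string `category` plus a `details` object of string values. That shape is inferred from this single example, and `validate_parsed` is a hypothetical helper; responses for other schemas would need different checks.

```python
import json


def validate_parsed(payload: str) -> dict:
    """Parse a response body and check the shape shown in the example."""
    data = json.loads(payload)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    if not isinstance(data.get("category"), str):
        raise ValueError("'category' must be a string")
    details = data.get("details")
    if not isinstance(details, dict):
        raise ValueError("'details' must be an object")
    for key, value in details.items():
        if not isinstance(value, str):
            raise ValueError(f"details[{key!r}] must be a string")
    return data


example = '{"category": "Specification", "details": {"Size": "xxl"}}'
print(validate_parsed(example)["details"]["Size"])  # prints: xxl
```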

Security by Design

Every microservice in the Argus ecosystem is protected by mandatory API-key authentication. While `default_dev_key` is used for testing, you must set a cryptographically secure `AUTH__API_KEY` for production environments.
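
One way to produce such a key is Python's standard `secrets` module, sketched below. The helper names and the 32-byte length are illustrative choices, and the sketch assumes the service reads `AUTH__API_KEY` from the `.env` file created during setup.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random key drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


def env_line(key: str) -> str:
    """Format the key as a line for the .env file (assumed location)."""
    return f"AUTH__API_KEY={key}"


if __name__ == "__main__":
    # Print a ready-to-paste line; store the key somewhere safe afterwards.
    print(env_line(generate_api_key()))
```

Clients would then send the generated value in the `x-api-key` header in place of `default_dev_key`.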