ArgusFlow

Argus Title-to-Data AI

Transform messy product titles into structured JSON objects, in multiple languages, using a locally hosted Large Language Model.

Response (200 OK):
{
  "brand": "Samsung",
  "series": "Galaxy S23 Ultra",
  "storage": "512GB",
  "color": "Phantom Black",
  "connectivity": "5G",
  "category": "smartphone"
}
Process completed in 42ms

Bring Order to Data Chaos

Data from suppliers and other external sources often arrives as unstructured text. The Generalizer uses a local AI model to interpret product strings. Send a raw title like "Solid oak dining table 200x100 cm natural finish" and receive a JSON object with separate fields for material, type, dimensions, and more.

Why This Beats Standard AI

  • Guaranteed JSON Structure: Unconstrained LLMs often return malformed or inconsistently shaped output. The Generalizer constrains decoding with strict GBNF grammars, so the model can only emit tokens that form valid JSON matching your schema.
  • Zero External Costs: No expensive OpenAI API bills. The service runs entirely on your own hardware via llama-cpp-python.
  • Privacy First: Because the AI is local, your proprietary product data never leaves your infrastructure.
  • Multi-Language Native: Dynamically loads localized prompts, few-shot examples, and categories for high-accuracy extraction in every language you configure.
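
To make the grammar idea concrete, here is a minimal sketch of how a GBNF grammar for a flat JSON object could be assembled. The function name `build_object_grammar` and the field list are hypothetical, and this is not the Generalizer's actual grammar generator; it only illustrates why grammar-constrained output cannot drift from the expected keys.

```python
def build_object_grammar(fields: list[str]) -> str:
    """Return a GBNF grammar admitting only a flat JSON object
    with exactly these string-valued keys, in this order.

    Hypothetical helper for illustration; the real service may
    generate its grammars differently.
    """
    if not fields:
        raise ValueError("at least one field is required")
    # Each member is a fixed key literal followed by a JSON string value.
    members = ' "," ws '.join(
        f'"\\"{name}\\":" ws string' for name in fields
    )
    lines = [
        f'root ::= "{{" ws {members} ws "}}"',
        # Simplified JSON string rule (no \uXXXX escapes).
        r'string ::= "\"" ( [^"\\] | "\\" ["\\/bfnrt] )* "\""',
        r'ws ::= [ \t\n]*',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_object_grammar(["brand", "category"]))
    # With llama-cpp-python installed, the grammar could be applied like this
    # (model path and prompt are placeholders):
    #   from llama_cpp import Llama, LlamaGrammar
    #   grammar = LlamaGrammar.from_string(build_object_grammar(["brand", "category"]))
    #   llm = Llama(model_path="model.gguf")
    #   out = llm("Title: ...", grammar=grammar, max_tokens=256)
```

Because the key names are literals in the `root` rule, the sampler has no way to emit a different key or skip one. Note that a grammar constrains syntax only; a generation cut off by a token limit can still end before the closing brace.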

API Usage

After running `make up-dev` from the `services/generalizer/` directory, the service is available on port 8003. Send a POST request with the product title you want to structure.

API Request Example


curl -X POST "http://localhost:8003/api/v1/generalize" \
-H "Content-Type: application/json" \
-H "x-api-key: default_dev_key" \
-d '{
  "language": "en",
  "title": "Solid oak dining table 200x100 cm natural finish"
}'

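<?php
// PHP (cURL extension): POST the title and decode the JSON response.
$ch = curl_init('http://localhost:8003/api/v1/generalize');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Content-Type: application/json',
    'x-api-key: default_dev_key'
]);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode([
    'language' => 'en',
    'title' => 'Solid oak dining table 200x100 cm natural finish'
]));
// Return the body as a string instead of printing it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
if ($response === false) {
    // Transport-level failure (connection refused, timeout, ...).
    throw new RuntimeException('Request failed: ' . curl_error($ch));
}
$data = json_decode($response, true);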

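# Python (requires the third-party `requests` package)
import requests

response = requests.post(
    "http://localhost:8003/api/v1/generalize",
    headers={"x-api-key": "default_dev_key"},
    json={"language": "en", "title": "Solid oak dining table 200x100 cm natural finish"},
    timeout=30,  # avoid hanging forever if the service is down; adjust as needed
)
response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error body
print(response.json())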

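// JavaScript (browser, or Node.js 18+ with built-in fetch)
fetch('http://localhost:8003/api/v1/generalize', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': 'default_dev_key'
  },
  body: JSON.stringify({
    language: 'en',
    title: 'Solid oak dining table 200x100 cm natural finish'
  })
})
  .then(res => {
    // fetch only rejects on network errors, so check the HTTP status explicitly.
    if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
    return res.json();
  })
  .then(data => console.log(data))
  .catch(err => console.error(err));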

Clean, Structured Output

{
  "extracted_data": {
    "material": "oak",
    "type": "dining table",
    "dimensions": "200x100 cm",
    "finish": "natural",
    "category": "furniture"
  },
  "process_time": 1.25
}
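
On the client side, it can help to unpack this response into a typed value before using it. The following is a sketch only: the names `GeneralizeResult` and `parse_generalize_response` are hypothetical, and it assumes the response always has the two top-level keys shown above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class GeneralizeResult:
    """Client-side view of a Generalizer response (hypothetical type)."""
    fields: dict[str, str]
    process_time: float  # unit as reported by the service; not specified here


def parse_generalize_response(payload: dict[str, Any]) -> GeneralizeResult:
    """Validate the response shape and pull out the extracted fields."""
    extracted = payload.get("extracted_data")
    if not isinstance(extracted, dict):
        raise ValueError("response has no 'extracted_data' object")
    return GeneralizeResult(
        # The set of keys varies per product, so keep them as a plain mapping.
        fields={str(k): str(v) for k, v in extracted.items()},
        process_time=float(payload.get("process_time", 0.0)),
    )


if __name__ == "__main__":
    sample = {
        "extracted_data": {"material": "oak", "type": "dining table"},
        "process_time": 1.25,
    }
    result = parse_generalize_response(sample)
    print(result.fields["material"], result.process_time)
```

Keeping `fields` as a mapping rather than fixed attributes reflects that the extracted keys differ between product categories, as the smartphone and furniture examples show.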

Multi-language Support

The Generalizer is multilingual by design. Each language lives in its own directory (e.g., `app/prompts/en/`). On each request, the service reads the `language` field and dynamically loads the matching prompt, few-shot examples, and categories.
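
One way such per-language lookup could work is sketched below. This is an illustration under assumptions, not the service's actual loader: `resolve_language_dir` is a hypothetical name, and the sketch only checks that a language directory exists and contains the expected `prompt.py`, `examples.py`, and `categories.yml` files.

```python
from pathlib import Path

# File names expected in every language directory.
REQUIRED_FILES = ("prompt.py", "examples.py", "categories.yml")


def resolve_language_dir(base: Path, language: str) -> Path:
    """Map a request's language code to its prompt directory.

    Hypothetical helper: raises instead of silently falling back,
    so an unsupported language is reported to the caller.
    """
    # Accept only plain ASCII letters so a value like "../en" cannot
    # escape the prompts directory.
    if not (language.isascii() and language.isalpha()):
        raise ValueError(f"invalid language code: {language!r}")
    lang_dir = base / language.lower()
    missing = [name for name in REQUIRED_FILES if not (lang_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(
            f"language {language!r} is incomplete, missing: {', '.join(missing)}"
        )
    return lang_dir


if __name__ == "__main__":
    try:
        print(resolve_language_dir(Path("app/prompts"), "en"))
    except FileNotFoundError as exc:
        print(exc)
```

Validating the code before touching the filesystem matters here because the value comes straight from the request body.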

How to Add a New Language

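  1. Create a directory named after the language code, for example French: `mkdir -p app/prompts/fr`
  2. Add translated versions of `prompt.py`, `examples.py`, and `categories.yml` to that directory.
  3. Generate the grammar for the new language: `make generate-grammars`.
  4. Restart the service so it picks up the new language: `make restart`.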
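
The first two steps can be scripted. The sketch below uses a hypothetical helper, `scaffold_language`, which creates the directory and empty placeholder files; it does not translate anything, and it does not run the `make` targets, which still need to be run afterwards.

```python
from pathlib import Path

# The three files each language directory is expected to contain.
LANGUAGE_FILES = ("prompt.py", "examples.py", "categories.yml")


def scaffold_language(base: Path, language: str) -> list[Path]:
    """Create base/<language>/ with empty placeholder files.

    Existing files are left untouched, so the helper is safe to re-run.
    Returns the paths that were newly created.
    """
    lang_dir = base / language
    lang_dir.mkdir(parents=True, exist_ok=True)
    created: list[Path] = []
    for name in LANGUAGE_FILES:
        target = lang_dir / name
        if not target.exists():
            target.touch()
            created.append(target)
    return created


if __name__ == "__main__":
    for path in scaffold_language(Path("app/prompts"), "fr"):
        print("created", path)
```

After filling in the placeholders with translated content, continue with `make generate-grammars` and `make restart` as described in the steps above.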