Cinclus LLM On-Demand is a cloud-based service providing on-demand access to Large Language Model (LLM) capabilities via a simple API. It allows developers to integrate advanced natural language processing into their applications without managing any infrastructure. With Cinclus, you can generate human-like text, hold conversations with AI, and create semantic text embeddings for search and analysis.
Key Features
- Multiple LLM Models: Choose from a range of models (various sizes and capabilities) to fit your needs, from fast smaller models to powerful large ones.
- Text and Chat Completions: Generate text completions for prompts or engage in multi-turn conversations with a chat-centric API.
- Embeddings: Create high-dimensional vector embeddings from text for semantic search, clustering, or recommendation systems.
- Regional Zones: Optionally select the data center zone (e.g., NO, SW, DK, FI) to process your requests for compliance or latency requirements.
- Scalable On-Demand: Automatically leverage shared capacity or dedicated instances to handle your workloads, with seamless scaling and failover across zones if needed.
- Secure API: Authentication via API keys, with encryption in transit (HTTPS) and robust monitoring and rate limiting.
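To illustrate how zone pinning might look in practice, here is a minimal sketch of assembling a request body with an optional zone selector. The `zone` field name and the request shape are assumptions for illustration only; the actual parameter is documented in the endpoint sections below.

```python
from typing import Optional

# Zone codes as listed under Regional Zones above.
KNOWN_ZONES = {"NO", "SW", "DK", "FI"}

def build_request(model: str, prompt: str, zone: Optional[str] = None) -> dict:
    """Assemble a request body; `zone` is a hypothetical field
    for pinning processing to a specific data-center zone."""
    body = {"model": model, "prompt": prompt}
    if zone is not None:
        if zone not in KNOWN_ZONES:
            raise ValueError(f"unknown zone: {zone}")
        body["zone"] = zone
    return body
```

Omitting `zone` would leave the service free to schedule the request on shared capacity in any zone, which is the behavior implied by the seamless scaling and failover described above.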
Base URL and Versioning
All endpoints documented here are relative to the following base URL:
Base URL: https://api.cinclus.wayscloud.services/v1/
This base URL includes the API version (v1). All requests must be made over HTTPS. Future updates may introduce new versions (v2, etc.); existing versions will remain available so that integrations stay backward compatible.
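Because the version segment is part of the base URL, endpoint URLs can be formed by joining a relative path onto it. A small sketch (the path `chat/completions` is an assumed example, not a confirmed endpoint):

```python
from urllib.parse import urljoin

BASE_URL = "https://api.cinclus.wayscloud.services/v1/"

def endpoint_url(path: str) -> str:
    """Join a relative endpoint path onto the versioned base URL."""
    # Strip a leading slash so urljoin keeps the /v1/ segment.
    return urljoin(BASE_URL, path.lstrip("/"))
```

Note the trailing slash on the base URL and the stripped leading slash on the path; without both, `urljoin` would drop the `/v1/` segment.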
Quick Start
- Obtain an API Key: Sign up for a Cinclus account and retrieve your API key from the dashboard. See Authentication for details.
- Choose a Model: Decide which model suits your task. For example, use a general-purpose model for chat or a specialized model for embeddings.
- Make a Request: Use the appropriate endpoint for your task (detailed in subsequent sections):
- For generating text given a prompt, use the Text Completions endpoint.
- For conversational AI, use the Chat Completions endpoint.
- For obtaining text embeddings, use the Embeddings endpoint.
- Include your API key in the request header for authentication (see Authentication).
- Handle the Response: Parse the JSON response from the API, which contains the model's output (the completion or embedding) and usage details.
- Monitor Usage: Track your usage and ensure you stay within rate limits. See Rate Limits & Usage for guidance.
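The steps above can be sketched end to end using only the standard library. The endpoint path (`chat/completions`), the request and message field names, and the `Bearer` authorization header are assumptions for illustration; the exact shapes are specified in the Chat Completions and Authentication sections.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # retrieved from the Cinclus dashboard
BASE_URL = "https://api.cinclus.wayscloud.services/v1/"

def prepare_chat_request(messages, model="cinclus-chat", api_key=API_KEY):
    """Build (but do not send) an authenticated chat-completion request.

    `cinclus-chat` is a placeholder model name; field names and the
    endpoint path are illustrative assumptions.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def send(request):
    """Send the request and parse the JSON response
    (the model's output plus usage details)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

A typical call would be `send(prepare_chat_request([{"role": "user", "content": "Hello"}]))`; inspect the returned usage details to track consumption against your rate limits.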
Proceed to the next sections for detailed information on authentication, endpoint usage, examples, and advanced features like scaling and multi-zone offloading.