
# Triton Inference Server

> Integrate Triton-hosted custom models with Portkey for production observability and reliability.

Portkey lets you observe, govern, and manage custom models served by Triton Inference Server, whether they are hosted **locally** or **privately**.

<Info>
  Here's the official [Triton Inference Server documentation](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/getting_started/quickstart.html) for more details.
</Info>

## Integration Steps

<Steps>
  <Step title="Expose your Triton Server">
    Make your Triton server reachable from the Gateway, either through a tunneling service like [ngrok](https://ngrok.com/) or by making it publicly accessible. Skip this step if you're self-hosting the Gateway.

    ```sh theme={"system"}
    ngrok http 8000 --host-header="localhost:8000"
    ```
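
    Before adding the server to Portkey, it can help to confirm that Triton is reachable at the address you plan to use. The sketch below assumes the server speaks Triton's standard HTTP/REST protocol, which exposes a readiness probe at `/v2/health/ready`; the helper names `readiness_url` and `is_triton_ready` are illustrative and not part of any SDK.

    ```python
    import urllib.error
    import urllib.request


    def readiness_url(base_url: str) -> str:
        """Build the readiness probe URL for a Triton HTTP endpoint."""
        return base_url.rstrip("/") + "/v2/health/ready"


    def is_triton_ready(base_url: str, timeout: float = 5.0) -> bool:
        """Return True if the readiness probe answers with HTTP 200."""
        try:
            with urllib.request.urlopen(readiness_url(base_url), timeout=timeout) as resp:
                return resp.status == 200
        except (urllib.error.URLError, OSError):
            # Connection refused, DNS failure, timeout, or a non-2xx response.
            return False
    ```

    For example, `is_triton_ready("http://localhost:8000")` checks the local server, and passing your ngrok or public URL instead checks the exposed address.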
  </Step>

  <Step title="Add to Model Catalog">
    1. Go to [**Model Catalog → Add Provider**](https://app.portkey.ai/model-catalog/providers)
    2. Enable the **"Local/Privately hosted provider"** toggle
    3. Select **Triton** as the provider type
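    4. Enter your Triton server URL in **Custom Host**, for example `http://localhost:8000/v2/models/mymodel` (use your tunnel or public URL instead of `localhost` if you exposed the server in the previous step)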
    5. Add authentication headers if needed
    6. Name your provider (e.g., `my-triton`)

    <Card title="Complete Setup Guide" icon="book" href="/product/model-catalog">
      See all setup options
    </Card>
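
    The Custom Host value in the list above follows the pattern `<server URL>/v2/models/<model name>`. As a minimal sketch, assuming that pattern holds for your deployment, a small helper (the name `triton_custom_host` is hypothetical) can assemble it consistently:

    ```python
    from urllib.parse import quote


    def triton_custom_host(base_url: str, model_name: str) -> str:
        """Join a Triton base URL and a model name into a Custom Host value."""
        if not model_name:
            raise ValueError("model_name must be non-empty")
        # Percent-encode the model name so it stays a single path segment.
        return f"{base_url.rstrip('/')}/v2/models/{quote(model_name, safe='')}"
    ```

    Calling `triton_custom_host("http://localhost:8000", "mymodel")` produces the same URL shown in step 4.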
  </Step>

  <Step title="Use in Your Application">
    <CodeGroup>
      ```python Python theme={"system"}
      from portkey_ai import Portkey

      portkey = Portkey(
          api_key="PORTKEY_API_KEY",
          provider="@my-triton"
      )

      response = portkey.chat.completions.create(
          model="your-model-name",
          messages=[{"role": "user", "content": "Hello!"}]
      )

      print(response.choices[0].message.content)
      ```

      ```javascript Node.js theme={"system"}
      import Portkey from 'portkey-ai';

      const portkey = new Portkey({
          apiKey: 'PORTKEY_API_KEY',
          provider: '@my-triton'
      });

      const response = await portkey.chat.completions.create({
          model: 'your-model-name',
          messages: [{ role: 'user', content: 'Hello!' }]
      });

      console.log(response.choices[0].message.content);
      ```
    </CodeGroup>

    **Or use custom host directly:**

    <CodeGroup>
      ```python Python theme={"system"}
      from portkey_ai import Portkey

      portkey = Portkey(
          api_key="PORTKEY_API_KEY",
          provider="triton",
          custom_host="http://localhost:8000/v2/models/mymodel",
          Authorization="AUTH_KEY"  # If needed
      )
      ```

      ```javascript Node.js theme={"system"}
      import Portkey from 'portkey-ai';

      const portkey = new Portkey({
          apiKey: 'PORTKEY_API_KEY',
          provider: 'triton',
          customHost: 'http://localhost:8000/v2/models/mymodel',
          Authorization: 'AUTH_KEY'  // If needed
      });
      ```
    </CodeGroup>
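
    In practice you may not want the API key, host, and auth key hard-coded. The sketch below collects the same client arguments from environment variables; the variable names `TRITON_CUSTOM_HOST` and `TRITON_AUTH_KEY` and the helper `triton_client_kwargs` are assumptions made for illustration, not names Portkey defines.

    ```python
    import os
    from typing import Mapping, Optional


    def triton_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
        """Collect the Portkey client arguments for a Triton custom host."""
        env = os.environ if env is None else env
        kwargs = {
            "api_key": env["PORTKEY_API_KEY"],
            "provider": "triton",
            # Falls back to the example host used on this page.
            "custom_host": env.get(
                "TRITON_CUSTOM_HOST", "http://localhost:8000/v2/models/mymodel"
            ),
        }
        auth_key = env.get("TRITON_AUTH_KEY")
        if auth_key:
            # Only send an Authorization value when one is configured.
            kwargs["Authorization"] = auth_key
        return kwargs
    ```

    The result can be unpacked into the client shown above, as in `Portkey(**triton_client_kwargs())`, after which requests are made with `portkey.chat.completions.create(...)` exactly as in the first example.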
  </Step>
</Steps>

***

## Next Steps

<CardGroup cols={2}>
  <Card title="Gateway Configs" icon="sliders" href="/product/ai-gateway">
    Add retries, timeouts, and fallbacks
  </Card>

  <Card title="Observability" icon="chart-line" href="/product/observability">
    Monitor your Triton deployments
  </Card>

  <Card title="Custom Host Guide" icon="server" href="/product/ai-gateway/universal-api#integrating-local-or-private-models">
    Learn more about custom host setup
  </Card>

  <Card title="BYOLLM Guide" icon="book" href="/integrations/llms/byollm">
    Complete guide for private LLMs
  </Card>
</CardGroup>

For complete SDK documentation:

<Card title="SDK Reference" icon="code" href="/api-reference/sdk/list">
  Complete Portkey SDK documentation
</Card>
