# ai-workspace-operator
**Repository Path**: mirrors_NVIDIA/ai-workspace-operator
## Basic Information
- **Project Name**: ai-workspace-operator
- **Description**: Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2022-10-24
- **Last Updated**: 2026-03-14
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# NVIDIA NIM Anywhere

[Open in AI Workbench](https://ngc.nvidia.com/open-ai-workbench/aHR0cHM6Ly9naXRodWIuY29tL05WSURJQS9uaW0tYW55d2hlcmUK)
[NIM for LLMs](https://docs.nvidia.com/nim/#large-language-models)
[NeMo Retriever](https://docs.nvidia.com/nim/#nemo-retriever)
[CI](https://github.com/NVIDIA/nim-anywhere/actions/workflows/ci.yml?query=branch%3Amain)

Please join the \#cdd-nim-anywhere Slack channel if you are an internal
user; if you are external, open an issue for any questions or feedback.

One of the primary benefits of AI for enterprises is the ability to
work with and learn from their internal data. Retrieval-Augmented
Generation
([RAG](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/))
is one of the best ways to do so. NVIDIA has developed a set of
microservices called [NIM
microservices](https://docs.nvidia.com/nim/large-language-models/latest/introduction.html)
to help our partners and customers build effective RAG pipelines with
ease.
NIM Anywhere contains all the tooling required to start integrating NIMs
for RAG. It natively scales out to full-sized labs and up to production
environments. This is great news for building a RAG architecture and
easily adding NIMs as needed. If you're unfamiliar with RAG, it
dynamically retrieves relevant external information during inference
without modifying the model itself. Imagine you're the tech lead of a
company with a local database containing confidential, up-to-date
information. You don’t want OpenAI to access your data, but you need the
model to understand it to answer questions accurately. The solution is
to connect your language model to the database and feed it the relevant
information.
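The retrieval step described above can be sketched with a toy example. This is an illustration only: the documents, query, and keyword-overlap scoring are hypothetical stand-ins for the embedding-based retrieval a real NIM pipeline would use.

``` python
# Toy RAG sketch: retrieve the most relevant document, then build an
# augmented prompt. Real pipelines score with embeddings, not word overlap.

DOCUMENTS = [
    "Our Q3 revenue grew 12% driven by the new product line.",
    "The office wifi password rotates on the first Monday of each month.",
    "Employee travel must be booked through the internal portal.",
]

def score(query: str, doc: str) -> int:
    """Count shared lowercase words between the query and a document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str) -> str:
    """Return the highest-scoring document for the query."""
    return max(DOCUMENTS, key=lambda doc: score(query, doc))

def build_prompt(query: str) -> str:
    """Prepend the retrieved context so the model can ground its answer."""
    context = retrieve(query)
    return f"Context: {context}\n\nQuestion: {query}"

print(build_prompt("How did revenue grow in Q3?"))
```

The key idea is that the model's weights never change; only the prompt is enriched with retrieved context at query time.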
To learn more about why RAG is an excellent solution for boosting the
accuracy and reliability of your generative AI models, [read this
blog](https://developer.nvidia.com/blog/enhancing-rag-applications-with-nvidia-nim/).
Get started with NIM Anywhere now with the [quick-start](#quick-start)
instructions and build your first RAG application using NIMs!

- [Quick-start](#quick-start)
- [Generate your NGC Personal Key](#generate-your-ngc-personal-key)
- [Authenticate with Docker](#authenticate-with-docker)
- [Install AI Workbench](#install-ai-workbench)
- [Download this project](#download-this-project)
- [Configure this project](#configure-this-project)
- [Start This Project](#start-this-project)
- [Populating the Knowledge Base](#populating-the-knowledge-base)
- [Developing Your Own Applications](#developing-your-own-applications)
- [Application Configuration](#application-configuration)
- [Config from a file](#config-from-a-file)
- [Config from a custom file](#config-from-a-custom-file)
- [Config from env vars](#config-from-env-vars)
- [Chain Server config schema](#chain-server-config-schema)
- [Chat Frontend config schema](#chat-frontend-config-schema)
- [Contributing](#contributing)
- [Code Style](#code-style)
- [Updating the frontend](#updating-the-frontend)
- [Updating documentation](#updating-documentation)
- [Managing your Development
Environment](#managing-your-development-environment)
- [Environment Variables](#environment-variables)
- [Python Environment Packages](#python-environment-packages)
- [Operating System Configuration](#operating-system-configuration)
- [Updating Dependencies](#updating-dependencies)
# Quick-start
## Generate your NGC Personal Key
To allow AI Workbench to access NVIDIA’s cloud resources, you’ll need to
provide it with a Personal Key. These keys begin with `nvapi-`.
Expand this section for instructions on creating this key.
1. Go to the [NGC Personal Key
Manager](https://org.ngc.nvidia.com/setup/personal-keys). If you are
prompted to, then register for a new account and sign in.
> **HINT** You can find this tool by logging into
> [ngc.nvidia.com](https://ngc.nvidia.com), expanding your profile
> menu on the top right, selecting *Setup*, and then selecting
> *Generate Personal Key*.
2. Select *Generate Personal Key*.

3. Enter any value as the key name, choose an expiration (12 months is
fine), and select all the services. Press *Generate Personal Key* when
you are finished.

4. Save your personal key for later. Workbench will need it and there
is no way to retrieve it later. If the key is lost, a new one must
be created. Protect this key as if it were a password.

Expand this section for a Windows install.
For full instructions, see the [NVIDIA AI Workbench User
Guide](https://docs.nvidia.com/ai-workbench/user-guide/latest/installation/windows.html).
1. Install Prerequisite Software
1. If this machine has an NVIDIA GPU, ensure the GPU drivers are
installed. It is recommended to use the [GeForce
Experience](https://www.nvidia.com/en-us/geforce/geforce-experience/)
tooling to manage the GPU drivers.
2. Install [Docker
Desktop](https://www.docker.com/products/docker-desktop/) for
local container support. Please be mindful of Docker Desktop's
licensing for enterprise use. [Rancher
Desktop](https://rancherdesktop.io/) may be a viable
alternative.
3. *\[OPTIONAL\]* If Visual Studio Code integration is desired,
install [Visual Studio Code](https://code.visualstudio.com/).
2. Download the [NVIDIA AI
Workbench](https://www.nvidia.com/en-us/deep-learning-ai/solutions/data-science/workbench/)
installer and execute it. Authorize Windows to allow the installer
to make changes.
3. Follow the instructions in the installation wizard. If you need to
install WSL2, authorize Windows to make the changes and reboot the local
machine when requested. When the system restarts, the NVIDIA AI
Workbench installer should automatically resume.
4. Select Docker as your container runtime.
5. Log into your GitHub Account by using the *Sign in through
GitHub.com* option.
6. Enter your git author information if requested.
Expand this section for a MacOS install.
For full instructions, see the [NVIDIA AI Workbench User
Guide](https://docs.nvidia.com/ai-workbench/user-guide/latest/installation/macos.html).
1. Install Prerequisite Software
1. Install [Docker
Desktop](https://www.docker.com/products/docker-desktop/) for
local container support. Please be mindful of Docker Desktop's
licensing for enterprise use. [Rancher
Desktop](https://rancherdesktop.io/) may be a viable
alternative.
2. *\[OPTIONAL\]* If Visual Studio Code integration is desired,
install [Visual Studio Code](https://code.visualstudio.com/).
When using VSCode on a Mac, an [additional step must be
performed](https://code.visualstudio.com/docs/setup/mac#_launching-from-the-command-line)
to install the VSCode CLI interface used by Workbench.
2. Download the [NVIDIA AI
Workbench](https://www.nvidia.com/en-us/deep-learning-ai/solutions/data-science/workbench/)
disk image (*.dmg* file) and open it.
3. Drag AI Workbench into the Applications folder and run *NVIDIA AI
Workbench* from the application launcher.
4. Select Docker as your container runtime.
5. Log into your GitHub Account by using the *Sign in through
GitHub.com* option.
6. Enter your git author information if requested.
Expand this section for an Ubuntu install.
For full instructions, see the [NVIDIA AI Workbench User
Guide](https://docs.nvidia.com/ai-workbench/user-guide/latest/installation/ubuntu-local.html).
Run this installation as the user who will be using Workbench. Do not
run these steps as `root`.
1. Install Prerequisite Software
1. *\[OPTIONAL\]* If Visual Studio Code integration is desired,
install [Visual Studio Code](https://code.visualstudio.com/).
2. Download the [NVIDIA AI
Workbench](https://www.nvidia.com/en-us/deep-learning-ai/solutions/data-science/workbench/)
installer, make it executable, and then run it. You can make the
file executable with the following command:
``` bash
chmod +x NVIDIA-AI-Workbench-*.AppImage
```
3. AI Workbench will install the NVIDIA drivers for you (if needed).
You will need to reboot your local machine after the drivers are
installed and then restart the AI Workbench installation by
double-clicking the NVIDIA AI Workbench icon on your desktop.
4. Select Docker as your container runtime.
5. Log into your GitHub Account by using the *Sign in through
GitHub.com* option.
6. Enter your git author information if requested.
Expand this section for a remote Ubuntu install.
For full instructions, see the [NVIDIA AI Workbench User
Guide](https://docs.nvidia.com/ai-workbench/user-guide/latest/installation/ubuntu-remote.html).
Run this installation as the user who will be using Workbench. Do not
run these steps as `root`.
1. Ensure SSH Key based authentication is enabled from the local
machine to the remote machine. If this is not currently enabled, the
following commands will enable this in most situations. Change
`REMOTE_USER` and `REMOTE-MACHINE` to reflect your remote address.
- From a Windows local client, use the following PowerShell:
``` powershell
ssh-keygen -f "C:\Users\local-user\.ssh\id_rsa" -t rsa -N '""'
type $env:USERPROFILE\.ssh\id_rsa.pub | ssh REMOTE_USER@REMOTE-MACHINE "cat >> .ssh/authorized_keys"
```
- From a MacOS or Linux local client, use the following shell:
``` bash
if [ ! -e ~/.ssh/id_rsa ]; then ssh-keygen -f ~/.ssh/id_rsa -t rsa -N ""; fi
ssh-copy-id REMOTE_USER@REMOTE-MACHINE
```
2. SSH into the remote host. Then, use the following commands to
download and execute the NVIDIA AI Workbench Installer.
``` bash
mkdir -p $HOME/.nvwb/bin && \
curl -L https://workbench.download.nvidia.com/stable/workbench-cli/$(curl -L -s https://workbench.download.nvidia.com/stable/workbench-cli/LATEST)/nvwb-cli-$(uname)-$(uname -m) --output $HOME/.nvwb/bin/nvwb-cli && \
chmod +x $HOME/.nvwb/bin/nvwb-cli && \
sudo -E $HOME/.nvwb/bin/nvwb-cli install
```
3. AI Workbench will install the NVIDIA drivers for you (if needed).
You will need to reboot your remote machine after the drivers are
installed and then restart the AI Workbench installation by
re-running the commands in the previous step.
4. Select Docker as your container runtime.
5. Log into your GitHub Account by using the *Sign in through
GitHub.com* option.
6. Enter your git author information if requested.
7. Once the remote installation is complete, the Remote Location can be
added to the local AI Workbench instance. Open the AI Workbench
application, click *Add Remote Location*, and then enter the
required information. When finished, click *Add Location*.
- *Location Name:* Any short name for this new location.
- *Description:* Any brief metadata for this location.
- *Hostname or IP Address:* The hostname or address used to remotely
SSH. If step 1 was followed, this should be the same as
`REMOTE-MACHINE`.
- *SSH Port:* Usually left blank. If a nonstandard SSH port is used,
it can be configured here.
- *SSH Username:* The username used for making an SSH connection. If
step 1 was followed, this should be the same as `REMOTE_USER`.
- *SSH Key File:* The path to the private key for making SSH
connections. If step 1 was followed, this should be:
`/home/USER/.ssh/id_rsa`.
- *Workbench Directory:* Usually left blank. This is where Workbench
will remotely save state.
Expand this section for details on downloading this project.
1. Open the local NVIDIA AI Workbench window. From the list of
locations displayed, select either the remote one you just set up,
or local if you're going to work locally.

2. Once inside the location, select *Clone Project*.

3. In the *Clone Project* pop-up window, set the Repository URL to
`https://github.com/NVIDIA/nim-anywhere.git`. You can leave the Path
as the default of
`/home/REMOTE_USER/nvidia-workbench/nim-anywhere.git`. Click
*Clone*.

4. You will be redirected to the new project’s page. Workbench will
automatically bootstrap the development environment. You can view
real-time progress by expanding the Output from the bottom of the
window.

Expand this section for details on configuring this project.
1. Before running for the first time, your NGC personal key must be
configured in Workbench. This is done using the *Environment* tab
from the left-hand panel.

2. Scroll down to the **Secrets** section and find the *NGC_API_KEY*
entry. Press *Configure* and provide the personal key for NGC that
was generated earlier.
Expand this section for details on starting the demo application.
> **HINT:** For each application, the debug output can be monitored in
> the UI by clicking the Output link in the lower left corner, selecting
> the dropdown menu, and choosing the application of interest (or
> **Compose** for applications started via compose).
Since you can either pull NIMs and run them locally or use the
endpoints from *ai.nvidia.com*, you can run this project with *or*
without GPUs.
1. The applications bundled in this workspace can be controlled by
navigating to two tabs:
- **Environment** \> **Compose**
- **Environment** \> **Applications**
2. First, navigate to the **Environment** \> **Compose** tab. If you're
not working in an environment with GPUs, you can just click
**Start** to run the project using a lightweight deployment. This
default configuration will run the following containers:
- *Milvus Vector DB*: An unstructured knowledge base
- *Redis*: Used to store conversation histories
3. If you have access to GPU resources and want to run any NIMs
locally, use the dropdown menu under **Compose** and select which
set of NIMs you want to run locally. Note that you *must* have at
least 1 available GPU per NIM you plan to run locally. Below is an
outline of the available configurations:
- Local LLM (min 1 GPU required)
- The first time the LLM NIM is started, it will take some time to
download the image and the optimized models.
- During a long start, to confirm the LLM NIM is starting, the
progress can be observed by viewing the logs by using the
*Output* pane on the bottom left of the UI.
- If the logs indicate an authentication error, that means the
provided *NGC_API_KEY* does not have access to the NIMs.
Please verify it was generated correctly and in an NGC
organization that has NVIDIA AI Enterprise support or trial.
- If the logs appear to be stuck on `..........: Pull complete`,
`..........: Verifying complete`, or
`..........: Download complete`, this is normal output from
Docker indicating that the various layers of the container image
have been downloaded.
- Any other failures here need to be addressed.
- Local LLM + Embedding (min 2 GPUs required)
- Local LLM + Embedding + Reranking (min 3 GPUs required)
> **NOTE:**
>
> - Each profile will also run *Milvus Vector DB* and *Redis*
> - Due to the nature of Docker Compose profiles, the UI will let
> you select multiple profiles at the same time. In the context of
> this project, selecting multiple profiles does not make sense.
> It will not cause any errors, however we recommend only
> selecting one profile at a time for simplicity.
4. Once the compose services have been started, navigate to the
**Environment** \> **Applications** tab. Now, the *Chain Server* can
safely be started. This contains the custom LangChain code for
performing our reasoning chain. By default, it will use the local
Milvus and Redis, but use *ai.nvidia.com* for LLM, Embedding, and
Reranking model inferencing.
5. Once the *Chain Server* is up, the *Chat Frontend* can be started.
Starting the interface will automatically open it in a browser
window. If you are running any local NIMs, you can edit the config
to connect to them via the *Chat Frontend*.
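As a sketch, routing LLM inference to a locally running NIM might look like the following `config.yaml` fragment. The port `8000` and the `/v1` path are assumptions about the local NIM's OpenAI-compatible endpoint, not values confirmed by this README; check the logs of your local NIM for its actual address.

``` yaml
# Hypothetical override: route LLM inference to a local NIM instead of
# ai.nvidia.com. The port (8000) and path (/v1) are assumptions.
llm_model:
    name: meta/llama3-8b-instruct
    url: http://localhost:8000/v1
```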

*Architecture overview (diagram not rendered):* the **Chat Frontend**
(interactive demo UI) talks to the **Chain Server** (LangChain + NIMs),
which integrates with **Redis** (conversation history), the **Milvus**
vector database, and the **LLM NIM** (optimized LLMs). Evaluation
notebooks are included to validate the results, along with notebooks
for advanced usage.
# Application Configuration
The Chain Server can be configured with either a configuration file or
environment variables.
## Config from a file
By default, the application will search for a configuration file in all
of the following locations. If multiple configuration files are found,
values from lower files in the list will take precedence.
- ./config.yaml
- ./config.yml
- ./config.json
- ~/app.yaml
- ~/app.yml
- ~/app.json
- /etc/app.yaml
- /etc/app.yml
- /etc/app.json
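The search-and-merge behavior can be illustrated with a small sketch. This is hypothetical, not the application's actual loader: only JSON files are read to keep the example dependency-free (YAML behaves analogously), and the shallow key-by-key merge is an assumption.

``` python
# Sketch of layered config: read candidates in search order and let
# files lower in the list override earlier ones.
import json
from pathlib import Path

SEARCH_ORDER = [
    "./config.yaml", "./config.yml", "./config.json",
    "~/app.yaml", "~/app.yml", "~/app.json",
    "/etc/app.yaml", "/etc/app.yml", "/etc/app.json",
]

def merge(layers: list) -> dict:
    """Later layers override earlier ones, key by key (shallow merge)."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)
    return merged

def load_layered_config() -> dict:
    """Read every existing JSON candidate in order and merge the results."""
    layers = []
    for candidate in SEARCH_ORDER:
        path = Path(candidate).expanduser()
        if path.suffix == ".json" and path.is_file():
            layers.append(json.loads(path.read_text()))
    return merge(layers)
```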
## Config from a custom file
An additional config file path can be specified through an environment
variable named `APP_CONFIG`. The value in this file will take precedence
over all the default file locations.
``` bash
export APP_CONFIG=/etc/my_config.yaml
```
## Config from env vars
Configuration can also be set using environment variables. The variable
names take the form `APP_FIELD__SUB_FIELD`. Values specified as
environment variables will take precedence over all values from files.
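As an illustration of the double-underscore convention, a hypothetical parser (not the application's actual loader) might map the variables like this. Shortcut variables without the `APP_` prefix, such as `NGC_API_KEY`, are handled separately by the application and are not covered here.

``` python
# Sketch: map APP_-prefixed env vars onto a nested config dict.
# APP_LLM_MODEL__NAME=x  ->  {"llm_model": {"name": "x"}}
def env_to_config(environ: dict) -> dict:
    config: dict = {}
    for key, value in environ.items():
        if not key.startswith("APP_"):
            continue
        # "__" separates nesting levels; field names are lowercased.
        parts = [p.lower() for p in key[len("APP_"):].split("__")]
        node = config
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return config

print(env_to_config({"APP_LLM_MODEL__NAME": "meta/llama3-8b-instruct"}))
# {'llm_model': {'name': 'meta/llama3-8b-instruct'}}
```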
## Chain Server config schema
``` yaml
# Your API key for authentication to AI Foundation.
# ENV Variables: NGC_API_KEY, NVIDIA_API_KEY, APP_NVIDIA_API_KEY
# Type: string, null
nvidia_api_key: ~

# The Data Source Name for your Redis DB.
# ENV Variables: APP_REDIS_DSN
# Type: string
redis_dsn: redis://localhost:6379/0

llm_model:
    # The name of the model to request.
    # ENV Variables: APP_LLM_MODEL__NAME
    # Type: string
    name: meta/llama3-8b-instruct

    # The URL to the model API.
    # ENV Variables: APP_LLM_MODEL__URL
    # Type: string
    url: https://integrate.api.nvidia.com/v1

embedding_model:
    # The name of the model to request.
    # ENV Variables: APP_EMBEDDING_MODEL__NAME
    # Type: string
    name: nvidia/nv-embedqa-e5-v5

    # The URL to the model API.
    # ENV Variables: APP_EMBEDDING_MODEL__URL
    # Type: string
    url: https://integrate.api.nvidia.com/v1

reranking_model:
    # The name of the model to request.
    # ENV Variables: APP_RERANKING_MODEL__NAME
    # Type: string
    name: nv-rerank-qa-mistral-4b:1

    # The URL to the model API.
    # ENV Variables: APP_RERANKING_MODEL__URL
    # Type: string
    url: https://integrate.api.nvidia.com/v1

milvus:
    # The host machine running Milvus vector DB.
    # ENV Variables: APP_MILVUS__URL
    # Type: string
    url: http://localhost:19530

    # The name of the Milvus collection.
    # ENV Variables: APP_MILVUS__COLLECTION_NAME
    # Type: string
    collection_name: collection_1

log_level:
```
## Chat Frontend config schema
The chat frontend has a few configuration options as well. They can be
set in the same manner as the chain server.
``` yaml
# The URL to the chain on the chain server.
# ENV Variables: APP_CHAIN_URL
# Type: string
chain_url: http://localhost:3030/
# The url prefix when this is running behind a proxy.
# ENV Variables: PROXY_PREFIX, APP_PROXY_PREFIX
# Type: string
proxy_prefix: /
# Path to the chain server's config.
# ENV Variables: APP_CHAIN_CONFIG_FILE
# Type: string
chain_config_file: ./config.yaml
log_level:
```
# Contributing
All feedback and contributions to this project are welcome. When making
changes to this project, either for personal use or for contributing, it
is recommended to work on a fork of this project. Once the changes have
been completed on the fork, a Merge Request should be opened.
## Code Style
This project has been configured with Linters that have been tuned to
help the code remain consistent while not being overly burdensome. We
use the following Linters:
- Bandit is used for security scanning
- Pylint is used for Python Syntax Linting
- MyPy is used for type hint linting
- Black is configured for code styling
- A custom check is run to ensure Jupyter Notebooks do not have any
output
- Another custom check is run to ensure the README.md file is up to date
The embedded VSCode environment is configured to run the linting and
checking in realtime.
To manually run the linting that is done by the CI pipelines, execute
`/project/code/tools/lint.sh`. Individual tests can be run by specifying
them by name:
`/project/code/tools/lint.sh [deps|pylint|mypy|black|docs|fix]`. Running
the lint tool in fix mode will automatically correct what it can by
running Black, updating the README, and clearing the cell output on all
Jupyter Notebooks.
## Updating the frontend
The frontend has been designed in an effort to minimize the required
HTML and Javascript development. A branded and styled Application Shell
is provided that has been created with vanilla HTML, Javascript, and
CSS. It is designed to be easy to customize, but it should never be
required. The interactive components of the frontend are all created in
Gradio and mounted in the app shell using iframes.
Along the top of the app shell is a menu listing the available views.
Each view may have its own layout consisting of one or a few pages.
### Creating a new page
Pages contain the interactive components for a demo. The code for the
pages is in the `code/frontend/pages` directory. To create a new page:
1. Create a new folder in the pages directory
2. Create an `__init__.py` file in the new directory that uses Gradio
to define the UI. The Gradio Blocks layout should be defined in a
variable called `page`.
3. It is recommended that any CSS and JS files needed for this view be
saved in the same directory. See the `chat` page for an example.
4. Open the `code/frontend/pages/__init__.py` file, import the new
page, and add the new page to the `__all__` list.
> **NOTE:** Creating a new page will not add it to the frontend. It must
> be added to a view to appear on the Frontend.
### Adding a view
Views consist of one or a few pages and should function independently of
each other. Views are all defined in the `code/frontend/server.py`
module. All declared views will automatically be added to the Frontend's
menu bar and made available in the UI.
To define a new view, modify the list named `views`. This is a list of
`View` objects. The order of the objects will define their order in the
Frontend menu. The first defined view will be the default.
View objects describe the view name and layout. They can be declared as
follows:
``` python
my_view = frontend.view.View(
    name="My New View",  # the name in the menu
    left=frontend.pages.sample_page,  # the page to show on the left
    right=frontend.pages.another_page,  # the page to show on the right
)
```
All of the page declarations, `View.left` or `View.right`, are optional.
If they are not declared, then the associated iframes in the web layout
will be hidden. The other iframes will expand to fill the gaps. The
following diagrams show the various layouts.
- All pages are defined
``` mermaid
block-beta
columns 1
menu["menu bar"]
block
columns 2
left right
end
```
- Only left is defined
``` mermaid
block-beta
columns 1
menu["menu bar"]
block
columns 1
left:1
end
```
### Frontend branding
The frontend contains a few branded assets that can be customized for
different use cases.
#### Logo
The frontend contains a logo on the top left of the page. To modify the
logo, an SVG of the desired logo is required. The app shell can then be
easily modified to use the new SVG by modifying the
`code/frontend/_assets/index.html` file. There is a single `div` with an
ID of `logo`. This box contains a single SVG. Update this to the desired
SVG definition.
``` html