# Implement RAG Capabilities

> Learn how to enhance your flows with Retrieval-Augmented Generation (RAG) using the Mira Flows SDK

## Video Tutorial

<iframe width="560" height="315" src="https://www.youtube.com/embed/xs2gNPx_XNw" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

<br />

<Tip>Check out this <a href="https://github.com/B-Venkatesh7210/configure-datasets" target="_blank">GitHub repository</a> for a quick overview of how to implement RAG capabilities.</Tip>

<br />

### Understanding RAG Implementation

**Retrieval-Augmented Generation (RAG)** enhances your flows with specific domain knowledge. The implementation process involves three main stages:

1. 🗂️ **Create a Dataset**: Establish a knowledge base that contains specialized information your flow will reference during execution.

2. ➕ **Add Data Sources**: Populate your dataset with relevant information from various supported sources.

3. 🔗 **Link Dataset to Flow**: Connect the dataset to your flow, enabling it to leverage this information during processing.
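To make the idea behind these stages concrete, here is a toy sketch of retrieval followed by prompt augmentation. It is not how Mira retrieves data internally; the word-overlap scoring, the function names, and the sample passages are all assumptions chosen for illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, top_k=1):
    """Return the top_k passages sharing the most words with the question."""
    question_tokens = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda passage: len(question_tokens & tokenize(passage)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, passages):
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical knowledge base standing in for a dataset
knowledge_base = [
    "Refunds are processed within 5 business days.",
    "Support is available Monday through Friday.",
]
prompt = build_prompt("How long do refunds take?", knowledge_base)
```

Production systems typically rank passages with embeddings rather than word overlap, but the shape is the same: find relevant material, then hand it to the model alongside the question.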

### Creating Your Dataset

Begin by establishing your knowledge base:

```python Python theme={null}
from mira_sdk import MiraClient

client = MiraClient(config={"API_KEY": "YOUR_API_KEY"})        # Initialize client

# Create a new dataset
client.dataset.create(
    "author/dataset_name",                                     # Unique identifier
    "Description of your knowledge base"                       # Dataset purpose
)
```
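Rather than hard-coding the key, you may prefer to read it from the environment. The sketch below builds the same `config` dictionary shown above; the helper name and the `MIRA_API_KEY` variable name are assumptions for illustration, not SDK conventions.

```python
import os


def load_client_config(env_var="MIRA_API_KEY"):
    """Build the MiraClient config dict from an environment variable.

    The variable name is a hypothetical choice, not an SDK convention.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"API_KEY": api_key}


# Usage, with the client class shown above:
#   client = MiraClient(config=load_client_config())
```

Failing early with a clear message keeps a missing key from surfacing later as an opaque authentication error.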

### Adding Data Sources

Populate your dataset with information from various supported formats:

```python Python theme={null}
# Assumes `client` is the MiraClient instance created in the previous step

# Add a PDF document as a data source
client.dataset.add_source("author/dataset_name", file_path="document.pdf")

# Add a URL as a data source
client.dataset.add_source("author/dataset_name", url="https://example.com/data")

# Add multiple URL sources via a CSV file
client.dataset.add_source("author/dataset_name", file_path="sources.csv")
```
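If you have a mixed list of files and URLs, a small wrapper can pick the right keyword argument for each entry. This is a minimal sketch that uses only the `add_source` calls shown above; the helper name `populate_dataset` and the prefix check for URLs are assumptions for illustration.

```python
def populate_dataset(client, dataset_id, sources):
    """Add each source to the dataset, choosing url= or file_path= per entry.

    `client` is expected to expose dataset.add_source as shown above.
    """
    added = []
    for source in sources:
        if source.startswith(("http://", "https://")):
            # Treat anything that looks like a web address as a URL source
            client.dataset.add_source(dataset_id, url=source)
            added.append(("url", source))
        else:
            # Everything else is assumed to be a local file path
            client.dataset.add_source(dataset_id, file_path=source)
            added.append(("file_path", source))
    return added


# Usage:
#   populate_dataset(client, "author/dataset_name",
#                    ["document.pdf", "https://example.com/data"])
```

The returned list records how each entry was classified, which is handy for logging what went into the dataset.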

### Linking Dataset to Flow

Connect your dataset to an existing flow by modifying its configuration:

```yaml .yaml theme={null}
# Additional flow configuration with RAG

dataset:
  source: "author/dataset_name"         # Link to your dataset

# Rest of your flow configuration remains unchanged
```
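If you manage several flow files, the same `dataset` block can be appended programmatically. The sketch below works on the YAML text directly and assumes the block belongs at the top level, as in the snippet above; the helper name `link_dataset` is hypothetical, and the function does not parse or validate the rest of the flow.

```python
def link_dataset(flow_yaml, dataset_id):
    """Return flow_yaml with a top-level dataset block appended.

    Uses plain string handling, so it only detects an existing
    top-level `dataset:` key and does not parse the YAML.
    """
    has_dataset = any(
        line.startswith("dataset:") for line in flow_yaml.splitlines()
    )
    if has_dataset:
        raise ValueError("flow already has a dataset block")
    block = f'dataset:\n  source: "{dataset_id}"\n'
    return flow_yaml.rstrip("\n") + "\n\n" + block


# Usage (the file name is a placeholder):
#   with open("flow.yaml") as f:
#       updated = link_dataset(f.read(), "author/dataset_name")
```

Refusing to add a second `dataset` key avoids producing a file with duplicate top-level keys, which YAML parsers handle inconsistently.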

### Example

<Card title="Implement RAG capabilities" icon="github" href="https://github.com/B-Venkatesh7210/configure-datasets">
  Complete example showing how to set up and configure datasets for
  Retrieval-Augmented Generation (RAG) capabilities.
</Card>
