
Analyze Data with AI

You can use Daytona Sandbox to run AI-generated code to analyze data. Here’s how the AI data analysis workflow typically looks:

  1. Your user has a dataset in CSV or another format.
  2. You prompt the LLM to generate code (usually Python) based on the user’s data.
  3. The sandbox runs the AI-generated code and returns the results.
  4. You display the results to the user.

Build an AI Data Analyst with Daytona

This example shows how to build an AI-powered data analyst that automatically generates insights and visualizations from CSV data using Daytona’s secure sandbox environment.

What we’ll build: A system that analyzes a vehicle valuation dataset, identifies how price relates to manufacturing year, and generates professional visualizations, all through natural language prompts to Claude.

1. Project Setup

1.1 Install Dependencies

Install the Daytona SDK and the Anthropic SDK in your project:

pip install daytona anthropic python-dotenv

1.2 Configure Environment

Get your API keys and configure your environment:

  1. Daytona API key: Get it from Daytona Dashboard
  2. Anthropic API key: Get it from Anthropic Console

Create a .env file in your project:

DAYTONA_API_KEY=dtn_***
ANTHROPIC_API_KEY=sk-ant-***
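
Before moving on, you can optionally confirm that both keys are actually being picked up from the .env file. A minimal check using python-dotenv and the variable names above:

import os
from dotenv import load_dotenv

load_dotenv()

# Fail fast if either key is missing from the environment or the .env file
for key in ("DAYTONA_API_KEY", "ANTHROPIC_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to your .env file")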

2. Dataset Preparation

2.1 Download Dataset

We’ll be using a publicly available vehicle valuation dataset. You can download it directly from:

https://download.daytona.io/dataset.csv

Download the file and save it as dataset.csv in your project directory.
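
If you’d rather fetch the file from a script than the browser, a short sketch using Python’s standard library works just as well:

from urllib.request import urlretrieve

# Download the public vehicle valuation dataset into the project directory
urlretrieve("https://download.daytona.io/dataset.csv", "dataset.csv")
print("Saved dataset.csv")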

2.2 Initialize Sandbox

Now create a Daytona sandbox and upload your dataset:

from dotenv import load_dotenv
from daytona import Daytona

load_dotenv()

# Create sandbox
daytona = Daytona()
sandbox = daytona.create()

# Upload the dataset to the sandbox
sandbox.fs.upload_file("dataset.csv", "/home/daytona/dataset.csv")
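
As an optional sanity check (not part of the original example), you can run a small snippet inside the sandbox to confirm the upload and peek at the first rows. This reuses the sandbox created above and the same sandbox.process.code_run call used later in this guide, and assumes pandas is available in the sandbox image:

# Print the shape and first few rows of the uploaded CSV inside the sandbox
check = sandbox.process.code_run(
    "import pandas as pd\n"
    "df = pd.read_csv('/home/daytona/dataset.csv')\n"
    "print(df.shape)\n"
    "print(df.head())"
)
print(check.result)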

3. Building the AI Data Analyst

Now we’ll create the core functionality that connects Claude with Daytona to analyze data and generate visualizations.

3.1 Code Execution Handler

First, let’s create a function to handle code execution and chart extraction:

import base64

def run_ai_generated_code(sandbox, ai_generated_code):
    execution = sandbox.process.code_run(ai_generated_code)
    if execution.exit_code != 0:
        print('AI-generated code had an error.')
        print(execution.exit_code)
        print(execution.result)
        return

    # Check for charts in execution artifacts
    if not execution.artifacts or not execution.artifacts.charts:
        print('No charts found in execution artifacts')
        return

    result_idx = 0
    for result in execution.artifacts.charts:
        if result.png:
            # Save the PNG to a file (the png field is base64-encoded)
            with open(f'chart-{result_idx}.png', 'wb') as f:
                f.write(base64.b64decode(result.png))
            print(f'Chart saved to chart-{result_idx}.png')
            result_idx += 1
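
Before wiring up Claude, you can exercise this handler with a hand-written snippet to confirm chart capture works end to end. A quick smoke test that reuses the sandbox and function defined above, assuming matplotlib is available in the sandbox image:

# Smoke test: a tiny hard-coded plot instead of AI-generated code
test_code = (
    "import matplotlib.pyplot as plt\n"
    "plt.plot([2018, 2019, 2020], [15000, 17000, 21000])\n"
    "plt.title('Smoke test chart')\n"
    "plt.show()"
)
run_ai_generated_code(sandbox, test_code)  # should save chart-0.png locally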

3.2 Creating the Analysis Prompt

Next, we’ll create the prompt that tells Claude about our dataset and what analysis we want. This prompt includes:

  • Dataset schema and column descriptions
  • The specific analysis request (average price by manufacturing year)
  • Instructions for code generation

from anthropic import Anthropic

prompt = """
I have a CSV file with vehicle valuations saved in the sandbox at /home/daytona/dataset.csv.
Relevant columns:
- 'year': integer, the manufacturing year of the vehicle
- 'price_in_euro': float, the listed price of the vehicle in Euros
Analyze how price varies by manufacturing year.
Drop rows where 'year' or 'price_in_euro' is missing, non-numeric, or an outlier.
Create a line chart showing average price per year.
Write Python code that analyzes the dataset based on my request and produces the right chart.
Finish with plt.show()."""

anthropic = Anthropic()
print('Waiting for the model response...')

3.3 Tool Calling Setup

Now we’ll connect Claude to our Daytona sandbox using tool calling. Claude returns the Python code it generates as a structured tool call, which we then execute in the sandbox:

msg = anthropic.messages.create(
    model='claude-3-5-sonnet-20240620',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': prompt}],
    tools=[
        {
            'name': 'run_python_code',
            'description': 'Run Python code',
            'input_schema': {
                'type': 'object',
                'properties': {
                    'code': {
                        'type': 'string',
                        'description': 'The Python code to run',
                    },
                },
                'required': ['code'],
            },
        },
    ],
)
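
Before parsing the response, it can be worth checking whether Claude actually chose to call the tool. The Anthropic SDK exposes this via the message’s stop_reason field (an optional guard, not part of the original example):

# If Claude answered with plain text instead of a tool call, surface that early
if msg.stop_reason != 'tool_use':
    print(f'Model did not request a tool call (stop_reason={msg.stop_reason}).')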

3.4 Response Processing

Finally, we’ll parse Claude’s response and execute any generated code in our Daytona sandbox:

for content_block in msg.content:
    if content_block.type == 'tool_use':
        if content_block.name == 'run_python_code':
            code = content_block.input['code']
            print('Will run the following code in the sandbox:\n', code)
            # Execute the code in the sandbox
            run_ai_generated_code(sandbox, code)

That’s it! The run_ai_generated_code function we created handles saving charts automatically. When Claude’s generated code ends with plt.show(), Daytona captures the figure as a chart artifact, and the function decodes it and writes it to a PNG file.
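
When you’re done, it’s a good idea to clean up the sandbox so it doesn’t keep running. The exact call is an assumption to verify against the Daytona SDK reference for your version, not a confirmed API:

# Remove the sandbox once the analysis is finished.
# NOTE: the method name is an assumption; depending on the SDK version this may be
# daytona.delete(sandbox) or sandbox.delete() - check the Daytona docs.
daytona.delete(sandbox)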

Key advantages of this approach:

  • Secure execution: Code runs in isolated Daytona sandboxes
  • Automatic artifact capture: Charts, tables, and outputs are automatically extracted
  • Error handling: Built-in error detection and logging
  • Language agnostic: While we used Python here, Daytona supports multiple languages

4. Running Your Analysis

Now you can run the complete code to see the results.

python data-analysis.py

You should see a chart saved in your project directory that looks similar to this:

Vehicle valuation by manufacturing year chart

5. Complete Implementation

Here is the complete, ready-to-run example:

import base64

from dotenv import load_dotenv
from daytona import Daytona, Sandbox
from anthropic import Anthropic


def main():
    load_dotenv()

    # Create sandbox
    daytona = Daytona()
    sandbox = daytona.create()

    # Upload the dataset to the sandbox
    sandbox.fs.upload_file("dataset.csv", "/home/daytona/dataset.csv")

    prompt = """
I have a CSV file with vehicle valuations saved in the sandbox at /home/daytona/dataset.csv.
Relevant columns:
- 'year': integer, the manufacturing year of the vehicle
- 'price_in_euro': float, the listed price of the vehicle in Euros
Analyze how price varies by manufacturing year.
Drop rows where 'year' or 'price_in_euro' is missing, non-numeric, or an outlier.
Create a line chart showing average price per year.
Write Python code that analyzes the dataset based on my request and produces the right chart.
Finish with plt.show()."""

    anthropic = Anthropic()
    print('Waiting for the model response...')

    msg = anthropic.messages.create(
        model='claude-3-5-sonnet-20240620',
        max_tokens=1024,
        messages=[{'role': 'user', 'content': prompt}],
        tools=[
            {
                'name': 'run_python_code',
                'description': 'Run Python code',
                'input_schema': {
                    'type': 'object',
                    'properties': {
                        'code': {
                            'type': 'string',
                            'description': 'The Python code to run',
                        },
                    },
                    'required': ['code'],
                },
            },
        ],
    )

    for content_block in msg.content:
        if content_block.type == 'tool_use':
            if content_block.name == 'run_python_code':
                code = content_block.input['code']
                print('Will run the following code in the sandbox:\n', code)
                # Execute the code in the sandbox
                run_ai_generated_code(sandbox, code)


def run_ai_generated_code(sandbox: Sandbox, ai_generated_code: str):
    execution = sandbox.process.code_run(ai_generated_code)
    if execution.exit_code != 0:
        print('AI-generated code had an error.')
        print(execution.exit_code)
        print(execution.result)
        return

    # Iterate over the results and look for PNGs representing the generated charts.
    if not execution.artifacts or not execution.artifacts.charts:
        print('No charts found in execution artifacts')
        print(execution.artifacts)
        return

    result_idx = 0
    for result in execution.artifacts.charts:
        if result.png:
            # Save the PNG to a file (the png field is base64-encoded)
            with open(f'chart-{result_idx}.png', 'wb') as f:
                f.write(base64.b64decode(result.png))
            print(f'Chart saved to chart-{result_idx}.png')
            result_idx += 1


if __name__ == "__main__":
    main()