# Contents

Development Container Features (Devcontainer Features) are essential for standardizing development environments across various teams and projects.

They allow developers to easily share and reuse environment setups, ensuring consistency and reducing the time spent configuring environments. Devcontainer features can be customized to meet the specific needs of a project, making them highly versatile and effective for various development scenarios.

In this guide, we will walk you through creating a Devcontainer feature, using Hugging Face as an example. Hugging Face is a leading library for machine learning (ML) and natural language processing (NLP) tasks, making it an excellent choice for demonstrating how to create a feature that can be widely beneficial. By the end of this guide, you will have developed a fully functional Devcontainer feature that can be seamlessly integrated into any development workflow.

Why Create a Devcontainer Feature?

Devcontainer features are essential for teams that need a consistent development environment. They allow developers to add specific tools, libraries, or configurations to their development environments, ensuring consistency across different projects and teams. This is particularly useful in complex projects involving ML and NLP, where setting up environments can be time-consuming and error-prone.

Using Hugging Face as an example highlights the importance of having a pre-configured environment tailored for ML and NLP tasks. This feature can save developers time and reduce the potential for setup errors, enabling them to focus more on development and less on configuration.

Development Container Features are self-contained, shareable units of installation code and development container configuration. By using Devcontainer features, teams can share these optimized environments, ensuring that all members have access to a consistent and reliable setup that is tailored to the specific needs of the project.

Prerequisites

To follow along with this guide, you should have a basic understanding of Python, shell scripting, and containerization.

You will also need the following tools installed: Docker, Visual Studio Code, the Remote - Containers extension, and the Dev Container CLI.

  • Docker: Used for containerizing applications, ensuring consistency across different environments.

  • Visual Studio Code: An Integrated Development Environment (IDE) that supports Devcontainer development.

  • Remote - Containers Extension: A VS Code extension that allows you to open any folder inside a container and take advantage of Visual Studio Code’s full feature set.

  • Dev Container CLI: A command-line tool used for working with Devcontainer features, enabling you to test and manage your Devcontainer setups.

Installing the Dev Container CLI

To install the Dev Container CLI, you can use npm:

npm install -g @devcontainers/cli

This command installs the CLI globally, making it available for use in testing and managing Devcontainer features throughout this guide.
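Before moving on, it can help to confirm that the prerequisite tools are actually on your PATH. The following is a minimal sketch, not part of the starter repository: the function name missing_tools is hypothetical, and it assumes the VS Code launcher is available as code and the CLI as devcontainer.

```shell
#!/bin/sh
# Print each requested command that cannot be found on PATH.
# (missing_tools is a hypothetical helper name used for illustration.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools this guide relies on: Docker, the VS Code launcher, npm, and the Dev Container CLI.
missing=$(missing_tools docker code npm devcontainer)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing
else
    echo "All prerequisite tools found."
fi
```

If the script reports a missing tool, install it before continuing with Step 1.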

TL;DR
  • Learn to create a Devcontainer feature, enhancing development consistency and efficiency.

  • Benefits: Standardizes environments, reduces setup time, and supports ML/NLP tasks.

  • Tools Needed: Docker, VS Code, Remote - Containers extension, Dev Container CLI.

  • Worked example: a Hugging Face Devcontainer feature for consistent ML/NLP environments.

Step 1: Preparations

Fork the Starter Repository

To create your custom Devcontainer feature, start by forking the feature-starter repository on GitHub. This repository provides a solid foundation for building new Devcontainer features.

  • Step 1.1: Fork the Repository

Click on the "Fork" button at the top right of the repository page to create a copy under your GitHub account.

  • Step 1.2: Clone this repository:

Once forked, clone the repository to your local machine using the following commands:

git clone https://github.com/your-username/feature-starter.git
cd feature-starter

  • Step 1.3: Open the project in VS Code:

code .

  • Step 1.4: When prompted, click "Reopen in Container" to develop inside a container with the necessary tools installed.

Note

Ensure your development environment is set up with Docker, Visual Studio Code, and the Remote - Containers extension.

Step 2: Main Process

Create a New Feature Directory

Inside the src directory of your forked repository, create a new folder named huggingface. This folder will contain all the necessary configuration files and scripts for the Hugging Face feature.

mkdir -p src/huggingface

Configure devcontainer-feature.json

The devcontainer-feature.json file defines the metadata and options for the Hugging Face feature. Below is an example configuration:

{
  "name": "Hugging Face",
  "id": "huggingface",
  "version": "1.0.3",
  "description": "Installs Hugging Face libraries and tools for NLP and ML tasks.",
  "options": {
    "version": {
      "type": "string",
      "proposals": ["latest", "4.28.1", "4.27.4", "4.26.0"],
      "default": "latest",
      "description": "Select the version of Hugging Face Transformers to install."
    },
    "cuda": {
      "type": "boolean",
      "default": false,
      "description": "Enable this option to install the CUDA-enabled version of PyTorch for GPU support."
    },
    "datasets_version": {
      "type": "string",
      "proposals": ["latest", "1.14.0", "1.13.3"],
      "default": "latest",
      "description": "Select the version of Hugging Face Datasets to install."
    },
    "tokenizers_version": {
      "type": "string",
      "proposals": ["latest", "0.11.6", "0.11.5"],
      "default": "latest",
      "description": "Select the version of Tokenizers to install."
    }
  },
  "installsAfter": ["ghcr.io/devcontainers/features/common-utils"]
}

Note

Ensure that descriptions are clear and informative, making it easy for users to understand what each option does. This step is crucial for providing a seamless user experience.
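Each option declared in devcontainer-feature.json reaches install.sh as an environment variable. The sketch below illustrates how the option ids used in this guide map to variable names. It assumes the sanitization rule described in the Dev Container Features specification (non-word characters become underscores, then the name is upper-cased); the helper name option_env_name is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Sketch of the option-id -> environment-variable mapping.
# Assumption: mirrors the sanitization rule in the Features specification.
option_env_name() {
    printf '%s\n' "$1" \
        | sed -e 's/[^A-Za-z0-9_]/_/g' -e 's/^[0-9_][0-9_]*/_/' \
        | tr '[:lower:]' '[:upper:]'
}

# Show the variable name install.sh would read for each option in this guide.
for id in version cuda datasets_version tokenizers_version; do
    echo "$id -> $(option_env_name "$id")"
done
```

This is why the option datasets_version can be read as $DATASETS_VERSION inside the installation script.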

Implement install.sh

The install.sh script is the core of the feature’s installation process. Below is an optimized version of the script that incorporates modular functions and error handling.

#!/bin/bash
set -e

echo "Activating feature 'huggingface'"

# Install a Python package, pinning the version unless "latest" is requested
install_python_package() {
    local package_name=$1
    local package_version=$2
    if [ "$package_version" = "latest" ]; then
        pip install "$package_name"
    else
        pip install "$package_name==$package_version"
    fi
}

# Ensure Python and pip are available
if ! command -v python3 &> /dev/null; then
    apt-get update && apt-get install -y python3 python3-pip python3-venv
else
    echo "Python3 is already installed."
fi

# Set up a virtual environment if not present
VENV_PATH="/opt/huggingface-venv"
if [ ! -d "$VENV_PATH" ]; then
    python3 -m venv "$VENV_PATH"
    echo "Virtual environment created at $VENV_PATH."
else
    echo "Using existing virtual environment at $VENV_PATH."
fi

# Activate the virtual environment
source "$VENV_PATH/bin/activate"

# Upgrade pip to the latest version
pip install --upgrade pip

# Install required Python packages
install_python_package "transformers" "$VERSION"

# Install PyTorch with or without CUDA support
if [ "$CUDA" = "true" ]; then
    pip install torch torchvision torchaudio
else
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
fi

install_python_package "datasets" "$DATASETS_VERSION"
install_python_package "tokenizers" "$TOKENIZERS_VERSION"
pip install sentencepiece huggingface_hub

# Create a helper that activates the virtual environment.
# It must be sourced (source activate-huggingface) to affect the current shell.
cat <<EOL > /usr/local/bin/activate-huggingface
#!/bin/bash
source $VENV_PATH/bin/activate
EOL
chmod +x /usr/local/bin/activate-huggingface

echo "Hugging Face setup completed successfully."

Note

The script allows users to customize their installation through the provided options, making the feature flexible for various use cases.
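To see how the options translate into pip commands without installing anything, the version handling can be isolated into a small dry-run sketch. The helper name pip_spec is hypothetical, and the fallback to latest when a variable is unset is an assumption added here so the snippet can be run by hand outside a feature installation.

```shell
#!/bin/sh
# Fall back to the defaults from devcontainer-feature.json when run by hand.
VERSION="${VERSION:-latest}"
DATASETS_VERSION="${DATASETS_VERSION:-latest}"
TOKENIZERS_VERSION="${TOKENIZERS_VERSION:-latest}"

# Build a pip requirement specifier: "name" for latest, "name==X" otherwise.
# (pip_spec is a hypothetical helper name used for illustration.)
pip_spec() {
    if [ -z "$2" ] || [ "$2" = "latest" ]; then
        printf '%s\n' "$1"
    else
        printf '%s==%s\n' "$1" "$2"
    fi
}

# Dry run: print the commands instead of executing them.
echo "pip install $(pip_spec transformers "$VERSION")"
echo "pip install $(pip_spec datasets "$DATASETS_VERSION")"
echo "pip install $(pip_spec tokenizers "$TOKENIZERS_VERSION")"
```

Running it with, for example, VERSION=4.28.1 set in the environment shows the pinned specifier that would be passed to pip.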

Step 3: Confirmation

After implementing the feature, confirm that everything works as expected by testing it locally. Step 4 then covers using GitHub Actions to run the tests automatically.

Adding a Basic Test Script

To ensure that the feature installs the correct versions of the Hugging Face libraries, create a basic test script. This test will validate the installation by checking the versions of the installed packages.

Create a new directory named test inside the feature directory and change into it:

mkdir -p src/huggingface/test
cd src/huggingface/test

Inside this directory, create a test script version_check.sh:

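#!/bin/bash
set -e

# Interpreter from the virtual environment created by install.sh
PYTHON_BIN="/opt/huggingface-venv/bin/python"

check_version() {
    local package_name=$1
    local expected_version=$2
    local installed_version
    installed_version=$("$PYTHON_BIN" -c "import $package_name; print($package_name.__version__)")

    # "latest" (or an unset value) cannot be compared, so only report it
    if [ -z "$expected_version" ] || [ "$expected_version" = "latest" ]; then
        echo "$package_name is installed: $installed_version"
    elif [ "$installed_version" == "$expected_version" ]; then
        echo "$package_name version is correct: $installed_version"
    else
        echo "$package_name version is incorrect: $installed_version (expected: $expected_version)"
        exit 1
    fi
}

# Example checks
check_version "transformers" "$VERSION"
# TORCH_VERSION is not a feature option; set it yourself to pin-check torch
check_version "torch" "$TORCH_VERSION"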

This script will ensure that the versions of transformers, torch, and any other critical packages match the expected versions.

Integrate the Test Script into the Devcontainer

In the devcontainer-feature.json file that you configured in Step 2, add the following postCreateCommand as a top-level property (for example after installsAfter, separated by a comma) to run the test automatically after the container is created. Adjust the path so that it points to where version_check.sh lives inside your container:

"postCreateCommand": "bash /workspaces/src/huggingface/test/version_check.sh"

This configuration will automatically run the version_check.sh script after the Devcontainer is created, ensuring that all installations are correct and up-to-date.

Running the Test

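You can also test locally using the Dev Container CLI, run from the repository root. In the feature-starter layout, the CLI looks for test scripts under test/huggingface at the repository root (test.sh by default), so copy or adapt version_check.sh there if you want it to run as part of this command:

devcontainer features test -f huggingface -i mcr.microsoft.com/devcontainers/base:ubuntu .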

Ensure all installations are completed successfully and that the Hugging Face environment is functional.
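To exercise a non-default option such as a pinned Transformers version, the CLI's test command can also run named scenarios. The sketch below generates a scenarios.json and a matching check script. It assumes the test/<feature-id>/ layout used by the starter repository and the dev-container-features-test-lib helper (check, reportResults) that the CLI provides during a test run; the scenario name pinned_transformers and the function write_pinned_scenario are hypothetical.

```shell
#!/bin/sh
# Write a scenario that pins the transformers version, plus its check script.
# Assumption: tests live under test/<feature-id>/ at the repository root.
write_pinned_scenario() {
    test_dir=$1
    mkdir -p "$test_dir"

    # Scenario definition: base image plus the feature options to test.
    cat > "$test_dir/scenarios.json" <<'EOF'
{
    "pinned_transformers": {
        "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
        "features": {
            "huggingface": {
                "version": "4.28.1"
            }
        }
    }
}
EOF

    # Check script named after the scenario; it runs inside the test container.
    cat > "$test_dir/pinned_transformers.sh" <<'EOF'
#!/bin/bash
set -e
# Helper library provided by the Dev Container CLI during a test run.
source dev-container-features-test-lib

check "transformers is pinned" bash -c \
    "/opt/huggingface-venv/bin/python -c 'import transformers; print(transformers.__version__)' | grep -x '4.28.1'"

reportResults
EOF
    chmod +x "$test_dir/pinned_transformers.sh"
}

# Example: generate the files inside the repository's test directory.
# write_pinned_scenario test/huggingface
```

The generated check script is only meaningful inside a CLI test run, since the helper library is not available in a normal shell.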

Step 4: Automated Testing

To automate testing whenever changes are pushed to your repository, set up GitHub Actions. Automated tests help catch issues early in the development process, ensuring that your feature remains reliable as it evolves.

Note

Never skip the testing phase; it ensures reliability and functionality. Testing not only validates your code but also builds confidence in the stability of the feature.
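As a rough idea of what an automated job could execute, the sketch below discovers every feature under src and prints the test command for each one. It is a dry run under stated assumptions: the function name list_features is hypothetical, the base image is the one used earlier in this guide, and a real workflow step would run the printed commands rather than echo them.

```shell
#!/bin/sh
# List feature ids: every directory under the given src directory
# that contains a devcontainer-feature.json.
list_features() {
    src_dir=$1
    for dir in "$src_dir"/*/; do
        if [ -f "${dir}devcontainer-feature.json" ]; then
            basename "$dir"
        fi
    done
}

# Print the test command for each feature (dry run).
BASE_IMAGE="mcr.microsoft.com/devcontainers/base:ubuntu"
for feature in $(list_features src); do
    echo "devcontainer features test -f $feature -i $BASE_IMAGE ."
done
```

Run from the repository root, this prints one command per feature, which makes it easy to see what a CI job would test before wiring it into GitHub Actions.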

Step 5: Publish the Feature

Once you have completed and tested the feature, push it to your repository and publish it using GitHub Actions. This process will make your feature available for others to use and contribute to.

Push Changes to GitHub

Use the following commands to push the developed feature to your GitHub repository.

git add .
git commit -m "Add Hugging Face devcontainer feature"
git push origin main

Run the Workflow

  1. Navigate to the Actions tab in your GitHub repository.

  2. Run the Release dev container features & Generate Documentation workflow. This workflow will package your feature and update the documentation automatically.

Set Visibility

Ensure that the package visibility is set to public in the repository's Packages settings. This step is essential for making your feature accessible to the community.
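Once the package is public, another project can reference the feature from its devcontainer.json. The sketch below writes such a file. It assumes the feature is published under ghcr.io/<owner>/feature-starter/huggingface with major version tag 1, which follows the naming used by the starter repository but depends on your repository name and release; the function write_devcontainer_json and the owner your-username are placeholders.

```shell
#!/bin/sh
# Write a devcontainer.json that references the published feature.
# Assumption: published as ghcr.io/<owner>/feature-starter/huggingface:1.
write_devcontainer_json() {
    project_dir=$1
    owner=$2
    mkdir -p "$project_dir/.devcontainer"
    cat > "$project_dir/.devcontainer/devcontainer.json" <<EOF
{
    "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
    "features": {
        "ghcr.io/$owner/feature-starter/huggingface:1": {
            "version": "4.28.1",
            "cuda": false
        }
    }
}
EOF
}

# Example (writes into the current project):
# write_devcontainer_json . your-username
```

The option keys in the generated file match the options declared in devcontainer-feature.json, so consumers can pin versions or enable CUDA per project.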

Examples of Great Devcontainer Features

To inspire your work, here are a few examples of highly effective Devcontainer features:

  1. GitHub CLI: A feature that sets up the GitHub CLI in a Devcontainer, making it easy to manage GitHub repositories from within your containerized environment.

  2. Kubectl-Helm-Minikube: This feature installs the latest version of kubectl, Helm, and optionally minikube. It auto-detects the latest versions and installs the needed dependencies.

  3. Docker-in-Docker: This feature enables Docker to run inside a container, allowing you to test Dockerized applications from within your Devcontainer.

  4. Node.js: A feature that installs Node.js along with npm or Yarn, enabling JavaScript and TypeScript development in a consistent environment.

  5. Python: Sets up Python with popular tools like pip, Poetry, and venv, ensuring a reliable Python environment for development.

For more available features, refer here.

Conclusion

In this guide, we have walked through the essential steps to create a Devcontainer feature, focusing on setting up a Hugging Face environment and emphasizing the importance of automating and simplifying the setup process for complex tools used in ML and NLP. From forking the starter repository to implementing the install.sh script, we have covered how to define metadata, configure options, and test the feature locally. These steps ensure that your feature is both robust and flexible, meeting the needs of various development scenarios.

One of the key takeaways from this guide is the importance of customization and testing. By providing users with configurable options, such as selecting specific versions of libraries or enabling CUDA support, you are making the feature adaptable to different project requirements. Moreover, testing your feature both locally and through automated CI pipelines ensures reliability and helps catch potential issues early, contributing to a smoother development experience for your users.

As you move forward, remember that Devcontainer features are a powerful tool for standardizing development environments across teams. By creating your own features, you can streamline the setup process, reduce the chances of misconfiguration, and ultimately increase productivity. Now that you have the knowledge and tools, consider contributing back to the community by sharing your features, or explore the many existing features available here to enhance your own development workflow.

You can find the complete implementation of the Hugging Face Devcontainer feature in the Hugging Face Devcontainer Feature Repository. This repository includes all the scripts and configurations discussed in this guide, allowing you to easily clone and adapt the feature for your own projects.
