Development Container Features (Devcontainer Features) are essential for standardizing development environments across various teams and projects.
They allow developers to easily share and reuse environment setups, ensuring consistency and reducing the time spent configuring environments. Devcontainer features can be customized to meet the specific needs of a project, making them highly versatile and effective for various development scenarios.
In this guide, we will walk you through creating a Devcontainer feature, using Hugging Face as an example. Hugging Face maintains widely used libraries for machine learning (ML) and natural language processing (NLP) tasks, such as Transformers, Datasets, and Tokenizers, making it a good choice for demonstrating how to create a feature that many teams can reuse. By the end of this guide, you will have developed a working Devcontainer feature that can be integrated into a development workflow.
Why Create a Devcontainer Feature?
Devcontainer features are essential for teams that need to ensure a consistent development environment. They allow developers to easily add specific tools, libraries, or configurations to their development environments, ensuring consistency across different projects and teams. This is particularly useful in complex projects involving ML and NLP, where setting up environments can be time-consuming and error-prone.
Using Hugging Face as an example highlights the importance of having a pre-configured environment tailored for ML and NLP tasks. This feature can save developers time and reduce the potential for setup errors, enabling them to focus more on development and less on configuration.
Development Container Features are self-contained, shareable units of installation code and development container configuration. By using Devcontainer features, teams can share these optimized environments, ensuring that all members have access to a consistent and reliable setup that is tailored to the specific needs of the project.
Prerequisites
To follow along with this guide, you should have a basic understanding of Python, shell scripting and containerization.
You will also need the following tools installed: Docker, Visual Studio Code, the Remote - Containers extension, and the Dev Container CLI.
Docker: Used for containerizing applications, ensuring consistency across different environments.
Visual Studio Code: An Integrated Development Environment (IDE) that supports Devcontainer development.
Remote - Containers Extension (now published as Dev Containers): A VS Code extension that allows you to open any folder inside a container and take advantage of Visual Studio Code’s full feature set.
Dev Container CLI: A command-line tool used for working with Devcontainer features, enabling you to test and manage your Devcontainer setups.
Installing the Dev Container CLI
To install the Dev Container CLI, you can use npm:
npm install -g @devcontainers/cli
This command installs the CLI globally, making it available for use in testing and managing Devcontainer features throughout this guide.
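Before moving on, it can help to confirm that the tools listed above are actually on your PATH. The following is a minimal sketch, not part of the starter repository: check_tool is a hypothetical helper, and the binary names (docker, code, devcontainer) are assumptions about how these tools are exposed on your system.

```shell
#!/usr/bin/env bash
# Report whether each tool this guide relies on is available on PATH.

# Hypothetical helper: returns 0 if the named command exists, 1 otherwise.
check_tool() {
    command -v "$1" > /dev/null 2>&1
}

# Assumed binary names: "code" is the VS Code launcher and
# "devcontainer" is the binary installed by @devcontainers/cli.
for tool in docker code devcontainer; do
    if check_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

Any line reported as missing points to a prerequisite that still needs to be installed.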
TL;DR
Learn to create a Devcontainer feature, enhancing development consistency and efficiency.
Benefits: Standardizes environments, reduces setup time, and supports ML/NLP tasks.
Tools Needed: Docker, VS Code, Remote - Containers extension, Dev Container CLI.
Example: A Hugging Face Devcontainer feature for consistent ML/NLP environments.
Step 1: Preparations
Fork the Starter Repository
To create your custom Devcontainer feature, start by forking the feature-starter repository on GitHub. This repository provides a solid foundation for building new Devcontainer features.
Step 1.1: Fork the Repository
Click on the "Fork" button at the top right of the repository page to create a copy under your GitHub account.
Step 1.2: Clone the repository:
Once forked, clone the repository to your local machine using the following commands:
git clone https://github.com/your-username/feature-starter.git
cd feature-starter
Step 1.3: Open the project in VS Code:
code .
Step 1.4: When prompted, click "Reopen in Container" to develop inside a container with the necessary tools installed.
Note
Ensure your development environment is set up with Docker, Visual Studio Code, and the Remote - Containers extension.
Step 2: Main Process
Create a New Feature Directory
Inside the src directory of your forked repository, create a new folder named huggingface. This folder will contain all the necessary configuration files and scripts for the Hugging Face feature.
mkdir -p src/huggingface
Configure devcontainer-feature.json
The devcontainer-feature.json file defines the metadata and options for the Hugging Face feature. Below is an example configuration:
{
  "name": "Hugging Face",
  "id": "huggingface",
  "version": "1.0.3",
  "description": "Installs Hugging Face libraries and tools for NLP and ML tasks.",
  "options": {
    "version": {
      "type": "string",
      "proposals": ["latest", "4.28.1", "4.27.4", "4.26.0"],
      "default": "latest",
      "description": "Select the version of Hugging Face Transformers to install."
    },
    "cuda": {
      "type": "boolean",
      "default": false,
      "description": "Enable this option to install the CUDA-enabled version of PyTorch for GPU support."
    },
    "datasets_version": {
      "type": "string",
      "proposals": ["latest", "1.14.0", "1.13.3"],
      "default": "latest",
      "description": "Select the version of Hugging Face Datasets to install."
    },
    "tokenizers_version": {
      "type": "string",
      "proposals": ["latest", "0.11.6", "0.11.5"],
      "default": "latest",
      "description": "Select the version of Tokenizers to install."
    }
  },
  "installsAfter": ["ghcr.io/devcontainers/features/common-utils"]
}
Note
Ensure that descriptions are clear and informative, making it easy for users to understand what each option does. Users see these descriptions when they configure the feature, so they directly shape the user experience.
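The options declared above reach install.sh as environment variables. The sketch below illustrates the naming rule as we understand it from the Dev Container Features specification: characters outside letters, digits, and underscores become underscores, a leading run of digits or underscores collapses to a single underscore, and the result is uppercased. The helper name option_to_env is hypothetical and exists only for illustration.

```shell
#!/usr/bin/env bash
# Sketch of how a feature option id maps to the variable seen by install.sh.
# Assumption: this mirrors the rule in the Dev Container Features specification.

# Hypothetical helper: prints the environment variable name for an option id.
option_to_env() {
    printf '%s\n' "$1" \
        | sed -e 's/[^A-Za-z0-9_]/_/g' -e 's/^[0-9_][0-9_]*/_/' \
        | tr '[:lower:]' '[:upper:]'
}

# The four options defined in the example devcontainer-feature.json.
for option in version cuda datasets_version tokenizers_version; do
    echo "$option -> $(option_to_env "$option")"
done
```

This is why the installation script can read $VERSION, $CUDA, $DATASETS_VERSION, and $TOKENIZERS_VERSION without any extra wiring.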
Implement install.sh
The install.sh script is the core of the feature’s installation process. Below is a version of the script that uses a small helper function and stops on the first error.
#!/bin/bash
set -e

echo "Activating feature 'huggingface'"

# Install a Python package, pinning the version unless "latest" is requested
install_python_package() {
    local package_name="$1"
    local package_version="$2"
    if [ "$package_version" = "latest" ]; then
        pip install "$package_name"
    else
        pip install "$package_name==$package_version"
    fi
}

# Ensure Python and pip are available
if ! command -v python3 &> /dev/null; then
    apt-get update && apt-get install -y python3 python3-pip python3-venv
else
    echo "Python3 is already installed."
fi

# Set up a virtual environment if not present
VENV_PATH="/opt/huggingface-venv"
if [ ! -d "$VENV_PATH" ]; then
    python3 -m venv "$VENV_PATH"
    echo "Virtual environment created at $VENV_PATH."
else
    echo "Using existing virtual environment at $VENV_PATH."
fi

# Activate the virtual environment
source "$VENV_PATH/bin/activate"

# Upgrade pip to the latest version
pip install --upgrade pip

# Install required Python packages
install_python_package "transformers" "$VERSION"

# Install PyTorch with or without CUDA support
if [ "$CUDA" = "true" ]; then
    pip install torch torchvision torchaudio
else
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
fi

install_python_package "datasets" "$DATASETS_VERSION"
install_python_package "tokenizers" "$TOKENIZERS_VERSION"
pip install sentencepiece huggingface_hub

# Create a helper that activates the virtual environment.
# $VENV_PATH is expanded now, at install time. The helper must be sourced
# ("source /usr/local/bin/activate-huggingface") to affect the current shell.
cat <<EOL > /usr/local/bin/activate-huggingface
#!/bin/bash
source $VENV_PATH/bin/activate
EOL
chmod +x /usr/local/bin/activate-huggingface

echo "Hugging Face setup completed successfully."
Note
The script allows users to customize their installation through the provided options, making the feature flexible for various use cases.
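The version handling in the script can be isolated and tried without installing anything. The following is a minimal sketch under stated assumptions: pip_spec is a hypothetical helper that is not part of the script above, and the ${VAR:-latest} defaults are an addition that guards against an option variable being unset or empty.

```shell
#!/usr/bin/env bash
# Sketch: build the pip requirement string for a package/version pair.

# Assumption: fall back to "latest" when an option variable is unset or empty.
VERSION="${VERSION:-latest}"
DATASETS_VERSION="${DATASETS_VERSION:-latest}"
TOKENIZERS_VERSION="${TOKENIZERS_VERSION:-latest}"

# Hypothetical helper: prints "name" for latest, "name==version" otherwise.
pip_spec() {
    local name="$1"
    local version="${2:-latest}"
    if [ "$version" = "latest" ]; then
        printf '%s\n' "$name"
    else
        printf '%s==%s\n' "$name" "$version"
    fi
}

# Print what would be passed to pip, without running pip.
echo "would install: $(pip_spec transformers "$VERSION")"
echo "would install: $(pip_spec datasets "$DATASETS_VERSION")"
echo "would install: $(pip_spec tokenizers "$TOKENIZERS_VERSION")"
```

Separating the string construction from the pip call makes the option logic easy to check on its own before running a full container build.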
Step 3: Confirmation
After implementing the feature, confirm that everything works as expected by testing it locally. You can also use GitHub Actions to test the feature automatically.
Adding a Basic Test Script
To ensure that the feature installs the correct versions of the Hugging Face libraries, create a basic test script. This test will validate the installation by checking the versions of the installed packages.
Create a new directory named test inside the feature root directory:
mkdir -p src/huggingface/test
cd src/huggingface/test
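Inside this directory, create a test script version_check.sh:
#!/bin/bash
set -e

# The packages live in the feature's virtual environment, so activate it first
if [ -f /usr/local/bin/activate-huggingface ]; then
    source /usr/local/bin/activate-huggingface
fi

check_version() {
    local package_name="$1"
    local expected_version="$2"
    local installed_version

    # Assigned separately so a failed import is not masked by 'local'
    installed_version=$(python -c "import $package_name; print($package_name.__version__)")

    # With no pinned version there is nothing to compare against
    if [ -z "$expected_version" ] || [ "$expected_version" = "latest" ]; then
        echo "$package_name is installed: $installed_version"
        return 0
    fi

    if [ "$installed_version" = "$expected_version" ]; then
        echo "$package_name version is correct: $installed_version"
    else
        echo "$package_name version is incorrect: $installed_version (expected: $expected_version)"
        exit 1
    fi
}

# Example checks. TORCH_VERSION is not a feature option; set it yourself
# if you want to pin-check torch, otherwise only the import is verified.
check_version "transformers" "$VERSION"
check_version "torch" "$TORCH_VERSION"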
This script will ensure that the versions of transformers, torch, and any other critical packages match the expected versions.
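One detail to watch when comparing versions: PyTorch wheels installed from the CPU index typically report a version with a local suffix after a plus sign, so an exact string comparison against a plain version number would fail. The sketch below shows one way to compare only the public part of the version. Both helper names are hypothetical, and the version strings are illustrative rather than taken from a real installation.

```shell
#!/usr/bin/env bash
# Sketch: compare versions while ignoring a local suffix such as "+cpu".

# Hypothetical helper: drops everything from the first "+" onward.
strip_local_version() {
    printf '%s\n' "${1%%+*}"
}

# Hypothetical helper: succeeds when both versions share the same public part.
versions_match() {
    [ "$(strip_local_version "$1")" = "$(strip_local_version "$2")" ]
}

# Illustrative values only.
if versions_match "2.1.0+cpu" "2.1.0"; then
    echo "versions match"
else
    echo "versions differ"
fi
```

A comparison like this could replace the exact string test in the version check if you decide to pin torch.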
Integrate the Test Script into the Devcontainer
In the devcontainer-feature.json file, which you configured earlier in Step 2, add the following postCreateCommand as the last property inside the top-level JSON object (remember the comma after the preceding property) to run the test automatically after the container is created:
"postCreateCommand": "bash /workspaces/src/huggingface/test/version_check.sh"
This configuration will automatically run the version_check.sh script after the Devcontainer is created, confirming that the expected packages are installed. Adjust the path if your workspace is mounted at a different location.
Running the Test
You can also test locally using the Dev Container CLI:
devcontainer features test -f huggingface -i mcr.microsoft.com/devcontainers/base:ubuntu .
Ensure all installations are completed successfully and that the Hugging Face environment is functional.
Step 4: Automated Testing
To automate testing whenever changes are pushed to your repository, set up GitHub Actions. Automated tests help catch issues early in the development process, ensuring that your feature remains reliable as it evolves.
Note
Never skip the testing phase; it ensures reliability and functionality. Testing not only validates your code but also builds confidence in the stability of the feature.
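A CI job can simply run the same test command against several base images. The following is a minimal sketch of a script that a workflow step might call. It is not the starter repository's workflow: the image list and the function names are assumptions for illustration, and DRY_RUN defaults to 1 so that the script only prints the commands unless you explicitly set DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Sketch: run the feature tests against more than one base image.

# Assumed image list for illustration; adjust to the images you support.
IMAGES=(
    "mcr.microsoft.com/devcontainers/base:ubuntu"
    "mcr.microsoft.com/devcontainers/base:debian"
)

# Hypothetical helper: prints the CLI command for one feature/image pair.
build_test_cmd() {
    printf 'devcontainer features test -f %s -i %s .\n' "$1" "$2"
}

# Hypothetical helper: prints (dry run) or executes the tests for each image.
run_all() {
    local feature="$1"
    local image
    for image in "${IMAGES[@]}"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            build_test_cmd "$feature" "$image"
        else
            devcontainer features test -f "$feature" -i "$image" . || return 1
        fi
    done
}

run_all huggingface
```

Running the script with DRY_RUN=0 requires Docker and the Dev Container CLI on the CI runner.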
Step 5: Publish the Feature
Once you have completed and tested the feature, push it to your repository and publish it using GitHub Actions. This process makes your feature available for others to use and contribute to.
Push Changes to GitHub
Use the following commands to push the feature to your GitHub repository.
git add .
git commit -m "Add Hugging Face devcontainer feature"
git push origin main
Run the Workflow
Navigate to the Actions tab in your GitHub repository. Run the Release dev container features & Generate Documentation workflow. This workflow will package your feature and update the documentation automatically.
Set Visibility
Ensure that the package visibility is set to public in the repository's Packages settings. This step is essential for making your feature accessible to the community.
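Once published, the feature is referenced by an identifier that other projects put in their devcontainer.json. The sketch below composes that identifier under the assumption that the release workflow publishes to the GitHub Container Registry as ghcr.io/<owner>/<repo>/<feature-id> and tags each release with its major version. The helper name feature_ref is hypothetical, and your-username is the same placeholder used in the clone step.

```shell
#!/usr/bin/env bash
# Sketch: compose the reference consumers would use for a published feature.
# Assumption: features are published as ghcr.io/<owner>/<repo>/<id>:<major>.

# Hypothetical helper: prints the lowercase reference for a feature release.
feature_ref() {
    local owner="$1"
    local repo="$2"
    local id="$3"
    local version="$4"
    local major="${version%%.*}"   # keep only the major version component
    printf 'ghcr.io/%s/%s/%s:%s\n' "$owner" "$repo" "$id" "$major" \
        | tr '[:upper:]' '[:lower:]'
}

# prints ghcr.io/your-username/feature-starter/huggingface:1
feature_ref your-username feature-starter huggingface 1.0.3
```

Check the Packages page of your repository for the exact name and tags that were actually published.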
Examples of Great Devcontainer Features
To inspire your work, here are a few examples of highly effective Devcontainer features:
GitHub CLI: A feature that sets up the GitHub CLI in a Devcontainer, making it easy to manage GitHub repositories from within your containerized environment.
Kubectl-Helm-Minikube: This feature installs the latest version of kubectl, Helm, and optionally minikube. Auto-detects latest versions and installs needed dependencies.
Docker-in-Docker: This feature enables Docker to run inside a container, allowing you to test Dockerized applications from within your Devcontainer.
Node.js: A feature that installs Node.js along with npm or Yarn, enabling JavaScript and TypeScript development in a consistent environment.
Python: Sets up Python with popular tools like pip, Poetry, and venv, ensuring a reliable Python environment for development.
For more available features, refer here.
Conclusion
In this guide, we have walked through the essential steps to create a Devcontainer feature, focusing on setting up a Hugging Face environment and emphasizing the importance of automating and simplifying the setup process for complex tools used in ML and NLP. From forking the starter repository to implementing the install.sh script, we have covered how to define metadata, configure options, and test the feature locally. These steps ensure that your feature is both robust and flexible, meeting the needs of various development scenarios.
One of the key takeaways from this guide is the importance of customization and testing. By providing users with configurable options, such as selecting specific versions of libraries or enabling CUDA support, you are making the feature adaptable to different project requirements. Moreover, testing your feature both locally and through automated CI pipelines ensures reliability and helps catch potential issues early, contributing to a smoother development experience for your users.
As you move forward, remember that Devcontainer features are a powerful tool for standardizing development environments across teams. By creating your own features, you can streamline the setup process, reduce the chances of misconfiguration, and ultimately increase productivity. Now that you have the knowledge and tools, consider contributing back to the community by sharing your features, or explore the many existing features available here to enhance your own development workflow.
You can find the complete implementation of the Hugging Face Devcontainer feature in the Hugging Face Devcontainer Feature Repository. This repository includes all the scripts and configurations discussed in this guide, allowing you to easily clone and adapt the feature for your own projects.