
3 posts tagged with "gpu"


Datalayer adding GPU to Anaconda Notebooks

· 6 min read
Eléonore Charles
Product Manager

We are thrilled to announce our collaboration with Anaconda, a leader in Data Science and AI platforms. This partnership marks a step forward in our mission to democratize access to high-performance computing resources for Data Scientists and AI Engineers.

Anaconda offers Anaconda Notebooks, a cloud-based service that allows data scientists to use Jupyter Notebooks without the hassle of local environment setup. Through our collaboration, we are enhancing this platform with Datalayer's Remote Runtime technology, bringing seamless GPU access directly to Anaconda Notebooks users.

Why Remote Runtimes and GPUs Matter

In traditional Jupyter Notebook setups, all computations occur locally on a user's machine or a cloud instance. While this setup works well for small to medium-sized tasks, scaling these tasks to handle massive datasets, complex deep learning models, or resource-intensive simulations requires more powerful hardware, such as Graphics Processing Units (GPUs).

GPUs are game-changers for data science and AI because they can parallelize computations, drastically speeding up processes like neural network training, image processing, and large-scale data analytics. However, setting up a local or cloud environment with GPU support can be technically challenging and time-consuming, especially for non-experts.

By upgrading Anaconda Notebooks with Datalayer's Remote Runtime technology, the heavy lifting is done behind the scenes, allowing Anaconda users to focus on what matters most: their data science tasks.

How Datalayer Supercharges Anaconda Notebooks

One of the core advantages of Anaconda Notebooks is its ease of use. Users can quickly launch Jupyter Notebooks with all the libraries and environments they need without the hassle of local configuration. The collaboration with Datalayer builds on this strength, making it incredibly easy for Anaconda Notebooks users to access remote GPU-powered Runtimes.

Users can launch GPU Runtimes directly from the Anaconda Notebooks Jupyter Launcher and switch their Jupyter Notebook to a GPU Runtime with a single click.

info

Anaconda Notebooks runs on an Anaconda-managed JupyterHub, while Datalayer Runtimes run on a separate Kubernetes cluster with IAM (Identity and Access Management) and Credits (Usage) integrations.

Architecture Diagram

Benefits for Anaconda Notebooks Users

The collaboration between Datalayer and Anaconda offers several key benefits to the platform's existing and future user base:

  • Enhanced Performance: Users now have access to powerful GPUs without having to manage the underlying infrastructure. This enhancement translates to faster computations and the ability to handle more complex tasks.

  • Cost-Effective Scaling: By leveraging Remote Runtimes, users only consume GPU resources when needed. They can switch between CPU and GPU Runtimes based on the task, optimizing both performance and cost.

  • User-Friendly: The familiar Anaconda Notebooks interface remains the same, with the added option of GPU Runtimes. No additional learning curve or configuration is required, making it accessible even for non-technical users.

  • Broader Use Cases: With GPU support, Anaconda Notebooks users can now tackle a wider range of projects. From deep learning models and complex simulations to high-dimensional data processing, the possibilities have expanded dramatically.

Datalayer provides one-click access to scalable GPU infrastructure, enabling Data Scientists and AI Engineers at all levels to run even the most advanced AI and ML tasks, integrated with the Jupyter Notebook where they are already working.

Jack Evans, Sr. Product Manager

For any Business in a Whitelabelled Variant

The Datalayer Runtimes are available for any company in a whitelabelled variant.

Integrating a managed deployment of Datalayer with your existing Jupyter solution brings a significant advantage for its operators: a JupyterLab extension and the supporting services can be installed on Kubernetes quickly, without additional development. This streamlines operations and lets operators focus on managing the infrastructure rather than on configuration details.

Reach out for more information on how to integrate Datalayer on your Kubernetes cluster and add Runtimes to your existing Jupyter solution.

Conclusion

Our partnership with Anaconda puts the power of high-performance computing at the fingertips of Anaconda users, while preserving the simplicity and ease of use that Anaconda Notebooks is known for. This collaboration goes beyond simply boosting computational power; it democratizes access to essential tools, empowering Data Scientists and AI Engineers around the world to achieve more, faster, and with greater efficiency. By breaking down barriers, Anaconda and Datalayer are enabling Data Scientists and AI Engineers to unlock their full potential, paving the way for new innovations.

The Beta availability was announced at the latest NVIDIA GTC event. Looking ahead, we plan to refine this solution further by enhancing the user interface and incorporating feedback from early users. Additionally, we aim to integrate the GPU Runtime feature into the Anaconda Toolbox.

To learn how to access this feature, visit the official Anaconda GPU Runtimes documentation as well as this Anaconda blog post.

You can register on the Beta waiting list via this link.

Datalayer: Accelerated and Trusted Jupyter. Register and get free credits.

Datalayer Joins NVIDIA Inception

· 3 min read
Eléonore Charles
Product Manager

Datalayer has joined NVIDIA Inception, a program designed to nurture startups that are revolutionizing industries with technological advancements.

At Datalayer, we are focused on providing seamless access to powerful Remote Kernels for data scientists, AI engineers, and machine learning practitioners. Our mission is to simplify workflows and boost productivity by allowing users to leverage GPUs and CPUs without altering their existing code or preferred tools.

Joining NVIDIA Inception will accelerate our development by providing access to industry-leading resources such as go-to-market support, technical assistance and training. This will help us enhance our solutions and collaborate with a network of AI-driven organizations and experts, driving growth during critical stages of product development and enabling us to better serve our users.

Before joining, we were already big fans and users of NVIDIA GPU technology, including the GPU Kubernetes Operator, as documented on the Datalayer Tech GPU CUDA page. We have been supporting Time Slicing and MIG (Multi-Instance GPU) to help optimize costs for our users. We are eager to collaborate with NVIDIA experts to further reduce expenses while enhancing security through sandboxed solutions such as KubeVirt and Kata Containers.

Stay tuned as we continue to develop innovative solutions, now with the support of the NVIDIA Inception Program. We are excited to share our progress with you in the coming months! In the meantime, you can already experiment with NVIDIA GPU on Datalayer.


GPU Acceleration for Jupyter Cells

· 7 min read
Eléonore Charles
Product Manager

In the realm of AI, data science, and machine learning, Jupyter Notebooks are highly valued for their interactive capabilities, enabling users to develop with immediate feedback and iterative experimentation.

However, as models grow in complexity and datasets expand, the need for powerful computational resources becomes critical. Traditional setups often require significant adjustments or sacrifices, such as migrating code to different platforms or dealing with cumbersome configurations to access GPUs. Additionally, often only a small portion of the code requires GPU acceleration, while the rest can run efficiently on local resources.

What if you could selectively run resource-intensive cells on powerful remote GPUs while keeping the rest of your workflow local? That's exactly what the Datalayer Cell Kernels feature enables. Datalayer works as an extension of the Jupyter ecosystem. With this approach, you can optimize your costs without disrupting your established processes.

We're excited to show you how it works.

The Power of Selective Remote Execution

Datalayer Cell Kernels introduce a game-changing capability: the ability to run specific cells on remote GPUs while keeping the rest of your notebook local. This selective approach offers several advantages:

  1. Cost Optimization: Only use expensive GPU resources when absolutely necessary.
  2. Performance Boost: Accelerate computationally intensive tasks without slowing down your entire workflow.
  3. Flexibility: Seamlessly switch between local and remote execution as needed.

Let's dive into a practical example to see how this works. We'll demonstrate this hybrid approach using a sentiment analysis task with Google's Gemma-2 model.

Create the LLM Prompt

We start by creating our prompt locally. This part of the notebook runs on your local machine:

prompt = """
Analyze the following customer reviews and provide a structured JSON response for each review. Each response should contain:

- "review_id": A unique identifier for each review.
- "themes": A dictionary where each key is a theme or topic mentioned in the review, and each value is the sentiment associated with that theme (positive, negative, or neutral).

Format your response as a JSON array where each element is a JSON object corresponding to one review. Ensure that the JSON structure is clear and easily parseable.

Customer Reviews:

1. "I love the smartphone's performance and speed, but the battery drains quickly."
2. "The smartphone's camera quality is top-notch, but the battery life could be better."
3. "The display on this smartphone is vibrant and clear, but the battery doesn't last as long as I'd like."
4. "The customer support was helpful when my smartphone had issues with the battery draining quickly. The camera is ok, not good nor bad."

Respond in this format:
[
  {
    "review_id": "1",
    "themes": {
      "...": "...",
      ...
    }
  },
  ...
]
"""

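If the reviews live in a Python list rather than in a hard-coded string, the numbered "Customer Reviews" block of the prompt could be generated instead of typed by hand. The following is a minimal sketch; `format_reviews` and `numbered_reviews` are hypothetical names that do not appear in the original notebook.

```python
def format_reviews(reviews: list[str]) -> str:
    """Return the reviews as a numbered, double-quoted list, one per line."""
    return "\n".join(f'{i}. "{text}"' for i, text in enumerate(reviews, start=1))


# Two of the reviews used in the prompt above.
reviews = [
    "I love the smartphone's performance and speed, but the battery drains quickly.",
    "The smartphone's camera quality is top-notch, but the battery life could be better.",
]

numbered_reviews = format_reviews(reviews)
print(numbered_reviews)
```

The resulting string can then be interpolated into the prompt template in place of the hand-written list.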
Analyse Topics and Sentiment on Remote GPU

Now, here's where we leverage the remote GPU. This cell contains the code to perform sentiment analysis using the Gemma-2 model and the Hugging Face Transformers library. We'll switch to the Remote Kernel for just this cell:

from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Log in to Hugging Face (replace "HF_TOKEN" with your own access token)
login(token="HF_TOKEN")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Prepare the prompt
chat = [{"role": "user", "content": prompt},]

# Generate the prompt and perform inference
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=2000)

# Decode the response, excluding the input prompt from the output
prompt_length = inputs.shape[1]
response = tokenizer.decode(outputs[0][prompt_length:])

By executing only this cell remotely, we're optimizing our use of GPU resources. This targeted approach allows us to tap into powerful computing capabilities precisely when we need them, without the overhead of running our entire notebook on a remote machine.
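To make the cost argument concrete, here is a back-of-the-envelope sketch comparing a whole session on a GPU with a single remote cell. Every number in it is a hypothetical placeholder, and it assumes usage is charged in proportion to GPU time; how Datalayer actually meters credits is not described in this post.

```python
def gpu_cost(gpu_seconds: float, rate_per_hour: float) -> float:
    """Cost of using a GPU for gpu_seconds at a given hourly rate."""
    return gpu_seconds / 3600 * rate_per_hour


# Hypothetical numbers, for illustration only.
RATE_PER_HOUR = 3.0            # assumed credits per GPU hour
WHOLE_SESSION_SECONDS = 1800   # assumed 30-minute notebook session on the GPU
GPU_CELL_SECONDS = 60          # assumed time spent in the single remote cell

whole = gpu_cost(WHOLE_SESSION_SECONDS, RATE_PER_HOUR)
cell_only = gpu_cost(GPU_CELL_SECONDS, RATE_PER_HOUR)
print(f"whole notebook: {whole:.2f} credits, single cell: {cell_only:.2f} credits")
```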

To execute this cell on a remote GPU, select the remote environment from the cell's dropdown. It takes just a few clicks to switch from local to remote execution.

info

On a Tesla V100S-PCIE-32GB GPU, the sentiment analysis task completes in 10 seconds on average, at a throughput of approximately 19 tokens per second.

The model was pre-downloaded in the remote environment. This was done to eliminate download time. Datalayer lets you customize your computing environment to match your exact needs. Choose your hardware specifications and install the libraries and models you require.
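The throughput figure quoted above is the number of generated tokens divided by the wall-clock time of the generation call; 10 seconds at roughly 19 tokens per second implies about 190 new tokens. A minimal sketch of how such a measurement could be taken follows. `timed` and `tokens_per_second` are hypothetical helpers, not part of Datalayer or Transformers.

```python
import time
from typing import Any, Callable


def timed(fn: Callable[[], Any]) -> tuple[Any, float]:
    """Call fn and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def tokens_per_second(new_tokens: int, elapsed_seconds: float) -> float:
    """Throughput of a generation call that produced new_tokens tokens."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return new_tokens / elapsed_seconds


# Arithmetic check against the figures in the note: about 190 tokens in 10 s.
print(tokens_per_second(190, 10.0))
```

In the remote cell, `fn` could wrap the `model.generate(...)` call, and the new-token count would be `outputs.shape[1] - prompt_length`.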

Datalayer Cell Kernels allow you to manage variable transfers between your local and remote environments. You can easily configure which variables should be passed from your local setup to the Remote Kernel and vice versa.

This ensures that your remote computations have access to the data they need and that your local environment can utilize the results of remote processing.

info

Variable transfers are currently limited in practice to 7 MB of data. This limit is expected to increase in the future, and the option to add data to the remote environment will also be introduced.
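Before marking a variable for transfer, it can help to estimate its size locally. The sketch below uses `pickle` as a rough proxy; the serialization format Datalayer actually uses is not stated here, and reading "7 MB" as 7,000,000 bytes is also an assumption. `pickled_size` and `fits_transfer_limit` are hypothetical helper names.

```python
import pickle

# Assumption: the 7 MB limit is read as decimal megabytes.
TRANSFER_LIMIT_BYTES = 7 * 1000 * 1000


def pickled_size(obj) -> int:
    """Size in bytes of obj once pickled (only a proxy for the real payload)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def fits_transfer_limit(obj, limit_bytes: int = TRANSFER_LIMIT_BYTES) -> bool:
    """True if the pickled object stays within the assumed transfer limit."""
    return pickled_size(obj) <= limit_bytes


small = list(range(1_000))        # a few kilobytes once pickled
large = bytes(8 * 1000 * 1000)    # 8 MB of zero bytes, over the limit
print(fits_transfer_limit(small), fits_transfer_limit(large))
```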

To help you monitor and optimize your resource usage, Datalayer provides a clear and intuitive interface for viewing Remote Kernel usage.

Process and Visualize Results Locally

We switch back to local execution for processing and visualizing the results. This is the processed list of themes and sentiments extracted from the reviews by the Gemma-2 model:

[
    {
        'review_id': '1',
        'themes': {'performance': 'positive', 'speed': 'positive', 'battery': 'negative'}
    },
    {
        'review_id': '2',
        'themes': {'camera': 'positive', 'battery': 'negative'}
    },
    {
        'review_id': '3',
        'themes': {'display': 'positive', 'battery': 'negative'}
    },
    {
        'review_id': '4',
        'themes': {'customer support': 'positive', 'camera': 'neutral', 'battery': 'negative'}
    }
]
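The post does not show how the raw `response` string becomes the list above. One plausible approach, sketched here under assumptions, is to cut out the outermost JSON array and parse it with the standard library. `extract_json_array` is a hypothetical helper, and the `raw` string is a made-up sample of what a model reply might look like, not actual output from the notebook.

```python
import json


def extract_json_array(response: str) -> list:
    """Parse the outermost [...] span found in a model response."""
    start = response.find("[")
    end = response.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in the model response")
    return json.loads(response[start : end + 1])


# Hypothetical raw reply: some text around the JSON and a trailing end token.
raw = (
    "Here is the analysis:\n"
    '[{"review_id": "1", "themes": {"battery": "negative"}}]\n'
    "<end_of_turn>"
)
parsed = extract_json_array(raw)
print(parsed)
```

This simple slicing assumes the reply contains exactly one top-level array; a reply with stray brackets outside the JSON would need stricter handling.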

And below is a visualization of the theme and sentiment distribution across the reviews:
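The chart is not reproduced here, but the tally behind such a distribution can be computed locally from the processed list. This is a minimal sketch using only the standard library; it is not the plotting code from the original notebook, and `sentiment_by_theme` is a hypothetical name.

```python
from collections import Counter

# The processed results listed above.
results = [
    {"review_id": "1",
     "themes": {"performance": "positive", "speed": "positive", "battery": "negative"}},
    {"review_id": "2",
     "themes": {"camera": "positive", "battery": "negative"}},
    {"review_id": "3",
     "themes": {"display": "positive", "battery": "negative"}},
    {"review_id": "4",
     "themes": {"customer support": "positive", "camera": "neutral", "battery": "negative"}},
]

# Count how often each (theme, sentiment) pair occurs across all reviews.
sentiment_by_theme = Counter(
    (theme, sentiment)
    for review in results
    for theme, sentiment in review["themes"].items()
)

for (theme, sentiment), count in sentiment_by_theme.most_common():
    print(f"{theme:<18}{sentiment:<10}{count}")
```

The counts can be passed to any plotting library to draw a grouped bar chart per theme.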

Key Takeaways

Datalayer Cell Kernels allow you to selectively run specific cells on remote GPUs. This hybrid approach optimizes both performance and cost by using remote resources only when necessary. Complex tasks like sentiment analysis with large language models become more accessible and efficient.

Check out the full notebook example, and sign up on the Datalayer waiting list today to be among the first to experience the future of hybrid Jupyter workflows!
