4 posts tagged with "datalayer"

Datalayer Joins NVIDIA Inception

October 16, 2024 · 3 min read

Product Manager

Datalayer has joined NVIDIA Inception, a program designed to nurture startups that are revolutionizing industries with technological advancements.

At Datalayer, we are focused on providing seamless access to powerful Remote Kernels for data scientists, AI engineers, and machine learning practitioners. Our mission is to simplify workflows and boost productivity by allowing users to leverage GPUs and CPUs without altering their existing code or preferred tools.

Joining NVIDIA Inception will accelerate our development by providing access to industry-leading resources such as go-to-market support, technical assistance and training. This will help us enhance our solutions and collaborate with a network of AI-driven organizations and experts, driving growth during critical stages of product development and enabling us to better serve our users.

Before joining, we were already big fans and users of NVIDIA GPU technology, in particular the GPU Kubernetes Operator, as documented on the Datalayer Tech GPU CUDA page. We have been supporting Time Slicing and MIG to help optimize costs for our users. We are eager to collaborate with NVIDIA experts to further reduce expenses while enhancing security through sandboxed solutions such as KubeVirt and Kata Containers.

Stay tuned as we continue to develop innovative solutions, now with the support of the NVIDIA Inception Program. We are excited to share our progress with you in the coming months! In the meantime, you can already experiment with NVIDIA GPU on Datalayer.

Datalayer: Accelerated and Trusted JupyterRegister and get free credits

Datalayer Private Beta

October 10, 2024 · 4 min read

Eléonore Charles

Product Manager

We are super excited to announce that Datalayer is entering Private Beta! After months of development, today we are inviting those who signed up on our waiting list to experience our solution first-hand.

How to Join the Beta?

If you registered on our waiting list, keep an eye on your inbox: invitations are being sent out now! We're thrilled to have you onboard as part of this exclusive group, helping us shape the future of Datalayer.

But don't worry if you haven't signed up yet—there are still limited spots available. Simply register on the waiting list to secure your spot in the private beta.

Why Join the Beta?

This is your opportunity to get early access to the cutting-edge features of Datalayer, and we need your help to make it even better. Your experience and feedback will be invaluable in helping us fine-tune the product, optimize performance, and add features that truly meet your needs. It would be great to have you on board and we can't wait to hear your thoughts!

As a beta user, you'll enjoy:

Free credits to try out Remote Kernels.
Direct support from our team to ensure a smooth experience.
Direct influence on the future development of Datalayer through your feedback.

What Can Datalayer Bring You?

Datalayer simplifies access to powerful computing resources (GPU or CPU) for data scientists and AI engineers. Whether you're training models or running large-scale simulations, you can seamlessly scale your workflows without changing your code or setup.

Key Benefits

Effortless Remote Kernel Access: Seamlessly connect to high-performance Remote Kernels from JupyterLab, VS Code, or via the CLI. Switch kernels with just a few clicks to run your code on powerful machines, without altering your workflow or setup.
Flexible and Simple Setup: Avoid the complexity of configuration changes or workflow disruption. Launch Remote Kernels effortlessly and scale your data science or AI workflows with ease, whether you're working on notebooks or scripts.
Optimized Resource Usage: Gain control over resource allocation by running specific notebook cells on Remote Kernels only when needed. This precision helps minimize resource consumption and maximize efficiency.
Flexible Credits-Based Model: Enjoy a pay-as-you-go credits system that adapts to your needs. With transparent usage tracking and detailed reports, you'll only pay for the resources you use, making it a cost-effective solution for scaling your projects.

Learn more about Datalayer's features on our user documentation and online SaaS.

GPU Acceleration for Jupyter Cells

August 23, 2024 · 7 min read

Eléonore Charles

Product Manager

In the realm of AI, data science, and machine learning, Jupyter Notebooks are highly valued for their interactive capabilities, enabling users to develop with immediate feedback and iterative experimentation.

However, as models grow in complexity and datasets expand, the need for powerful computational resources becomes critical. Traditional setups often require significant adjustments or sacrifices, such as migrating code to different platforms or dealing with cumbersome configurations to access GPUs. Additionally, often only a small portion of the code requires GPU acceleration, while the rest can run efficiently on local resources.

What if you could selectively run resource-intensive cells on powerful remote GPUs while keeping the rest of your workflow local? That's exactly what the Datalayer Cell Kernels feature enables. Datalayer works as an extension of the Jupyter ecosystem. With this approach, you can optimize your costs without disrupting your established processes.

We're excited to show you how it works.

The Power of Selective Remote Execution

Datalayer Cell Kernels introduce a game-changing capability: the ability to run specific cells on remote GPUs while keeping the rest of your notebook local. This selective approach offers several advantages:

Cost Optimization: Only use expensive GPU resources when absolutely necessary.
Performance Boost: Accelerate computationally intensive tasks without slowing down your entire workflow.
Flexibility: Seamlessly switch between local and remote execution as needed.

Let's dive into a practical example to see how this works. We'll demonstrate this hybrid approach using a sentiment analysis task with Google's Gemma-2 model.

Create the LLM Prompt

We start by creating our prompt locally. This part of the notebook runs on your local machine:

prompt = """
Analyze the following customer reviews and provide a structured JSON response for each review. Each response should contain:

- "review_id": A unique identifier for each review.
- "themes": A dictionary where each key is a theme or topic mentioned in the review, and each value is the sentiment associated with that theme (positive, negative, or neutral).

Format your response as a JSON array where each element is a JSON object corresponding to one review. Ensure that the JSON structure is clear and easily parseable.

Customer Reviews:

1. "I love the smartphone's performance and speed, but the battery drains quickly."
2. "The smartphone's camera quality is top-notch, but the battery life could be better."
3. "The display on this smartphone is vibrant and clear, but the battery doesn't last as long as I'd like."
4. "The customer support was helpful when my smartphone had issues with the battery draining quickly. The camera is ok, not good nor bad."

Respond in this format:
[
    {
        "review_id": "1",
        "themes": {
            "...": "...",
            ...
        }
    },
    ...
]
"""

Analyse Topics and Sentiment on Remote GPU

Now, here's where we leverage the remote GPU. This cell contains the code to perform sentiment analysis using the Gemma-2 model and the Hugging Face Transformers library. We'll switch to the Remote Kernel for just this cell:

from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Login to Hugging Face (replace "HF_TOKEN" with your own access token)
login(token="HF_TOKEN")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Wrap the prompt in a single-turn chat
chat = [{"role": "user", "content": prompt}]

# Apply the chat template, tokenize, and run inference
# (a separate name avoids overwriting the original `prompt` variable)
chat_prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(chat_prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=2000)

# Decode the response, excluding the input prompt from the output
prompt_length = inputs.shape[1]
response = tokenizer.decode(outputs[0][prompt_length:])

By executing only this cell remotely, we're optimizing our use of GPU resources. This targeted approach allows us to tap into powerful computing capabilities precisely when we need them, without the overhead of running our entire notebook on a remote machine.

To execute this cell on a remote GPU, you just have to select the remote environment for this cell.

This is done with just a few clicks, as shown below:

With a simple selection from the cell dropdown, you can seamlessly transition from local to remote execution.

info

Using a Tesla V100S-PCIE-32GB GPU, the sentiment analysis task completes in about 10 seconds on average, at a throughput of roughly 19 tokens per second.

The model was pre-downloaded in the remote environment. This was done to eliminate download time. Datalayer lets you customize your computing environment to match your exact needs. Choose your hardware specifications and install the libraries and models you require.
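As a rough illustration of how a tokens-per-second figure like the one above can be derived, the sketch below divides the number of newly generated tokens by the wall-clock generation time. The helper names `tokens_per_second` and `timed`, as well as the sample numbers, are assumptions for illustration and not measurements from the benchmark itself.

```python
import time


def tokens_per_second(new_tokens: int, elapsed_seconds: float) -> float:
    """Return generation throughput in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return new_tokens / elapsed_seconds


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical numbers: 190 new tokens generated in 10 seconds.
throughput = tokens_per_second(190, 10.0)
print(f"{throughput:.1f} tokens/s")  # prints "19.0 tokens/s"
```

In the remote cell, one could wrap the `model.generate(...)` call with `timed` and pass `outputs.shape[1] - prompt_length` as `new_tokens`.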

Datalayer Cell Kernels allow you to manage variable transfers between your local and remote environments. You can easily configure which variables should be passed from your local setup to the Remote Kernel and vice versa, as illustrated below:

This ensures that your remote computations have access to the data they need and that your local environment can utilize the results of remote processing.

info

Variable transfers are currently limited in practice to 7 MB of data. This limit is expected to increase in the future, and the option to add data to the remote environment will also be introduced.
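To get a feel for whether a variable is likely to fit under a limit like this, one can measure its serialized size before transferring it. The sketch below uses `pickle` purely as a stand-in: how Datalayer actually serializes variables is not stated here, and both the helper names and the reading of "7 MB" as 7 * 1024 * 1024 bytes are assumptions.

```python
import pickle

# Assumption: "7 MB" is read as 7 * 1024 * 1024 bytes.
TRANSFER_LIMIT_BYTES = 7 * 1024 * 1024


def serialized_size(obj) -> int:
    """Size in bytes of obj once pickled (a stand-in for the real wire format)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def fits_transfer_limit(obj, limit_bytes: int = TRANSFER_LIMIT_BYTES) -> bool:
    """True if the pickled object is no larger than limit_bytes."""
    return serialized_size(obj) <= limit_bytes


small = "a short prompt"
large = bytes(8 * 1024 * 1024)  # 8 MiB of zero bytes

print(fits_transfer_limit(small))  # True
print(fits_transfer_limit(large))  # False
```

A check like this is only an estimate, but it can flag large arrays or datasets before a transfer is attempted.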

To help you monitor and optimize your resource usage, Datalayer provides a clear and intuitive interface for viewing Remote Kernel usage.

Process and Visualize Results Locally

We switch back to local execution for processing and visualizing the results. This is the processed list of themes and sentiments extracted from the reviews by the Gemma-2 model:

[
    {
        'review_id': '1',
        'themes': {'performance': 'positive', 'speed': 'positive', 'battery': 'negative'}
    }, 
    {
        'review_id': '2',
        'themes': {'camera': 'positive', 'battery': 'negative'}
    },
    {
        'review_id': '3', 
        'themes': {'display': 'positive', 'battery': 'negative'}
    },
    {
        'review_id': '4',
        'themes': {'customer support': 'positive', 'camera': 'neutral', 'battery': 'negative'}
    }
]
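The step that turns the raw model response into the list above is not shown here. A minimal sketch, assuming the response text contains a single JSON array that may be surrounded by extra text such as an introduction or end-of-turn tokens; the helper name `extract_reviews` and the sample string are hypothetical.

```python
import json


def extract_reviews(response: str) -> list:
    """Parse the span from the first '[' to the last ']' as a JSON array."""
    start = response.find("[")
    end = response.rfind("]")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no JSON array found in response")
    return json.loads(response[start : end + 1])


# Hypothetical raw response, shortened to a single review.
sample = (
    "Here is the analysis:\n"
    '[{"review_id": "1", '
    '"themes": {"performance": "positive", "battery": "negative"}}]\n'
    "<end_of_turn>"
)

reviews = extract_reviews(sample)
print(reviews[0]["themes"]["battery"])  # prints "negative"
```

Slicing between the outermost brackets is a deliberately simple choice; a response with several separate arrays or malformed JSON would need stricter handling.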

And below is a visualization of the theme and sentiment distribution across the reviews:
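The figure is not reproduced in this text. As a rough sketch of the aggregation behind such a chart, the snippet below tallies how often each theme receives each sentiment, using the processed list shown earlier as input. The function name `sentiment_counts` is an assumption, and plotting is left out to keep the example dependency-free.

```python
from collections import Counter, defaultdict

reviews = [
    {"review_id": "1", "themes": {"performance": "positive", "speed": "positive", "battery": "negative"}},
    {"review_id": "2", "themes": {"camera": "positive", "battery": "negative"}},
    {"review_id": "3", "themes": {"display": "positive", "battery": "negative"}},
    {"review_id": "4", "themes": {"customer support": "positive", "camera": "neutral", "battery": "negative"}},
]


def sentiment_counts(reviews):
    """Map each theme to a Counter of the sentiments it received."""
    counts = defaultdict(Counter)
    for review in reviews:
        for theme, sentiment in review["themes"].items():
            counts[theme][sentiment] += 1
    return dict(counts)


counts = sentiment_counts(reviews)
for theme, tally in sorted(counts.items()):
    print(theme, dict(tally))
```

The resulting per-theme counts can then be handed to any plotting library, for instance as a stacked bar chart with one bar per theme.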

Key Takeaways

Datalayer Cell Kernels allow you to selectively run specific cells on remote GPUs. This hybrid approach optimizes both performance and cost by using remote resources only when necessary. Complex tasks like sentiment analysis with large language models become more accessible and efficient.

Check out the full notebook example, and sign up on the Datalayer waiting list today to be among the first to experience the future of hybrid Jupyter workflows!

Remote Kernels Preview

November 11, 2023 · 4 min read

Eric Charles

Datalayer Founder

info

First things first, what is a Jupyter Kernel?

A Jupyter Kernel is the process that executes the code of your Jupyter Notebook. Because a Kernel is separate from the Notebook, it can run your code remotely on a different system.

Install Datalayer

Datalayer is a JupyterLab extension. To install it, just run the following command in your terminal.

pip install datalayer

You will need python>=3.9 and pip available on your machine.
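A quick way to confirm that the interpreter meets this requirement before installing is sketched below; the helper name `meets_minimum` is hypothetical.

```python
import sys

# Minimum (major, minor) version stated in the requirement above.
MINIMUM = (3, 9)


def meets_minimum(version=None, minimum=MINIMUM) -> bool:
    """True if the (major, minor) part of version is at least minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


print(meets_minimum())
```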

info

Remote Kernels is being released in PREVIEW mode. This means that the account you create is not permanent and can be removed at any time as new Datalayer releases are rolled out.

JupyterLab Launcher

Launch JupyterLab as usual.

jupyter lab

JupyterLab users are used to going to their launcher, which presents the typical tiles to create a Notebook and launch a Kernel.

Datalayer introduces a new element at the top of the JupyterLab launcher.

Account

The first step is to authenticate.

If this is your first contact with Datalayer, you will need an account. Just fill in a few details and check your mailbox for the confirmation code.

Serverless

Once authenticated, Datalayer takes care of the rest and will create the needed services for you in its own infrastructure.

You don't have to worry about anything; just wait for the green light that should appear on your Home page.

Kernels

Once the services are available, it may take a bit of time to have your kernels up and running. For now, we offer you 3 different Remote Kernels.

The Home page also lists your local machine Kernels and, in upcoming releases, will offer the ability to create local browser Kernels.

Remote Kernels

For now, Remote Kernels creates predefined Remote Kernels from your local JupyterLab.

Notebooks

To ease onboarding, you can create example Notebooks by clicking on the Example buttons.

This step is of course completely optional and you are welcome to directly use your own Notebooks.

You can use the Kernels from the standard JupyterLab kernel picker.

Click on the top-right picker of the Notebook, and assign a Kernel to the Notebook (the Remote Kernels are listed at the top).

Local Files

info

The Local Files access feature is highly experimental.

You need a local SSH Server.
Once a folder is mounted, you'd better restart your server to unmount it (we are working on a better implementation).
Windows is not supported for now.
SSH from your local machine to your user account has to work without a password prompt.

To mount your Local Files to the Remote Kernel, an SSH Server must be running on your local machine (on port 22) and you must be able to connect without a password prompt from your local terminal.

# Has to connect without password prompt.
ssh localhost
# ...
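The same check can be scripted. The sketch below first tests that something is listening on the SSH port, then builds an `ssh` invocation with the OpenSSH option `BatchMode=yes`, which makes the client fail instead of asking for a password. The function names are hypothetical, and the sketch assumes an OpenSSH client is available on the `PATH`.

```python
import socket
import subprocess


def port_is_open(host: str = "localhost", port: int = 22, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ssh_check_command(host: str = "localhost") -> list:
    """Build an ssh command that fails rather than prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def passwordless_ssh_works(host: str = "localhost") -> bool:
    """True if ssh to host succeeds without any interactive prompt."""
    try:
        result = subprocess.run(
            ssh_check_command(host), capture_output=True, timeout=15
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

On a correctly configured machine, both `port_is_open()` and `passwordless_ssh_works()` should return `True` before you try mounting Local Files.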

Kernel Lifecycle

You can delete a Kernel.

We will support starting and pausing a Kernel.

note

Kernel start and pause are not supported in the current release.

Need Help?
