
Towards a cloud native Jupyter

· 4 min read
Eric Charles
Datalayer Founder

All Data Scientists know the story: install the well-known classic Jupyter Notebook or JupyterLab on a local PC or laptop, pip install some Python libraries like pandas, download some datasets and finally start analysing with a notebook in isolation. There are a few pain points:

  1. Setting up the tools is hard and time-consuming. You have to install Python and Jupyter, then add the libraries you need. Conda environments or Docker containers can mitigate the pain to some extent, but in the end they are yet more tools to set up and manage.
  2. At some point, Data Scientists want to collaborate with teammates or share some results. But the Data Scientist is alone on an island and has no easy way to break the silo. Real-time collaboration features have recently been merged into JupyterLab, but they are only the first steps and miss fundamental building blocks like identity and authorization.
  3. The analysis is not easily reproducible. The setup you have done on a particular Windows platform can be completely different from the setup another collaborator has done on macOS.
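
To make the third pain point concrete, here is a minimal sketch of how two collaborators could compare their local setups. It only uses the Python standard library; the function name `environment_snapshot` and the shape of the returned dictionary are our own assumptions for illustration, not part of Jupyter or any Datalayer tool.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot():
    """Collect interpreter, OS and installed-package versions (hypothetical helper)."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": sys.implementation.name,
        "os": platform.system(),
        "machine": platform.machine(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be saved and diffed between machines.
    print(json.dumps(environment_snapshot(), indent=2))
```

Running this on a Windows laptop and on a macOS laptop and diffing the two JSON outputs will typically show different operating systems and package versions, which is exactly the kind of drift that makes a notebook hard to reproduce.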

More Cloud-native

Hence the need for a better solution. At Datalayer we think that a more Cloud-native Jupyter can help remove those pain points. In other words, we embrace the infrastructure provided by cloud providers like Google Cloud, AWS and Azure, and build on top of it to give more power to the Data Scientist.

Cloud native computing is an approach in software development that utilizes cloud computing to "build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds".

Wikipedia https://en.wikipedia.org/wiki/Cloud_native_computing


Crossplane by example on GCP

· 4 min read
Eric Charles
Datalayer Founder

Crossplane is an open source Kubernetes add-on that enables platform teams to assemble infrastructure from multiple vendors and expose higher-level self-service APIs for application teams to consume, without having to write any code. It allows you to compose cloud infrastructure and services based on XRDs (Composite Resource Definitions) that extend the existing Kubernetes CRDs (Custom Resource Definitions). To achieve this awesome goal, you have to use various repositories that reside in the GitHub crossplane, crossplane-contrib and upbound organisations. As an adopter of this new technology, you can rely on the official documentation, where a lot of details are gathered.

To ease our understanding and document our experiments, we have created a crossplane-example repository that takes you step by step through using Crossplane to deploy your infrastructure on Google Cloud, and through developing a user interface and a Helm chart that access a database created by Crossplane.


The Crossplane community is welcoming, just like the Crossplane logo is fun!



A new start with Jupyter

· 2 min read
Eric Charles
Datalayer Founder

Since our last blog post in January 2018, we have changed the Datalayer architecture a lot. Back in 2018, we had chosen Apache Zeppelin for its good integration with Big Data frameworks like Apache Spark and completely replaced the existing Angular.js user interface with a home-brewed React.js implementation to integrate with the Kubernetes control plane. While rolling out more and more features on top of our former version 0.0.1, we were intrigued in February 2018 by the announcement that JupyterLab was ready for users. Earlier, in July 2016, JupyterLab had been positioned as the next generation of the Jupyter Notebook.


Datalayer 0.0.1 on Kubernetes

· 2 min read
Eric Charles
Datalayer Founder

Building a complete, scalable Data Science Showcase on Kubernetes is another piece, and a more challenging one to achieve. The Datalayer Science Showcase is designed to be Simple, Collaborative and Multi Cloud and is particularly suited to Data Science exploration teams.


Datalayer Notebook for Big Data Scientists on Azure

· 2 min read
Eric Charles
Datalayer Founder

Datalayer today announced the integration of the Datalayer WEB Notebook for big data scientists with Microsoft OneNote. We also announced that the Datalayer WEB Notebook will be deployed on Microsoft Azure. This integration is available to Windows Live as well as Office 365 users via the Datalayer WEB Notebook. This authentication can be used today by Data Scientists to publish their data analysis to the Microsoft OneNote online service, where it is easier for Business stakeholders to read and access.

Working with Microsoft

"Datalayer's offering is bridging the gap between science and business, and fosters business communication. Utilizing Microsoft Azure, Datalayer is offering their customers the opportunity to better communicate and work with their data-driven strategy", said Nicole Herskowitz, Senior Director of Product Marketing, Microsoft Azure, Microsoft Corp.
