Tips on selecting the best platform for Jupyter Notebook according to your project needs.

Photo by Mitchell Luo on Unsplash

In my previous article, I wrote about AI Platform Notebook, a cloud computing service provided on Google Cloud Platform. Before I discovered AI Platform Notebook, I used either Google Colab or GCP Compute Engine virtual machine instances when I needed a cloud service to run my Jupyter Notebook. Every service has its pros and cons. In this article, I will share my thoughts and experiences from using these cloud computing services to run my projects.

Getting Ready for Jupyter Notebook


Run Jupyter Notebook on a machine set up exactly according to your needs, in a few clicks.

Photo by Dallas Reedy on Unsplash

How do you run your Jupyter Notebook on a cloud instance?

With Google Colab, or with virtual machines offered by other cloud computing service providers?

I believe you have faced these scenarios if you have used the two types of services mentioned above.

  1. You were running a huge model, and your Google Colab session ran out of runtime or stopped because its RAM was full.
  2. You failed to configure the complicated VM instances.
  3. You managed to configure the VM instances, but wasted half a day reinstalling the environment and all the libraries and dependencies your project needs.

Well! You are safe from all the problems now!

This wonderful creation, GCP AI Platform Notebook, is a Jupyter Notebook on Google Cloud Platform with all the popular libraries and dependencies, such as pandas, scikit-learn and TensorFlow, preinstalled. …
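To see which of these libraries a given notebook environment actually provides, a quick check can be run in the first cell. The following is only a minimal sketch using the Python standard library; the helper name `check_libraries` is hypothetical, and the import names listed (`pandas`, `sklearn` for scikit-learn, `tensorflow`) are the usual ones rather than anything specific to AI Platform Notebook.

```python
from importlib.util import find_spec


def check_libraries(import_names):
    """Return a dict mapping each top-level import name to True if it is importable."""
    # find_spec returns None when a top-level module cannot be located.
    return {name: find_spec(name) is not None for name in import_names}


if __name__ == "__main__":
    # Note: scikit-learn is imported under the name "sklearn".
    for name, available in check_libraries(["pandas", "sklearn", "tensorflow"]).items():
        print(f"{name}: {'available' if available else 'missing'}")
```

The same check works on Google Colab or a plain VM, which makes it easy to compare how much setup each environment still needs.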


A deep dive into the security issues that occur in the HDFS structure and the technologies available to protect it.

Photo by Liam Tucker on Unsplash

I. Introduction

Big data is trending. Smart devices, the Internet and related technologies allow the unlimited generation and transmission of data, and new information is gained from that data. Big data comes in various forms: it can be structured, semi-structured or unstructured. Traditional data processing systems such as the Relational Database Management System (RDBMS) are no longer capable of storing or processing big data, because it has wide variety and extremely large volume and is generated at high speed. This is where Hadoop comes into the loop. …


How is text analytics used in Industry 4.0?

Photo by NASA on Unsplash

I. Overview

Natural Language Processing (NLP) is described as an area of research and application that studies how computers can learn and exploit natural language text or speech to create something meaningful [1]. Achieving human-like language processing for a variety of tasks or applications requires NLP, a theoretically motivated set of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis [2]. The term NLP is typically used to describe the function of computer system components, software or hardware, that analyze or synthesize spoken or written language [3].

Text analytics, sometimes referred to as text mining or data mining, is the process of extracting insightful information from text. Text mining is an important research topic because most data is unstructured. It plays an important role in extracting meaningful information from data by identifying and exploring interesting patterns [4]. Sentiment analysis is a natural language processing technique for text analytics that is used to analyze the polarity of a document, a sentence or an attribute [5]. …
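As a rough illustration of what "analyzing polarity" means, the sketch below scores a sentence by counting words from two tiny hand-made word lists. It is a toy lexicon-based approach, not the method used in the article; the word lists and the function name `polarity` are assumptions made for this example only.

```python
import re

# Tiny illustrative lexicons (assumed for this sketch; real lexicons are far larger).
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "faulty"}


def polarity(text: str) -> str:
    """Classify text as 'positive', 'negative' or 'neutral' by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(polarity("The new sensor is great and reliable"))
    print(polarity("Terrible, faulty firmware"))
```

A word-counting approach like this ignores negation and context, which is one reason practical sentiment analysis usually relies on trained models.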


Photo by C Dustin on Unsplash

Making Sense of Big Data

Abstract: This paper compares the cloud computing services provided by Cloudera, Amazon Web Services and Microsoft Azure. Big data is data of large volume that contains structured, semi-structured and unstructured data. It cannot be stored and processed by conventional technologies. The Hadoop framework enables the storage and processing of such complex data. Cloudera, Amazon Web Services and Microsoft Azure have deployed the Hadoop framework and enable data storage and processing in the cloud. All three distributions provide cloud computing, cloud storage, databases and machine learning. Each has its own strengths and weaknesses in different aspects. …

About

KahEm Chu

Passionate about data science. Currently studying Data Science and Analytics at USM, and I wish to share some of my work here =]
