Big Data Spain

17th ~ 18th NOV 2016 MADRID, SPAIN #BDS16

Big Migrations: Moving elephant herds

Thursday 17th

from 14:50 to 15:30

Theatre 20



Deploying new services in the cloud is easy, fast and does not require a huge initial investment (not to mention that it is COOL and everybody does it nowadays). With the availability of tools for quick deployments and DevOps techniques, it is easier than ever to deploy your very own Hadoop cluster and begin harnessing the power of Big Data.

However, what happens when your proof of concept (which has since grown into a multi-terabyte monster that ingests several data streams in real time) needs to go into production? Does your cloud vendor provide you with enough performance? Do you need to add extra capacity to maintain the required service level? Can you do so while keeping costs below a certain threshold?

You may reach a point where you want to switch cloud vendors to save some money. Have you thought about how much the transition itself will cost you? Traditional approaches to data migration may take several days or weeks given the volumes that Big Data handles. If you take into account the time that your business or service will be unavailable, coupled with the cost of the technical team that is needed to plan and execute this migration, your expected savings may decrease significantly or even disappear.
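To make the trade-off above concrete, here is a rough back-of-the-envelope sketch. The function names, the 70% link efficiency and all the figures in the usage lines are illustrative assumptions, not numbers from the talk; the sketch only covers raw copy time and ignores planning, validation and cutover.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, bandwidth_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate the days needed to copy data_tb terabytes over a network link.

    efficiency is an assumed fraction of the nominal bandwidth that the
    copy actually achieves (protocol overhead, shared links, throttling).
    """
    total_bits = data_tb * 1e12 * 8                      # decimal terabytes to bits
    effective_bps = bandwidth_gbps * 1e9 * efficiency    # usable bits per second
    return total_bits / effective_bps / SECONDS_PER_DAY


def break_even_months(migration_cost: float, monthly_savings: float) -> float:
    """Months of savings needed to pay back the one-off migration cost."""
    if monthly_savings <= 0:
        return float("inf")  # the migration never pays for itself
    return migration_cost / monthly_savings


if __name__ == "__main__":
    # Hypothetical scenario: 100 TB over a 1 Gbit/s link.
    print(f"Copy time: {transfer_days(100, 1):.1f} days")
    # Hypothetical costs: 30,000 for the migration, 5,000 saved per month.
    print(f"Break-even: {break_even_months(30_000, 5_000):.1f} months")
```

If the break-even period is longer than the time you expect to stay with the new vendor, the expected savings have effectively disappeared.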

What if you need to consolidate your data in one datacenter to follow corporate policy? What if you need to add data to your cluster that can't legally be stored in a public cloud? You don't want to rebuild your solution from scratch and go through the same data load process that you already faced when moving from traditional data warehousing solutions to a Hadoop cluster.

In this workshop we will give you tips and best practices for moving large volumes of data efficiently and with minimal downtime. Different techniques are available: what are their advantages and disadvantages? What steps do you need to follow? Deploying Hadoop was easy, maintaining it is a little trickier, and migrating this type of platform requires thorough knowledge of Hadoop and its ecosystem tools. Many factors must be taken into account to ensure a successful data migration: application activity periods, network bandwidth, Hadoop replication, cluster management tools such as Cloudera Manager, DNS, application endpoints and many other small details that we will show you during this workshop.
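As a small illustration of why Hadoop replication appears in the list of factors above: the destination cluster has to hold every block several times over, so the raw disk it needs is a multiple of the logical data size. The sketch below assumes the HDFS default replication factor of 3 and a hypothetical 25% headroom for growth and temporary files; both the function name and the headroom value are assumptions for illustration only.

```python
def destination_raw_tb(logical_tb: float, replication: int = 3, headroom: float = 0.25) -> float:
    """Raw disk capacity (TB) the destination cluster needs.

    logical_tb  -- size of the data as seen by applications
    replication -- HDFS replication factor (3 is the HDFS default)
    headroom    -- assumed extra fraction reserved for growth and scratch space
    """
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return logical_tb * replication * (1 + headroom)


if __name__ == "__main__":
    # Hypothetical 100 TB data set with default replication and 25% headroom.
    print(f"Raw capacity needed: {destination_raw_tb(100):.0f} TB")
```

Sizing the target on the logical volume alone would leave it short on disk well before the copy finishes.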

We will guide you through this process so that you can ensure maximum service availability and data integrity.


Carlos Izquierdo

Datatons, Big Data architect