Migrating Big Data Workloads to Amazon EMR – June 2017 AWS Online Tech Talks

Learning Objectives:
– Identify best practices to migrate from on-premises deployments to Amazon EMR
– Understand how to move data from HDFS to Amazon S3
– Use Amazon EC2 Spot instances to save costs

Businesses are migrating their analytics, data processing (ETL), and data science workloads from on-premises Apache Hadoop and Spark deployments and data warehouse appliances to Amazon EMR to save costs, increase availability, and improve performance. Amazon EMR is a managed service that lets you process and analyze extremely large data sets using the latest versions of over 15 open-source frameworks in the Apache Hadoop and Spark ecosystems. This tech talk focuses on identifying the components and workflows in your current environment and on best practices for migrating those workloads to Amazon EMR. We will explain how to move from HDFS to Amazon S3 as a durable storage layer, and how to lower costs with Amazon EC2 Spot instances and Auto Scaling. We will also cover common security recommendations and tuning tips to accelerate the time to production.
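To make two of these topics concrete, here is a minimal Python (boto3) sketch that launches an EMR cluster whose core nodes run on Spot instances and adds an S3DistCp step to copy data from HDFS into S3. This is an illustrative example, not code from the talk: the cluster name, instance types, bid price, bucket, and paths are placeholder assumptions.

import boto3

# Sketch: EMR cluster with Spot core nodes plus an S3DistCp copy step.
# All names, sizes, prices, and paths below are illustrative assumptions.
emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="migration-example",          # hypothetical cluster name
    ReleaseLabel="emr-5.6.0",          # an EMR release current as of June 2017
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {
                "Name": "Master",
                "InstanceRole": "MASTER",
                "InstanceType": "m4.xlarge",
                "InstanceCount": 1,
                "Market": "ON_DEMAND",
            },
            {
                "Name": "Core",
                "InstanceRole": "CORE",
                "InstanceType": "m4.xlarge",
                "InstanceCount": 2,
                "Market": "SPOT",      # request Spot capacity to lower cost
                "BidPrice": "0.20",    # example maximum price in USD per hour
            },
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    Steps=[
        {
            # S3DistCp copies data from HDFS into S3 in parallel.
            "Name": "Copy HDFS data to S3",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "s3-dist-cp",
                    "--src", "hdfs:///user/hadoop/data",   # hypothetical source
                    "--dest", "s3://example-bucket/data",  # hypothetical target
                ],
            },
        }
    ],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
    VisibleToAllUsers=True,
)

print("Started cluster:", response["JobFlowId"])

The same settings can also be configured through the EMR console or the AWS CLI, and an automatic scaling policy can be attached to the core or task instance groups so the cluster grows and shrinks with load.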

About The Author
- Launched in 2006, Amazon Web Services offers a robust, fully featured technology infrastructure platform in the cloud, comprising a broad set of compute, storage, database, analytics, application, and deployment services delivered from data center locations in the U.S., Australia, Brazil, China, Germany, Ireland, Japan, and Singapore. More than a million customers, including fast-growing startups, large enterprises, and government agencies across 190 countries, rely on AWS services to innovate quickly, lower IT costs, and scale applications globally. To learn more about AWS, visit http://aws.amazon.com.
