Process Web Logs with AWS Data Pipeline, Amazon EMR, and Hive

In this video, you will learn how to use AWS Data Pipeline and a console template to create a functional pipeline.

The pipeline uses an Amazon EMR cluster and a Hive script to read Apache web access logs, select certain columns, and write the reformatted output to an Amazon S3 bucket.
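To make the transformation concrete, here is a minimal Python sketch of the kind of reformatting described above: parse each Apache access log line, keep a few columns, and emit tab-delimited rows. It is not the Hive script the template runs. The regular expression, the field names, and the choice of selected columns are assumptions made for illustration, based on the standard Apache Common Log Format.

```python
import re

# Assumed layout: Apache Common Log Format (host, identity, user, time,
# request, status, size). Field names here are illustrative only.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<identity>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical column selection; the actual pipeline may keep different fields.
SELECTED_COLUMNS = ("host", "time", "request", "status")


def reformat_line(line, columns=SELECTED_COLUMNS):
    """Return the selected fields as one tab-delimited row, or None if the
    line does not match the assumed log format."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return "\t".join(match.group(name) for name in columns)


def reformat_log(lines, columns=SELECTED_COLUMNS):
    """Reformat every parseable line and skip the rest."""
    rows = []
    for line in lines:
        row = reformat_line(line, columns)
        if row is not None:
            rows.append(row)
    return rows
```

In the pipeline itself, the input lines would come from log files in Amazon S3 and the rows would be written back to an S3 bucket; this sketch covers only the select-and-reformat step on lines already in memory.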

Learn more about AWS Data Pipeline at

About The Author
- Launched in 2006, Amazon Web Services offers a robust, fully featured technology infrastructure platform in the cloud, comprising a broad set of compute, storage, database, analytics, application, and deployment services from data center locations in the U.S., Australia, Brazil, China, Germany, Ireland, Japan, and Singapore. More than a million customers, including fast-growing startups, large enterprises, and government agencies across 190 countries, rely on AWS services to innovate quickly, lower IT costs, and scale applications globally. To learn more about AWS, visit
