Apache Avro: Data Serialization for Distributed Applications Training Course

Course Code

IntroToAvro

Duration

14 hours (usually 2 days including breaks)

Requirements

  • General familiarity with distributed computing

Overview

This course is intended for

  • Developers

Format of the course

  • Lectures, hands-on practice, and short tests along the way to gauge understanding

Course Outline

Principles of distributed computing

  • Apache Spark
  • Hadoop

Principles of data serialization

  • How a data object is passed over the network
  • Serialization of objects
  • Serialization approaches
    • Thrift
    • Protocol Buffers
    • Apache Avro
      • data structure
      • size, speed, format characteristics
      • persistent data storage
      • integration with dynamic languages
      • dynamic typing
      • schemas
        • untagged data
        • change management

Data serialization and distributed computing

  • Avro as a subproject of Hadoop
    • Java serialization
    • Hadoop serialization
    • Avro serialization

Using Avro with

  • Hive (AvroSerDe)
  • Pig (AvroStorage)

Porting Existing RPC Frameworks
