Accelerating the Pace of AWS Inferentia Chip Development: From Concept to End Customers’ Use

By Bilderback, Dayna

March 9, 2020

Speaker: Randy Huang, Ph.D.
Affiliation: Amazon Web Services (AWS)

Abstract: In this talk, we will detail the process and the decisions we made to bring AWS Inferentia from a one-page press release to general availability (GA). Our process starts with working backward from customers and asking how we can bring real benefits to their use cases. We will show that by separating one-way-door decisions from two-way-door decisions, we can navigate technical and strategic decisions at AWS velocity and bring a deep-learning accelerator to the marketplace quickly.

Biography: Randy Huang is the principal engineer and compiler lead for AWS Inferentia, a custom chip designed by AWS. Inferentia was developed to enable highly cost-effective, low-latency inference at any scale. Prior to joining AWS, he led the architecture group at Tabula, designing and building three-dimensional field-programmable gate arrays (3-D FPGAs). Randy received his Ph.D. from the University of California, Berkeley.

At the end of the seminar, technical recruiter Victor Adams will answer questions about openings with the Amazon AI chip team and collect resumes.

For more information, contact Prof. Lei He (lei.hexun@gmail.com).

Date/Time:
Date(s) - Mar 09, 2020
4:00 pm - 6:00 pm

Location:
EE-IV Shannon Room #54-134
420 Westwood Plaza - 5th Flr., Los Angeles CA 90095