I'm Aaron Li

Data Scientist | Software Engineer

About me

>>> ~ aaronleeiv$ uname

Cheng (Aaron) Li

After coming to CMU in 2013 with a bachelor's degree in aerospace engineering, I discovered my passion for applying machine learning to daily life. It would be so nice to live in a world where every object is as smart as my fuzzy-logic rice cooker, not my smartphone. That is, objects should be able to make decisions in unseen scenarios, collaboratively and offline.

After a ton of machine learning and computer vision work in Mechanical Engineering at CMU, I started a second master's degree, in Entertainment Technology, to see how machine learning can be effectively applied to, and affect, films and video games.

In summer 2015, I was inspired by Mickey McManus's article "Nature of Things" and then collaborated with him at Autodesk Inc. on a unique research project: Project Primordial.

I have finally found my ultimate career goal, and the first step is becoming a Data Scientist specializing in Natural Language Processing and the Internet of Things.

Skills

>>> ~ aaronleeiv$ dir(skill)

Python

Highly skilled

Java

Highly skilled

MapReduce

Thoroughly familiar

Hadoop

Proficient

Spark

Proficient

SQL / Pig

Capable

Experience

>>> ~ aaronleeiv$ history | experience

Jun, 2016
Present

Hulu

Software Developer - Big Data Platform | Java, Python, Scala

- Developed, scaled, and improved the data platform for the entirety of Hulu
- Contributed to the design, architecture and implementation of a data-engineering infrastructure
- Implemented solutions using Hadoop/HBase/Hive and helped contribute back to the open source community
- Invented novel solutions to challenging technical problems
- Diagnosed and debugged issues in production environments
- Collaborated with researchers, program managers, and product designers in an open, creative environment

Jan, 2016
May, 2016

Carnegie Mellon University

NLP Specialist | Java, Python

- Implemented a spell-check system based on minimum Damerau–Levenshtein distance and a trie
- Implemented a question asking/answering system for Wikipedia articles with NLTK

Sep, 2015
Jan, 2016

Carnegie Mellon University

Machine Learning Programmer | Java, Python

- Analyzed articles from DBpedia using streaming Naive Bayes on Hadoop (AWS Elastic MapReduce), logistic regression with stochastic gradient descent, and the dataflow language GuineaPig
- Analyzed RCV1 datasets with distributed stochastic gradient descent on Spark
- Implemented a graph sampling application based on Efficient Approximate PageRank

May, 2015
Aug, 2015

Autodesk Inc.

Design Intern | C#/JavaScript, Python, Unity

- Designed and built a data acquisition system with other designers, fabricators, mechanical engineers, and electrical engineers
- Collected and analyzed 4.5B data points to understand the relationship between the driver's flow and other factors related to motion, force, and environment
- Developed a car-driving simulation game powered by goal-directed generative design, and a Google Cardboard VR experience

  Video from Autodesk University

  Article from Fast Company

Jan, 2014
May, 2014

Carnegie Mellon University

Machine Learning Programmer | Python

- Implemented a decision tree learner to predict whether a song made it onto the Billboard Top 50, with all attributes converted to binary and the largest entropy gain used at each split
- Implemented a neural network with an arbitrary number of hidden layers to predict whether a song was a "hit", using the same data as the decision tree but without binarizing the attributes
- Implemented a Naive Bayes classifier for document classification
- Implemented a Hidden Markov Model and its associated algorithms for part-of-speech tagging, including the forward and backward algorithms, the Viterbi algorithm, and the Baum-Welch algorithm

Education & Diplomas

>>> ~ aaronleeiv$ history | education

May, 2016

Master of Entertainment Technology

Carnegie Mellon University, School of Computer Science & College of Fine Arts

RELEVANT COURSES:
Machine Learning with Large Datasets, Natural Language Processing, Principles of Software System Construction, Distributed Systems, Technical Animation, Visual Story

1st Prize, Microsoft Student //GameOn Contest

May, 2014

Master of Mechanical Engineering

Carnegie Mellon University, Carnegie Institute of Technology

RELEVANT COURSES:
Machine Learning, Algorithms & Advanced Data Structures for Scientists, Computer Vision, Engineering Computation, Mathematical Techniques in Engineering, Probability and Estimation Methods for Engineering Systems

May, 2013

Bachelor of Aircraft Design and Engineering

Beijing University of Aeronautics and Astronautics

RELEVANT COURSES:
Advanced Algebra, Mathematical Analysis, Probability and Statistics, Programming Languages, Numerical Methods

Portfolio

>>> Portfolio aaronleeiv$ ls

More projects are being written up as posts...

Get In Touch

>>> ~ aaronleeiv$ more

2500 Broadway, Santa Monica, CA, 90404