An index of my GitHub repositories.


Research paper implementations

  • Linear Contextual Bandits
    • Mostly Exploration-Free Algorithms for Contextual Bandits, Hamsa Bastani, Mohsen Bayati, Khashayar Khosravi
    • Adaptive Exploration in Linear Contextual Bandit, Botao Hao, Tor Lattimore, Csaba Szepesvári


  • Multi-player Multi-armed Bandits
    • Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions, Sébastien Bubeck, Thomas Budzinski, Mark Sellke
    • SIC-MMAB, SIC-MMAB2 and DYN-MMAB algorithms from SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits, Etienne Boursier, Vianney Perchet
    • EC-SIC from Decentralized Multi-player Multi-armed Bandits with No Collision Information, Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang
    • First and second algorithms from Multiplayer bandits without observing collision information, Gabor Lugosi, Abbas Mehrabian
    • MCTopM, SelfishUCB from Multi-Player Bandits Revisited, Lilian Besson, Emilie Kaufmann
    • Musical Chairs from Multi-Player Bandits – a Musical Chairs Approach, Jonathan Rosenski, Ohad Shamir, Liran Szlak
    • Randomized SelfishUCB from A High Performance, Low Complexity Algorithm for Multi-Player Bandits Without Collision Sensing Information, Cindy Trinh, Richard Combes


  • Unimodal and Rank-one bandits
    • Bernoulli Rank-1 Bandits for Click Feedback, Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
    • Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms, Richard Combes, Alexandre Proutiere
    • Unimodal Thompson Sampling for Graph-Structured Arms, Stefano Paladino, Francesco Trovò, Marcello Restelli, Nicola Gatti

Projects

  • Recommender Systems: A recommender system trained on the Jester dataset using collaborative filtering. [Still a work in progress. This is my minimum viable product at the moment!]


  • MLPerf Datasets: A program that processes a given dataset, such as Google Open Images, to mimic the features of the ImageNet, ADE20K, and COCO datasets. I wrote it in collaboration with Michael Buch as part of the MLCommons mobile team, and it was then used for the MLCommons mobile app.



Implementations for my personal understanding

I like to reimplement algorithms and experiment with new concepts as I learn them, to better understand their inner workings and subtleties. These repositories contain algorithms that I have studied in class (sometimes as part of a group), while doing research, or on my own.

  • Machine Learning Algorithms: Fundamentals of machine learning algorithms/concepts, and applications to toy problems.
    • Logistic regression
    • Linear Discriminant Analysis (LDA) / Quadratic Discriminant Analysis (QDA)
    • Expectation-Maximization algorithm
    • K-means
    • Principal Component Analysis (PCA) and Singular Value Decomposition (SVD)



  • Trajectory Optimization: Optimal control algorithms. The theory is elegant, and the algorithms are interesting to compare with black-box Reinforcement Learning algorithms on simple problems such as cartpole.
    • Linear Quadratic Regulator (LQR)
    • iterative Linear Quadratic Regulator (iLQR)
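
To give a flavor of the trajectory optimization entries above, here is a minimal sketch of finite-horizon, discrete-time LQR solved by a backward Riccati recursion. It is not taken from the repository: the function names `lqr_gains` and `simulate`, the double-integrator system, the cost matrices, and the choice of terminal cost equal to `Q` are all assumptions made for illustration.

```python
import numpy as np


def lqr_gains(A, B, Q, R, horizon):
    """Return time-varying feedback gains for x_{t+1} = A x_t + B u_t.

    Minimizes sum_t (x_t^T Q x_t + u_t^T R u_t) with terminal cost x^T Q x
    (the terminal cost is an assumption of this sketch).
    """
    P = Q.copy()  # cost-to-go matrix at the final step
    gains = []
    for _ in range(horizon):
        # K = (R + B^T P B)^{-1} B^T P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P = Q + A^T P (A - B K)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()  # gains[t] is now the gain to apply at time step t
    return gains


def simulate(A, B, gains, x0):
    """Roll out the closed-loop system under u_t = -K_t x_t."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x]
    for K in gains:
        u = -K @ x
        x = A @ x + B @ u
        trajectory.append(x)
    return np.array(trajectory)


# Hypothetical example: a double integrator (position, velocity), step 0.1.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)           # state cost (assumed)
R = np.array([[0.1]])   # control cost (assumed)

gains = lqr_gains(A, B, Q, R, horizon=100)
trajectory = simulate(A, B, gains, x0=[1.0, 0.0])
```

The controller drives the state from `[1.0, 0.0]` toward the origin. iLQR applies the same backward pass repeatedly to a local linear-quadratic approximation of a nonlinear problem such as cartpole.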