An index of my GitHub repositories.
Research paper implementations
- Linear Contextual Bandits
  - Mostly Exploration-Free Algorithms for Contextual Bandits, Hamsa Bastani, Mohsen Bayati, Khashayar Khosravi
  - Adaptive Exploration in Linear Contextual Bandit, Botao Hao, Tor Lattimore, Csaba Szepesvári
- Multi-player Multi-armed Bandits
  - Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions, Sébastien Bubeck, Thomas Budzinski, Mark Sellke
  - SIC-MMAB, SIC-MMAB2 and DYN-MMAB algorithms from SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits, Etienne Boursier, Vianney Perchet
  - EC-SIC from Decentralized Multi-player Multi-armed Bandits with No Collision Information, Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang
  - First and second algorithms from Multiplayer bandits without observing collision information, Gábor Lugosi, Abbas Mehrabian
  - MCTopM, SelfishUCB from Multi-Player Bandits Revisited, Lilian Besson, Emilie Kaufmann
  - Musical Chairs from Multi-Player Bandits – a Musical Chairs Approach, Jonathan Rosenski, Ohad Shamir, Liran Szlak
  - Randomized SelfishUCB from A High Performance, Low Complexity Algorithm for Multi-Player Bandits Without Collision Sensing Information, Cindy Trinh, Richard Combes
- Unimodal and Rank-one bandits
  - Bernoulli Rank-1 Bandits for Click Feedback, Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
  - Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms, Richard Combes, Alexandre Proutiere
  - Unimodal Thompson Sampling for Graph-Structured Arms, Stefano Paladino, Francesco Trovò, Marcello Restelli, Nicola Gatti
Projects
- Recommender Systems: A recommender system trained on the Jester dataset using collaborative filtering. [Still a work in progress; this is my minimum viable product at the moment!]
- MLPerf Datasets: A program that processes a given dataset, such as Google Open Images, to mimic the features of the ImageNet, ADE20K, and COCO datasets. I wrote it in collaboration with Michael Buch as part of the MLCommons mobile team, and it was then used for the MLCommons mobile app.
- Goose Game: An implementation of the Goose Game (“Jeu de l’oie”) in Java.
Implementations for my personal understanding
I like to reimplement algorithms and experiment with new concepts I learn, to better understand their inner workings and subtleties. These repositories contain several algorithms that I have studied in class (sometimes as group work), while doing research, or on my own.
- Machine Learning Algorithms: Fundamentals of machine learning algorithms and concepts, with applications to toy problems.
  - Logistic regression
  - Linear Discriminant Analysis (LDA) / Quadratic Discriminant Analysis (QDA)
  - Expectation-Maximization algorithm
  - K-means
  - Principal Component Analysis (PCA) and Singular Value Decomposition (SVD)
- Reinforcement Learning Algorithms: Implementations of classic algorithms, applied to toy environments such as CartPole.
  - REINFORCE
  - Actor-Critic
- Trajectory Optimization: Optimal control algorithms with an elegant theory, interesting to compare with black-box reinforcement learning algorithms on simple problems such as CartPole.
  - Linear Quadratic Regulator (LQR)
  - Iterative Linear Quadratic Regulator (iLQR)