Resources for DS & ML & DL

Anh-Thi Dinh

Blogs & Tuts

  • Airbnb — Engineering & Data Science – Medium.
  • AI Curious — Viet Anh's personal blog, in Vietnamese.
  • Google's AI Hub -- a hosted repository of plug-and-play AI components, such as notebooks, pipelines, and trained models, for sharing and deploying machine learning assets across teams.
  • Google Codelabs -- Google Developers Codelabs provide guided, hands-on coding tutorials. Most codelabs step you through building a small application or adding a new feature to an existing one. They cover a wide range of topics such as Android Wear, Google Compute Engine, Project Tango, and Google APIs on iOS.

Books

Services & APIs

  • Mapbox — Precise location data and powerful developer tools to change the way we navigate the world.
  • OpenStreetMap — a map of the world, created by people like you and free to use under an open license.

Frameworks

  • Caffe — deep learning framework.
  • D3.js — Data-Driven Documents.
  • Hydra — a framework for elegantly configuring complex applications, developed by Facebook Research.

Python libs

  • daft — a Python package that uses matplotlib to render pixel-perfect probabilistic graphical models for publication in a journal or on the internet.
  • CSAPS — a Python package for univariate, multivariate and n-dimensional grid data approximation using cubic smoothing splines. The package can be useful in practical engineering tasks for data approximation and smoothing.

For Vietnamese

  • KbQAS (ISWC 2013): Video demo of the knowledge-based Vietnamese question answering system KbQAS.
  • PhoBERT (EMNLP 2020 Findings): Pre-trained language models for Vietnamese.
  • PhoW2V (2020): Pre-trained Word2Vec syllable- and word-level embeddings for Vietnamese.
  • RDRsegmenter (LREC 2018): A fast and accurate Vietnamese word segmenter.
  • ViText2SQL (EMNLP 2020 Findings): A dataset for Vietnamese Text2SQL semantic parsing.
  • VnCoreNLP (NAACL 2018): A Vietnamese NLP pipeline of word (and sentence) segmentation, POS tagging, named entity recognition and dependency parsing.
  • VnDT (NLDB 2014): A Vietnamese dependency treebank.
  • VnMarMoT (ALTA 2017): A pre-trained Vietnamese POS tagging model.

Tools

  • DeepKit -- a collaborative, real-time, open-source machine learning devtool and training suite for experiment execution, tracking, and debugging, with server and project management tools.
  • Flourish — Data Visualization & Storytelling.
  • Foursquare — Put the most trusted, independent location data and technology platform to work for your business.
  • idyll — A toolkit for creating data-driven stories and explorable explanations.
  • Mapbox — Maps and location for developers.
  • ml5.js -- Friendly Machine Learning For The Web.
  • nbdev — Create delightful Python projects using Jupyter Notebooks.
  • Observable — the magic notebook for exploring data and thinking with code.
  • Replicate — Version control for machine learning.
  • Streamlit — The fastest way to build data apps in Python.
  • Travis CI — a hosted continuous integration service used to build and test software projects hosted on GitHub and Bitbucket.
  • Vaex — a Python library for lazy, out-of-core DataFrames, built for handling huge tabular datasets.