10 Useful Repositories for Data Science & Machine Learning

10 Useful GitHub Repositories for everyone working with DS and ML

GitHub is the Instagram, Facebook, Snapchat, and basically everything for programmers. It is not only one of the most popular code management tools out there but also the platform that is the biggest contributor to popularizing the open-source culture. It is a programmer’s passport for learning, implementing, and getting into a wonderful career at some of the world’s most prestigious tech companies.

In this article, we will go through some of the most useful GitHub repositories for Data Science and Machine Learning enthusiasts. Let’s begin, in no particular order.

1. The Algorithms

“Data structures” is probably the most popular meaning of DS in computer science, maybe followed by data science. Every computer science student must know data structures; they are among the most important things to learn if you plan on pursuing a long career in this field. And what better way to learn them than through code? There is a piece of the pie for every kind of CS aspirant in this collection. Whether you are an ML engineer, web dev, mobile dev, or an undergrad student, this is one repository that you should have in your bookmark collection. The project also has a website for viewing and running the code in more than 10 popular languages.

Algorithms implemented in various languages
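To give a feel for the kind of short, readable implementation such a collection is built around, here is a minimal binary search sketch in Python. It is written for this article as an illustration and is not copied from the repository; the function name and signature are my own choices.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2  # middle of the current search window
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1


if __name__ == "__main__":
    print(binary_search([1, 3, 5, 7, 9, 11], 7))
```

The list must already be sorted; each step halves the search window, which is why the lookup takes logarithmic rather than linear time.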

2. 100-Days-Of-ML-Code

The name of this repo is self-explanatory: it contains a complete 100-day plan for learning ML. It has some very valuable contributions from a bunch of open-source enthusiasts, and the datasets are included within the repo. Another notable feature of this repository is the graphical poster for each day, which summarizes the daily learning plan. These are very useful if printed posters motivate you to stay on track.

100 Days of ML daily learning plans

3. data-science

This repository is an opportunity for people who want to complete a Data Science undergraduate curriculum on their own time, for free, with courses from some of the best universities in the world. The curriculum gives preference to MOOC (Massive Open Online Course) style courses because they are better suited to self-paced learning.

Students who are following this path can interact with each other through different community channels, GitHub issues, the repo’s Discord server, etc. It really brings a feeling of collective learning, and in today’s open world, free education initiatives like this should be the way to go rather than expensive courses.

Data science learning curriculum

4. public-apis

Well, not much to say here except that this is one of the best collections of public APIs available for every kind of developer out there. It has a very large list of APIs, organized into categories. The best part is the table, which gives you a basic idea of each API's Auth, HTTPS, and CORS status without your having to read the individual documentation. Needless to say, it can be very handy when you’re planning on creating sample data sets for your project as well.
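As a rough sketch of the "sample data set" idea, the snippet below turns a JSON response into CSV text using only the Python standard library. The payload, the field names, and the `records_to_csv` helper are all hypothetical stand-ins; a real API from the list will have its own response shape.

```python
import csv
import io
import json


def records_to_csv(json_text, fields):
    """Convert a JSON array of objects into CSV text with the given columns."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells; unlisted fields are dropped.
        writer.writerow({field: record.get(field, "") for field in fields})
    return buffer.getvalue()


# Hypothetical payload standing in for a real API response.
SAMPLE_RESPONSE = (
    '[{"id": 1, "city": "Oslo", "temp_c": 4.5, "extra": "x"},'
    ' {"id": 2, "city": "Lima", "temp_c": 19.0}]'
)

if __name__ == "__main__":
    print(records_to_csv(SAMPLE_RESPONSE, ["id", "city", "temp_c"]))
```

In practice you would replace `SAMPLE_RESPONSE` with the body returned by an API you picked from the list, for example by reading it with `urllib.request.urlopen`, after checking that API's auth requirements in the table.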

5. awesome-machine-learning

Just like the name says, this repository contains a curated list of awesome machine learning frameworks, libraries, and software. The list is categorized by language and by type of machine learning tool. If you are starting out with machine learning, bookmarking this repository can help you find the right tools for your tasks.

6. project-based-learning

The companies I have worked for always used to purchase expensive courses for software engineers to pick up new topics or brush up on old skills. But listening to hours of tutorials is something that personally bores me, and after one or two lessons I quickly get into project-building mode. It may not be everyone’s idea of learning, but in my experience you learn much more from projects than from other methods. Of course, you need knowledge of syntax and theory, but to face real-world problems and solve issues you may run into in the future, you must try applying what you learn in at least one good project using the language or skill you are learning.

Whether it is Web/Mobile development, Machine Learning, or Data Science, this repository has a good collection of projects that you can refer to for learning and future use. The list is categorized language-wise and under each language, you can find numerous projects from various topics.

7. Complete-Python-3-Bootcamp

Python is a language that has become so popular that people no longer think about the reptile when they hear the word. It has become the go-to language for Data Science, Machine Learning, AI, and even web dev. This repository contains the files for one of the most popular, highly rated, and complete Python 3 bootcamp courses on Udemy.

8. awesome-python

As mentioned above, there is no question about the popularity of Python in 2022. One of the main reasons for this popularity is the large number of libraries and frameworks available for it. This repository has a curated list of libraries, frameworks, and resources specifically for Python. Whether it’s building admin dashboards or working with web sockets, this repository covers it all.

Python Frameworks and Libraries

9. tensorflow

If you are into ML and data science, you probably know about TensorFlow already. It is an end-to-end open-source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications.

TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google’s Machine Intelligence Research organization to conduct machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains, as well.

TensorFlow repo

10. DeepLearning-500-questions

As the name suggests this repository is a comprehensive body of knowledge on Deep Learning and Artificial Intelligence. It contains a collection of articles on the mathematical and technical aspects of Deep Learning and Artificial Intelligence that will help you build a strong foundation of knowledge in the respective domain.

Most popular DL and AI questions

Bonus

11. gitignore

If you’re working with git for your projects, you will have come across this hidden file, which has more significance than you might have thought. This tiny file and its few lines are what keep your repository lightweight and easily accessible for your teammates. Without it, a basic pull or push would take hours with all the clutter from libraries and packages.

This repository is maintained by GitHub itself and it is a collection of [.gitignore](http://git-scm.com/docs/gitignore) file templates for every possible language or framework that you can think of. So next time you accidentally edit or delete this tiny but important file head over to this repo to grab a copy.

gitignore files for various languages and frameworks
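To illustrate how a few ignore patterns keep clutter out of a repository, here is a deliberately simplified matcher in Python. It is only an approximation: real git ignore rules also cover negation with `!`, leading-slash anchoring, `**`, and more. The patterns and the `is_ignored` helper are hypothetical examples, not quoted from any template in the repo, and the function assumes every path passed in points to a file.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical patterns of the kind a project's .gitignore might contain.
PATTERNS = ["__pycache__/", "*.pyc", ".env", "node_modules/"]


def is_ignored(path, patterns=PATTERNS):
    """Approximate check of whether a file path matches any ignore pattern."""
    parts = PurePosixPath(path).parts
    directories = parts[:-1]  # every component except the file name
    for pattern in patterns:
        if pattern.endswith("/"):
            # A trailing slash means the pattern only matches directories.
            name = pattern.rstrip("/")
            if any(fnmatchcase(d, name) for d in directories):
                return True
        elif any(fnmatchcase(p, pattern) for p in parts):
            # A pattern without a slash can match at any level of the path.
            return True
    return False


if __name__ == "__main__":
    for candidate in ["src/main.py", "src/__pycache__/main.cpython-311.pyc", ".env"]:
        print(candidate, is_ignored(candidate))
```

Running this shows that the source file is kept while the cache file and the environment file are skipped, which is the effect the real file has on what git tracks.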

12. metrics

Building open-source projects or contributing to one is a good way to learn and to showcase your new and existing skills. Gone are the days when companies hired from objective-type tests. GitHub and LinkedIn profiles are going to be the game changers for people’s careers going forward. Your projects and contributions on GitHub can tell tech recruiters a lot about the kind of developer you are and where your focus lies, so it is always important to maintain a good GitHub profile.

Metrics is a tool that generates infographics about you, which you can embed on your GitHub profile to let other users know more about you. You can display your git stats along with an isometric calendar, favorite music, website performance, latest tweets, projects, languages, etc. It may look like overkill, but if used correctly it can really help your profile stand out.

Changing the look of a GitHub profile using Metrics

Let me end the article by putting up a disclaimer that I am not personally associated with any of the repositories or tools mentioned above nor have I received any incentives to promote them. Feel free to add more repositories that you think would be useful in the comments.
