Lima Vallantin
Wilame
Marketing data scientist and Master's student interested in everything concerning data, text mining, and natural language processing. Currently speaking Brazilian Portuguese, French, English, and a tiiiiiiiiny bit of German. Want to connect? You can send me a message. For more information about me, you can visit this page.

Have you ever tried to install a package from GitHub in Google Colab? Some time ago, I had this same question. With a little research, I found the solution, which I share with you in this post. And, believe me, importing a Python package from GitHub is easier than you think!

Why is it so important?

Learning how to import a package may seem like a secondary task, but sometimes you will want to reuse a module or some other code you wrote in a TensorFlow project.

You can upload files through the Colab interface. But when a session ends, those files are gone and you have to upload everything again by hand.

A simpler way is to clone a GitHub repository and use the command line to automate this step.

Import and install

The first thing to do is to clone the repository you want to use.

Then use %cd to navigate to the directory you want.

⚠️ If you have already run the clone command in this session, Git will stop with an error saying that the destination path already exists. You can restart your session (everything will be erased) or use the pull command to fetch the changes made to the repository before installing the requirements and the package itself.

The package I am trying to install depends on other packages to work. Let's install these using the !pip command and the requirements.txt file.

Then, I will install the package itself. Use the command !python setup.py install to do it. If this command prints a deprecation warning on a recent setuptools version, running !pip install . from the same directory is the currently recommended equivalent.

The last thing to do is to import your package!

# clone package repository
!git clone https://github.com/vallantin/atalaia.git

# navigate to atalaia directory
%cd atalaia

# get modifications made on the repo
!git pull origin master

# install the package's requirements
!pip install -r requirements.txt

# install package
!python setup.py install

# import it
from atalaia.atalaia import Atalaia
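
If you rerun the notebook often, the clone-or-pull choice from the warning above can be automated. Here is a minimal sketch in plain Python, assuming git is available on the machine (it is on Colab). The helper names sync_command and sync_repo are made up for this example and are not part of atalaia or Colab, and the "master" default branch is only taken from the pull command above.

```python
import subprocess
from pathlib import Path
from typing import List


def sync_command(repo_url: str, target_dir: str, branch: str = "master") -> List[str]:
    """Return the git command to run: clone on first use, pull afterwards."""
    if (Path(target_dir) / ".git").is_dir():
        # The repository was already cloned in this session: just update it.
        return ["git", "-C", target_dir, "pull", "origin", branch]
    # Fresh session: nothing on disk yet, so clone.
    return ["git", "clone", repo_url, target_dir]


def sync_repo(repo_url: str, target_dir: str, branch: str = "master") -> None:
    """Run the command chosen by sync_command and raise if git fails."""
    subprocess.run(sync_command(repo_url, target_dir, branch), check=True)


# Example call for a Colab cell (left commented out so nothing is downloaded
# until you decide to run it):
# sync_repo("https://github.com/vallantin/atalaia.git", "atalaia")
```

Because the check happens before git is called, the same cell can be run several times in one session without hitting the "already exists" error.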

Installing packages can have unexpected effects on your environment.

Different packages can share the same dependencies but require different versions of them.

If you run into problems, try to force a reinstall of the packages that are not working.

# !pip install --upgrade --force-reinstall some_package_here
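
Before forcing a reinstall, it can help to see which versions actually ended up in the environment. The sketch below uses the standard library's importlib.metadata (Python 3.8 or later). The function name installed_versions and the package names in the loop are only examples, so replace them with the packages your project depends on.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


# Example package names only; list the ones your project needs.
for name, ver in installed_versions(["numpy", "pandas", "tensorflow"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Running this once before and once after the install step makes it easier to spot which dependency was upgraded or downgraded along the way.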

Finally, import the other packages and pray for the best 🙈. Hopefully, you won’t spend one hour debugging as I did.

# do other imports to test if we have errors
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

import numpy as np
import pandas as pd

from pprint import pprint
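
If one of these imports fails, the cell stops at the first error and you do not see whether the others would have worked. A small sketch of an alternative is to try every module and collect the failures. The function name failed_imports is hypothetical, and the module list in the commented example simply mirrors the imports used in this post.

```python
import importlib


def failed_imports(module_names):
    """Try to import each module; return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # ImportError is the usual case, but a broken install can raise
            # other exceptions while the module is being loaded.
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Example for a Colab cell (commented out; adjust the list to your project):
# modules = ["atalaia.atalaia", "tensorflow", "tensorflow_hub",
#            "tensorflow_datasets", "numpy", "pandas"]
# for name, error in failed_imports(modules).items():
#     print(f"{name} failed -> {error}")
```

An empty result means every module was imported without errors.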

Conclusion: what we learned today

Importing external packages is useful when you need to use something you developed locally.

Since not everyone has a GPU at home, training deep learning models on Colab is very handy.
