Using the Reddit API, I built a dataset of first-person descriptions of diseases and symptoms written by real people. The result is a NER model that can extract medical entities from text.
Skills & tools: Python, REST APIs, SQL, Pandas, spaCy, Hugging Face Transformers, Docker, Azure, data mining, and data cleaning.
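A minimal sketch of the kind of text normalization such a dataset needs before NER annotation; the function name and the exact regex patterns are illustrative, not the production pipeline:

```python
import re

def clean_reddit_post(text: str) -> str:
    """Normalize a Reddit post before annotation:
    strip URLs, u/ mentions, r/ subreddit links, and extra whitespace."""
    text = re.sub(r"https?://\S+", " ", text)        # drop URLs
    text = re.sub(r"/?u/[A-Za-z0-9_-]+", " ", text)  # drop user mentions
    text = re.sub(r"/?r/[A-Za-z0-9_]+", " ", text)   # drop subreddit links
    return re.sub(r"\s+", " ", text).strip()

post = "I've had migraines for years, see r/migraine or https://example.com"
print(clean_reddit_post(post))
```

The cleaned strings can then be annotated and fed to a spaCy or Transformers NER training loop.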
I have been using Natural Language Processing for text mining and to extract useful information from data.
For Essencialia, I use techniques such as POS tagging (to extract place names from text), sentiment analysis (to find news articles about violent districts), and parsing techniques to clean and process text.
Skills & tools: spaCy, Python, Regex, scikit-learn, Machine Learning, Natural Language Processing, Aruana & Atalaia (proprietary Python libraries that I have created for text processing).
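The spaCy POS pipeline itself requires a trained model, so as a self-contained illustration of the place-name-extraction idea, here is a minimal gazetteer lookup; the district names and the function name are assumed examples, not the real pipeline:

```python
# Hypothetical gazetteer of Aracaju districts; the real pipeline
# combines spaCy POS tags with lookups like this one.
DISTRICTS = {"atalaia", "jardins", "farolandia", "centro"}

def extract_districts(text: str) -> list[str]:
    """Return known district names mentioned in a news snippet."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return [t for t in tokens if t in DISTRICTS]

print(extract_districts("Shooting reported in Atalaia, near Centro."))
# → ['atalaia', 'centro']
```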
I developed the full ETL process for Essencialia.com. Every day, I scrape real estate listing sites with Selenium and Python to extract unstructured real estate data for the city of Aracaju (Brazil).
Data is preprocessed, cleaned, and loaded into a PostgreSQL database.
Skills & tools: Selenium, Python, PostgreSQL, SQL, BeautifulSoup, Pandas, Statistics.
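A self-contained sketch of the transform-and-load step of such a pipeline, using sqlite3 in place of the PostgreSQL target so it runs anywhere; the field names and sample records are illustrative:

```python
import sqlite3

# Illustrative listing records as a scraper might emit them (fields assumed).
raw_listings = [
    {"title": "Apto 3 qtos, Jardins", "price": "R$ 450.000", "area_m2": "98"},
    {"title": "Casa, Atalaia", "price": "R$ 780.000", "area_m2": "210"},
]

def parse_price(raw: str) -> float:
    """'R$ 450.000' -> 450000.0 (Brazilian thousands separator)."""
    return float(raw.replace("R$", "").strip().replace(".", "").replace(",", "."))

conn = sqlite3.connect(":memory:")  # stand-in for the Postgres target
conn.execute("CREATE TABLE listings (title TEXT, price REAL, area_m2 REAL)")
conn.executemany(
    "INSERT INTO listings VALUES (?, ?, ?)",
    [(r["title"], parse_price(r["price"]), float(r["area_m2"])) for r in raw_listings],
)
print(conn.execute("SELECT COUNT(*), MAX(price) FROM listings").fetchone())
# (2, 780000.0)
```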
Computers can see, and the consequences and opportunities are huge. In 2019, with data privacy in mind, I developed this small program to anonymize faces in real-time video.
As a Data Scientist, I will always work to mitigate the bad consequences of using AI. This means no assassin drones, no voice assistants hearing someone’s private conversations, and no malicious use of AI.
Ethics is not a choice. It’s a responsibility towards society.
Skills & tools: OpenCV, Computer Vision, Face detection, Data Privacy.
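Face detection itself relies on OpenCV (e.g. a Haar cascade), but the anonymization step can be sketched with the standard library alone: pixelate a detected box on a grayscale image by averaging each tile. All names and the block size here are illustrative:

```python
def pixelate(img, x, y, w, h, block=4):
    """Anonymize a face region by replacing each block x block tile
    with its mean value. `img` is a 2D list of grayscale pixels;
    (x, y, w, h) is the detected bounding box."""
    for by in range(y, y + h, block):
        for bx in range(x, x + w, block):
            tile = [img[j][i]
                    for j in range(by, min(by + block, y + h))
                    for i in range(bx, min(bx + block, x + w))]
            mean = sum(tile) // len(tile)
            for j in range(by, min(by + block, y + h)):
                for i in range(bx, min(bx + block, x + w)):
                    img[j][i] = mean
    return img
```

In a real-time loop, this would run on every frame, on each box returned by the detector.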
All the real estate findings are presented on Essencialia.com. I use a mix of Tableau embeds and HTML to present the information in a fun and interactive way.
Data is analysed and transferred to a WordPress website using PHP. During the exploratory analysis step, I use Matplotlib, Pandas, SQL, and Python.
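As a stdlib-only sketch of one metric typically computed during this kind of exploration, price per square metre over a few listings; the numbers and field names are illustrative, not real data:

```python
from statistics import mean, median

# Assumed sample of listing records (values are made up for illustration).
listings = [
    {"district": "Jardins", "price": 450000, "area_m2": 98},
    {"district": "Atalaia", "price": 780000, "area_m2": 210},
    {"district": "Jardins", "price": 520000, "area_m2": 110},
]

per_m2 = [l["price"] / l["area_m2"] for l in listings]
print(f"median R$/m2: {median(per_m2):.0f}, mean R$/m2: {mean(per_m2):.0f}")
```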