
· 11 min read


Introduction

We have received several questions about how Tanuki can be used in conjunction with web scraping, so we want to dedicate this post to sharing a few quick strategies we have found helpful.

We'll explore four examples of how to use Tanuki with BeautifulSoup and Selenium to scrape a website and extract the desired content fields into a pydantic data structure:

  • Quotes extracted from their div elements
  • A cocktail recipe assembled from the entire page
  • Rental units scraped with a User-Agent header to mimic a browser session
  • Airbnb listings scraped with a Selenium webdriver
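
As a taste of the first example, here is a minimal sketch of pulling quotes out of their div elements with BeautifulSoup and into a pydantic model. The HTML snippet and class names are hypothetical stand-ins for a real quotes page; in the full examples, Tanuki takes care of mapping messier scraped text into the pydantic fields.

```python
from bs4 import BeautifulSoup
from pydantic import BaseModel

# Hypothetical markup standing in for a real quotes page.
HTML = """
<div class="quote">
  <span class="text">Simplicity is the ultimate sophistication.</span>
  <small class="author">Leonardo da Vinci</small>
</div>
"""

class Quote(BaseModel):
    text: str
    author: str

soup = BeautifulSoup(HTML, "html.parser")
quotes = [
    Quote(
        text=div.find("span", class_="text").get_text(strip=True),
        author=div.find("small", class_="author").get_text(strip=True),
    )
    for div in soup.find_all("div", class_="quote")
]
print(quotes[0].author)  # Leonardo da Vinci
```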

The code for these examples can be found at the following links:

· 14 min read


Introduction

For many of us web developers, the recent LLM craze is something we have often heard about, but not really used. We know that LLMs are powerful, but we don't really know how to use them in our applications. One of the main reasons is the mismatch between the technologies at the core of LLMs and those used to build web applications. LLMs are typically written in Python, and so are the libraries used to interact with them. Many of us web developers are not familiar with Python, or are unsure how to integrate it into our development stacks. And even when we are, we often find that designing an entire application in Python just to use LLMs is not very appealing.

This challenge will disappear soon once we release Tanuki's TypeScript library. But until then, this tutorial provides the next best thing.

· 5 min read


AI functions in a nutshell

In early 2023, we started to see the emergence of “AI functions”, such as this post by Databricks. Although the exact definition is still being crystallized, we wanted to explain the high-level concept of AI functions and the motivation behind them.

In Python, functions are fundamental building blocks of software that can be characterized as 1) formally defined (in input, output, and behavior), 2) modular, 3) easy to implement and invoke, and 4) variable in time and memory complexity.

Large language models (LLMs), on the other hand, are 1) ill-defined (prompt wrangling is required and outputs can vary widely), 2) particularistic (high inter-model variability), 3) difficult to call in code, and 4) constant in execution latency and memory consumption. However, they are incredibly powerful and can perform difficult tasks without requiring direct implementation details from the application developer, e.g. classifying thousands of emails, generating poems, or extracting key insights from a tweet.

AI functions are a way of combining the structure and definition of classic functions with the power and versatility of LLMs, enabling developers to create AI-native software. In practice, an AI function is the regularization of input and output to an LLM, so that such a function can be called in code and behave just like a normal function, but without any code supplied by the developer.
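
To make that regularization concrete, here is a toy sketch of an AI function wrapper. This is not Tanuki's actual API: `call_llm` is a canned stub standing in for a real model call, and the wrapper simply builds a prompt from the signature and docstring, then checks the model's output against the declared return type.

```python
import inspect
import typing

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; returns a canned answer for the demo.
    return "Good"

def ai_function(fn):
    """Route calls to an undefined function through an LLM, using the
    signature and docstring as the specification of its behavior."""
    sig = inspect.signature(fn)

    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        prompt = (
            f"Task: {fn.__doc__}\n"
            f"Inputs: {dict(bound.arguments)}\n"
            f"Return one of: {typing.get_args(sig.return_annotation)}"
        )
        raw = call_llm(prompt)
        # Regularize the output: reject anything outside the declared type.
        if raw not in typing.get_args(sig.return_annotation):
            raise ValueError(f"LLM output {raw!r} violates the contract")
        return raw

    return wrapper

@ai_function
def classify_sentiment(message: str) -> typing.Literal["Good", "Bad"]:
    """Classify the sentiment of a customer message."""

print(classify_sentiment("I love this product!"))  # Good (from the stub)
```

The developer supplies no function body at all; the signature and docstring are the entire specification, which is what lets the wrapped function behave like a normal, typed Python function.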

· 6 min read


Introduction

We learned about this project when a trader wanted to build a productivity app that emailed him news articles relevant to his investment strategies. Because keywords like "NVIDIA" and "GDPR" generate a lot of noise (imagine getting every article written about NVIDIA these days), he needed something more accurate that could handle a range of topics and concepts, from a specific company's earnings to global events.

Given that he wanted to cover thousands of articles each day, using GPT-4 and related APIs was cost-prohibitive. At $0.06/1K tokens, he'd be looking at $60-100/day, or $1,800-3,000/month, which was too much for a productivity tool (source).
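
The back-of-the-envelope math behind those figures, assuming roughly 1,000 tokens per article (the per-article token count is our assumption for illustration; the prices are the ones cited above):

```python
# GPT-4 cost estimate from the figures in the post.
price_per_1k_tokens = 0.06   # USD per 1K tokens, as cited above
articles_per_day = 1_000     # lower end of "thousands of articles"
tokens_per_article = 1_000   # assumed average prompt + completion size

daily_cost = articles_per_day * tokens_per_article / 1_000 * price_per_1k_tokens
monthly_cost = daily_cost * 30

print(f"${daily_cost:.0f}/day, ${monthly_cost:.0f}/month")  # $60/day, $1800/month
```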

Instead, he built this Python app in an hour using Tanuki. Now we want to show you how to do the same. If you're impatient, you can find the repo and use case at the following links: