Sheets is a tool for building datasets using AI models. It offers:
- Real-time iteration: Building high-quality and diverse datasets involves carefully designing and combining prompts, trying out different models, a lot of trial and error, and spending time looking at your data. Sheets accelerates dataset iteration with an interactive and progressive workflow, enabling you to test many things and see the results instantly.
- In-context learning using human demonstrations: One of the biggest frustrations when building datasets with AI is prompts' brittleness. You often need to spend hours tuning the language of your prompt to avoid specific failures, ensure correct formatting, etc. Adding few-shot examples to your prompt is one of the most effective solutions to these issues. However, writing these examples by hand is time-consuming and challenging. In Sheets, you just need to edit/select good examples, which are automatically included in the data generation process.
- The latest open-source models: Sheets enables you to use the latest and most powerful models, thanks to Hugging Face Inference Providers.
- Cost-efficiency: Instead of launching 100s of inference calls to experiment with prompts and pipelines, Sheets enables you to test and build in smol steps (a few rows at a time!). This saves money and energy and leads to higher-quality datasets; you get to look at your data and tune the generation process as you go.
- Go from smol to great: Many big things, like the universe, start from something very smol. To build great datasets, it's better to build the perfect small dataset for your use case and then scale it up. Sheets enables you to build datasets and pipelines progressively. Once you're satisfied with your dataset, you can use the generated configuration to scale up the size of your dataset (if needed).
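To make the few-shot idea from the list above concrete, here is a minimal sketch of how selected examples could be folded into a prompt. The `Example` type, the `buildFewShotPrompt` function, and the prompt layout are assumptions made for illustration; they are not taken from the Sheets codebase.

```typescript
// Hypothetical shape of a row that a user has edited or marked as a good example.
interface Example {
  input: string;
  output: string;
}

// Place the validated examples between the instruction and the new input,
// so the model can imitate their format and style.
function buildFewShotPrompt(
  instruction: string,
  examples: Example[],
  input: string,
): string {
  const shots = examples.map(
    (ex, i) => `Example ${i + 1}\nInput: ${ex.input}\nOutput: ${ex.output}`,
  );
  return [instruction, ...shots, `Input: ${input}\nOutput:`].join("\n\n");
}

// Usage: two hand-picked rows steer the generation for a third row.
const fewShotPrompt = buildFewShotPrompt(
  "Classify the sentiment of the review as positive or negative.",
  [
    { input: "Loved it, would buy again.", output: "positive" },
    { input: "Broke after two days.", output: "negative" },
  ],
  "Works fine, nothing special.",
);
console.log(fewShotPrompt);
```

With an empty example list the same function degrades to a plain zero-shot prompt, which is roughly the starting point before any rows have been validated.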
https://marketplace.visualstudio.com/items?itemName=rluvaton.vscode-vitest
https://marketplace.visualstudio.com/items?itemName=biomejs.biome
This project is using Qwik with QwikCity. QwikCity is just an extra set of tools on top of Qwik to make it easier to build a full site, including directory-based routing, layouts, and more.
Inside your project, you'll see the following directory structure:
├── public/
│   └── ...
└── src/
    ├── components/ --> Stateless components
    │   └── ...
    ├── features/ --> Components with business logic
    │   └── ...
    └── routes/
        └── ...
- src/routes: Provides the directory-based routing, which can include a hierarchy of layout.tsx layout files, and an index.tsx file as the page. Additionally, index.ts files are endpoints. Please see the routing docs for more info.
- src/components: Recommended directory for components.
- public: Any static assets, like images, can be placed in the public directory. Please see the Vite public directory for more info.
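As a rough illustration of directory-based routing, the sketch below maps a file under src/routes to the URL it would serve. It is a simplified model written for this README, not QwikCity's implementation: `routeForFile` is a hypothetical helper, the `about` and `api/health` paths are made up, the trailing slash is a choice made here, and real features such as dynamic segments are ignored.

```typescript
// Simplified model: only index.tsx (pages) and index.ts (endpoints) produce a URL.
// layout.tsx files wrap pages but do not create routes of their own.
function routeForFile(filePath: string): string | null {
  const prefix = "src/routes/";
  if (!filePath.startsWith(prefix)) return null;

  const parts = filePath.slice(prefix.length).split("/");
  const fileName = parts.pop();
  if (fileName !== "index.tsx" && fileName !== "index.ts") return null;

  // Each remaining directory becomes one URL segment.
  return "/" + parts.map((segment) => segment + "/").join("");
}

console.log(routeForFile("src/routes/index.tsx"));
console.log(routeForFile("src/routes/about/index.tsx"));
console.log(routeForFile("src/routes/api/health/index.ts"));
console.log(routeForFile("src/routes/layout.tsx"));
```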
Run this in your root folder:
touch .env.local
Add the following variables to your .env.local file:
OAUTH_CLIENT_ID
HF_TOKEN=X
Please note that if you define the HF_TOKEN, this variable will take priority over OAUTH_CLIENT_ID.
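The precedence rule can be pictured with a small sketch. The `AuthMode` type and the `resolveAuth` function are hypothetical names used only to illustrate the rule; the app's actual handling of these variables may be structured differently.

```typescript
// Hypothetical result type: which credential the app would end up using.
type AuthMode =
  | { kind: "token"; token: string }
  | { kind: "oauth"; clientId: string }
  | { kind: "none" };

// HF_TOKEN is checked first, so it wins whenever both variables are set.
function resolveAuth(env: Record<string, string | undefined>): AuthMode {
  const token = env["HF_TOKEN"];
  const clientId = env["OAUTH_CLIENT_ID"];
  if (token) return { kind: "token", token };
  if (clientId) return { kind: "oauth", clientId };
  return { kind: "none" };
}

// Both variables defined: the token takes priority.
console.log(resolveAuth({ HF_TOKEN: "X", OAUTH_CLIENT_ID: "my-client-id" }).kind);
// Only the OAuth client ID defined: OAuth is used.
console.log(resolveAuth({ OAUTH_CLIENT_ID: "my-client-id" }).kind);
```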
Development mode uses Vite's development server. The dev command will server-side render (SSR) the output during development.
pnpm dev
Note: during dev mode, Vite may request a significant number of .js files. This does not represent a Qwik production build.
The preview command will create a production build of the client modules, a production build of src/entry.preview.tsx, and run a local server. The preview server is only for convenience to preview a production build locally and should not be used as a production server.
pnpm preview
The production build will generate client and server modules by running both client and server build commands. The build command will use TypeScript to run a type check on the source code.
pnpm build
This app has a minimal Express server implementation. After running a full build, you can preview the build using the command:
pnpm serve
Then visit http://localhost:8080/