Data Validation Tools

GuardRails

This is an open-source Python package for specifying the structure and types of large language model (LLM) outputs, and for validating and correcting them.
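The core idea can be shown with a stdlib-only sketch: declare the expected structure and types, then validate an LLM's raw JSON output against it, coercing fields where a simple correction is possible. The schema and helper below are illustrative assumptions, not Guardrails' actual API.

```python
import json

# Hypothetical expected schema: field name -> required Python type.
# Stdlib-only sketch of the validate-and-correct idea, not Guardrails itself.
SCHEMA = {"name": str, "age": int}

def validate_and_correct(raw: str) -> dict:
    """Parse an LLM's JSON output, coercing each field to its expected type."""
    data = json.loads(raw)
    corrected = {}
    for field, expected_type in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # Simple correction step: coerce mistyped values (e.g. "36" -> 36).
        corrected[field] = value if isinstance(value, expected_type) else expected_type(value)
    return corrected

validate_and_correct('{"name": "Ada", "age": "36"}')  # → {'name': 'Ada', 'age': 36}
```

The real library goes further: schemas can carry validators, and failed outputs can be sent back to the LLM for re-asking, not just coerced locally.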

NeMo Guardrails (by NVIDIA)

This is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational applications. Guardrails (or "rails" for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more.
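A rail is essentially a programmable check that sits between the user and the model (or between the model and the user). The sketch below illustrates only the concept with a naive keyword filter; the real toolkit defines rails declaratively in its own Colang language, and the topic list here is a hypothetical policy.

```python
# Stdlib-only sketch of an input "rail": a check applied to the user's
# message before it reaches the LLM. Illustrates the concept only;
# NeMo Guardrails expresses rails in Colang, not plain Python.
BLOCKED_TOPICS = ("politics", "election")  # hypothetical policy

def input_rail(user_message: str) -> str:
    """Refuse messages touching blocked topics; otherwise pass through."""
    lowered = user_message.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return "I'm sorry, I can't discuss that topic."
    return user_message  # allowed through to the LLM
```

Output rails work the same way in the opposite direction, checking the model's reply before it reaches the user.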

TypeChat (by Microsoft)

This is a library that makes it easy to build natural language interfaces using types.
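The "using types" idea is that you declare a type, prompt the model to answer in JSON matching it, and then validate the reply against the declaration. TypeChat itself is TypeScript-first; the dataclass and helper below are a stdlib Python sketch of the same pattern, not TypeChat's API.

```python
from dataclasses import dataclass, fields
import json

# Stdlib sketch of the types-driven pattern: the type declaration doubles
# as the contract the model's JSON reply must satisfy.
@dataclass
class CoffeeOrder:
    drink: str
    size: str
    quantity: int

def parse_as(cls, raw: str):
    """Validate an LLM's JSON reply against the declared field types."""
    data = json.loads(raw)
    for f in fields(cls):
        if not isinstance(data.get(f.name), f.type):
            raise TypeError(f"{f.name} must be {f.type.__name__}")
    return cls(**data)

order = parse_as(CoffeeOrder, '{"drink": "latte", "size": "tall", "quantity": 2}')
```

When validation fails, a system like TypeChat can feed the type error back to the model and ask it to repair its answer.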

OpenAI Function Calling

While careful prompting can produce structured output without function calling, function calling is more reliable at producing output that conforms to a particular format.
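The reliability comes from the contract: each function is described to the model as a JSON Schema, and the model replies with a function name plus JSON arguments matching that schema. The tool definition below follows the OpenAI tools format; the model reply is a stub standing in for a real API response, so no network call is made.

```python
import json

# Function schema in the OpenAI "tools" format: the model sees this
# JSON Schema and is steered to emit arguments that conform to it.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # placeholder implementation

# Stubbed model reply: with function calling, arguments arrive as a JSON
# string ready to be parsed and dispatched to the matching function.
stub_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}
args = json.loads(stub_call["arguments"])
result = get_weather(**args)  # → "Sunny in Paris"
```

In a real application the stub is replaced by the tool-call object returned in the chat completion response, and the function's result is sent back to the model in a follow-up message.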

LangChain Output Parsers

Output parsers are responsible for taking the output of an LLM and transforming it to a more suitable format. This is very useful when you are using LLMs to generate any form of structured data.
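LangChain's output parsers pair two responsibilities: supplying format instructions for the prompt, and parsing the raw completion back into structured data. The class below is a self-contained stdlib sketch mirroring that shape, not one of LangChain's actual parser classes.

```python
# Minimal parser mirroring the two-sided contract of an output parser:
# get_format_instructions() tells the LLM how to answer, and parse()
# turns the raw completion into structured data. Stdlib sketch only.
class CommaSeparatedListParser:
    def get_format_instructions(self) -> str:
        # Appended to the prompt so the model answers in a parseable form.
        return "Respond with a comma-separated list, e.g. `red, green, blue`."

    def parse(self, text: str) -> list[str]:
        # Invert the format: split the completion back into a Python list.
        return [item.strip() for item in text.split(",")]

parser = CommaSeparatedListParser()
parser.parse("red, green, blue")  # → ['red', 'green', 'blue']
```

Because both halves live in one object, the same parser can be swapped in wherever a different output format is needed without rewriting the prompt by hand.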