ExLlama
ExLlama is a memory-efficient rewrite of the Hugging Face transformers implementation of LLaMA, designed for use with quantized weights. It enables high-performance NLP tasks on modern GPUs while minimizing memory usage, and it supports various hardware configurations.
You Might Also Like
Sweephy
Sweephy aids finance-related companies in monitoring regulations by pr...
KindlLM
Kindllm is an AI-powered chat app designed for Kindle Paperwhites, off...
Andes
Andes is a marketplace offering diverse large language model APIs fo...
onedollarai.lol
OneDollarAI.lol provides affordable access to advanced large language ...
Oobabooga
The text-generation-webui is a Gradio-based web UI for Large Language ...
Awan LLM
Awan LLM is an AI inference API that offers unlimited token access for...
LLM Answer Engine
LLM-answer-engine is an advanced answer engine leveraging Groq, Mixtra...
LLMWizard
LLMWizard offers access to multiple AI models like GPT-4o and DALL-E 3...
Llama.cpp
Llama.cpp is an open-source tool for efficient inference of large lang...
GPT-4
We've developed GPT-4, a large multimodal model that exhibits human-le...
AutoArena
Autoarena is an open-source platform for evaluating generative AI syst...
Inceptionlabs - Mercury coder
Inception Labs' diffusion-based large language models (dLLMs) offer fa...