Tether, the company behind the widely used USDT stablecoin, has released a family of medical AI models that challenges the industry trend of scaling up parameters for better performance. The new QVAC MedPsy family, developed by Tether’s AI Research Group, is designed to run directly on smartphones, wearables, and edge devices without any cloud dependency.
Key performance highlights: The 1.7-billion-parameter variant scored an average of 62.62 across seven closed medical benchmarks, surpassing Google’s MedGemma-1.5-4B-it by 11.42 points. On HealthBench Hard, an OpenAI benchmark that evaluates realistic multi-turn clinical conversations and was developed with 262 physicians, the same 1.7B model also outperformed MedGemma-27B, a model nearly 16 times larger. The 4-billion-parameter version scored 70.54 across the same seven benchmarks, exceeding models roughly seven times its size.
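The size and score comparisons above can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in this article; the variable names are illustrative, and the baseline score it prints is derived from the stated margin rather than reported directly.

```python
# Figures quoted in the article (parameter counts in billions).
medpsy_small_params_b = 1.7
medgemma_large_params_b = 27.0
medpsy_small_avg = 62.62      # average over seven closed medical benchmarks
margin_over_medgemma = 11.42  # stated lead over MedGemma-1.5-4B-it

# How many times larger MedGemma-27B is than the 1.7B model.
size_ratio = medgemma_large_params_b / medpsy_small_params_b

# Baseline average implied by the stated margin (derived, not reported).
implied_baseline_avg = medpsy_small_avg - margin_over_medgemma

print(f"size ratio: {size_ratio:.1f}x")                    # size ratio: 15.9x
print(f"implied baseline avg: {implied_baseline_avg:.2f}")  # implied baseline avg: 51.20
```

The 15.9x result matches the article's "nearly 16 times larger" description.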
Equally notable is the token efficiency: the 4B model generates responses using about 909 tokens, compared to 2,953 for comparable systems, a roughly 3.2x reduction. This lowers compute costs, shortens response times, and makes fully local execution practical, so sensitive patient data can stay on the device or on-premises, which reduces the privacy and compliance exposure (for example under HIPAA) that comes with sending it to the cloud.
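To make the token-efficiency claim concrete, the following sketch recomputes the reduction factor from the two token counts quoted above and estimates generation time under a loudly hypothetical assumption: a fixed decode speed of 20 tokens per second, with time scaling linearly in output tokens. Neither the speed nor the helper function comes from Tether's release.

```python
# Token counts quoted in the article.
tokens_medpsy_4b = 909
tokens_comparable = 2953

# Reduction factor and percentage of tokens saved.
reduction_factor = tokens_comparable / tokens_medpsy_4b
tokens_saved_pct = (1 - tokens_medpsy_4b / tokens_comparable) * 100


def estimate_decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Rough generation time, assuming time is linear in output tokens."""
    return num_tokens / tokens_per_second


# Hypothetical on-device decode speed; NOT a measured or published figure.
ASSUMED_TOKENS_PER_SECOND = 20.0

print(f"reduction factor: {reduction_factor:.2f}x")
print(f"tokens saved: {tokens_saved_pct:.1f}%")
print(f"est. seconds (4B model): "
      f"{estimate_decode_seconds(tokens_medpsy_4b, ASSUMED_TOKENS_PER_SECOND):.1f}")
print(f"est. seconds (comparable): "
      f"{estimate_decode_seconds(tokens_comparable, ASSUMED_TOKENS_PER_SECOND):.1f}")
```

The computed factor is about 3.25, which the article reports as 3.2x. Under the linear-time assumption, the same factor carries over to response latency at any fixed decode speed.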
Both models are released in quantized GGUF format (1.2 GB for the 1.7B model and 2.6 GB for the 4B model), which retains most of the benchmark performance while fitting on standard consumer hardware. CEO Paolo Ardoino emphasized the privacy and practical benefits: “You can run medical reasoning where the data already exists, inside a hospital system or on a device, without moving sensitive information through the cloud or waiting on external processing.”
The release follows Tether’s earlier QVAC SDK for local AI app development and the QVAC Health consumer app. With the medical AI market projected to surpass $500 billion by 2033, Tether’s efficiency-first approach positions it as a notable player in on-device healthcare AI.