Posts

Showing posts with the label "memory reduction"

Benchmarking Dynamic Quantization for Larger Language Models