Running llama2.c on Milk V Duo. Output: "One day, there was little computer small and mighty horse. He wanted to perform a magical trick for all of his friends, but nothing seemed to work. The horse tried and tried, but he just couldn't make it go. Then one day, a little girl came into the room. She saw the horse trying to perform a special trick. "Hello little horse!" the girl said. "I can perform a spin!" The horse was so excited that he started to spin around and around in circles. But, nothing else happened. "It's okay, little one," the girl said. "I'm just having a good time." The horse was relieved. He was able to perform the big trick that the girl was trying to show. He was so relieved. The girl smiled and said, "Let's do this again tomorrow. Maybe that will make you feel mighty too!" And the horse smiled and kept on performing his trick until his tired bellies were so tired that he fell asleep! achieved tok/s: 0.253691"
I just got llama2.c running on the Milk V Duo, compiled with the official Milk V toolchain. I used the smallest model, stories15M, and generation took roughly 10 minutes (I didn't time it).
However, this is only running on the CPU; with the built-in NPU we […]
[Original post on mastodon.social]