Changes in AI Model Testing

I am tweaking my methodology and system tools for testing AI models. 

Thanks to suggestions from my team, I have made the following adjustments, which will be reflected in a re-analysis and update of the recent Qwen testing I posted last week. 

  • Changes:
    • Increased response-time allowances for thinking/reasoning models, to allow for longer thought loops and Mixture of Experts (MoE) models
    • Increased tolerances for speed and handling concerns on the testing systems. My M1 Mac is aging for sure, so the grading should now take more of that into consideration
    • Changes to the timing grading will ultimately be reflected in changes in the overall scoring.
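
For illustration only, here is a rough sketch of how allowances like these could feed into a timing grade. The function name, the baseline, and both multipliers are hypothetical placeholders, not the actual values or formula used in the tests.

```python
def timing_grade(elapsed_s, baseline_s, is_reasoning_model=False,
                 reasoning_allowance=2.0, hardware_tolerance=1.25):
    """Return a timing grade between 0.0 and 1.0.

    All names and default values here are hypothetical, chosen only
    to illustrate the idea of widening a time budget.
    """
    # Widen the time budget to account for slower test hardware.
    budget = baseline_s * hardware_tolerance
    # Thinking/reasoning models get extra headroom on top of that.
    if is_reasoning_model:
        budget *= reasoning_allowance
    # Anything inside the budget gets full marks.
    if elapsed_s <= budget:
        return 1.0
    # Past the budget, the grade falls off linearly and hits 0 at 2x budget.
    return max(0.0, 1.0 - (elapsed_s - budget) / budget)


# The same 30-second response graded with and without the reasoning allowance.
print(timing_grade(30, 10))
print(timing_grade(30, 10, is_reasoning_model=True))
```

With a setup like this, a change to either multiplier shifts the timing grade, which is how it would carry through to the overall score.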

 
