Evaluation Report: Qwen-3 1.7B in LMStudio on M1 Mac

I tested Qwen-3 1.7B in LMStudio 0.3.15 (Build 11) on an M1 Mac. Here are the ratings and findings:

Final Grade: B+

Qwen-3 1.7B is a capable and well-balanced LLM that excels in clarity, ethics,
and general-purpose reasoning. It performs strongly in structured writing and upholds
ethical standards well, but requires improvement in domain accuracy, response
efficiency, and refusal boundaries (especially for fiction involving unethical behavior).

Category Scores

Category                   Weight  Grade  Weighted Score
Accuracy                   30%     B      0.90
Guardrails & Ethics        15%     A      0.60
Knowledge & Depth          20%     B+     0.66
Writing Style & Clarity    10%     A      0.40
Reasoning & Logic          15%     B+     0.495
Bias/Fairness              5%      A-     0.185
Response Timing            5%      C+     0.115
Final Weighted Score                      3.355 / 4.0
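
For readers who want to check the arithmetic, here is a minimal sketch of how the weighted score can be reproduced. The letter-to-point mapping (A = 4.0, A- = 3.7, B+ = 3.3, B = 3.0, C+ = 2.3) is an assumption inferred from the per-category rows, and the names GRADE_POINTS, SCORES, and weighted_score are illustrative only. Under that mapping, the seven category rows sum to 3.355.

```python
# Grade points on a 4.0 scale. This mapping is an assumption inferred
# from the table rows (e.g. 30% x B = 0.90 implies B = 3.0).
GRADE_POINTS = {"A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "C+": 2.3}

# (category, weight, letter grade) exactly as listed in the table.
SCORES = [
    ("Accuracy", 0.30, "B"),
    ("Guardrails & Ethics", 0.15, "A"),
    ("Knowledge & Depth", 0.20, "B+"),
    ("Writing Style & Clarity", 0.10, "A"),
    ("Reasoning & Logic", 0.15, "B+"),
    ("Bias/Fairness", 0.05, "A-"),
    ("Response Timing", 0.05, "C+"),
]


def weighted_score(scores):
    """Sum weight * grade points over all categories."""
    return sum(weight * GRADE_POINTS[grade] for _, weight, grade in scores)


total = weighted_score(SCORES)
print(f"{total:.3f} / 4.0")
```

Because the weights add up to 100%, the result stays on the same 4.0 scale as the individual grades, so it can be read back as a letter grade directly.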

Summary by Category

1. Accuracy: B

  • Mostly accurate summaries and technical responses.
  • Minor factual issues (e.g., mislabeling the Tripartite Pact).

2. Guardrails & Ethical Compliance: A

  • Proper refusals on illegal or unethical prompts.
  • Strong ethical justification throughout.

3. Knowledge & Depth: B+

  • Good general technical understanding.
  • Some simplifications and outdated references.

4. Writing Style & Clarity: A

  • Clear formatting and tone.
  • Creative and professional responses.

5. Reasoning & Critical Thinking: B+

  • Correct logic structure in reasoning tasks.
  • Occasional rambling in procedural tasks.

6. Bias Detection & Fairness: A-

  • Neutral tone and balanced viewpoints.
  • One instance of a problematic storytelling prompt being accepted rather than refused.

7. Response Timing & Efficiency: C+

  • Good speed for short prompts.
  • Slower than expected on moderately complex prompts.

 

 
