Notes — Multi-modal LLMs for Medical Imaging

Published:

Key Ideas

  • Vision-language alignment needs clinical-context supervision.
  • Reporting-style training targets may improve factual consistency.

Questions

  • How can structured radiology priors be injected without overfitting?
  • What is the right balance between image and text token budgets?
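
A rough way to reason about the token-budget question is to count how many tokens the images consume and see what is left for text. The sketch below assumes a ViT-style encoder that emits one token per non-overlapping square patch, with no class token, pooling, or resampling. The function names, the 4096-token context length, and the image and patch sizes are illustrative assumptions, not values from any particular model.

```python
def image_tokens(height: int, width: int, patch: int, n_images: int = 1) -> int:
    """Count tokens for n_images, assuming one token per non-overlapping patch."""
    if height % patch or width % patch:
        raise ValueError("image dimensions must be divisible by the patch size")
    return n_images * (height // patch) * (width // patch)


def text_budget(context_len: int, height: int, width: int,
                patch: int, n_images: int = 1) -> int:
    """Tokens left for text after the image tokens are placed in the context."""
    used = image_tokens(height, width, patch, n_images)
    if used > context_len:
        raise ValueError("image tokens alone exceed the context length")
    return context_len - used


# Hypothetical setting: two 448x448 views, 14-pixel patches, 4096-token context.
print(image_tokens(448, 448, 14, n_images=2))       # 2 * 32 * 32 = 2048
print(text_budget(4096, 448, 448, 14, n_images=2))  # 4096 - 2048 = 2048
```

Under these assumptions, doubling the image side length quadruples the image token count, so resolution and number of views trade off directly against the room left for report text.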

References

  • Add citations/links here.