Notes — Multi-modal LLMs for Medical Imaging
Key Ideas
- Vision-language alignment needs supervision that carries clinical context.
- Reporting-style training targets may improve factual consistency.
Questions
- How can structured radiology priors be injected without overfitting?
- What is the right balance between image and text token budgets?
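
To make the token-budget question concrete, here is a minimal sketch of the arithmetic, assuming a ViT-style encoder that emits one token per non-overlapping patch and a single context window shared by image and text tokens. The function names, image size, patch size, and context length are all hypothetical and not taken from any particular model.

```python
def image_token_count(height: int, width: int, patch: int) -> int:
    """Patch tokens for one image, assuming non-overlapping square patches
    and no extra CLS or separator tokens (an assumption of this sketch)."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    return (height // patch) * (width // patch)


def text_budget(context_len: int, n_images: int, tokens_per_image: int) -> int:
    """Tokens left for prompt and report text after reserving image tokens."""
    remaining = context_len - n_images * tokens_per_image
    if remaining < 0:
        raise ValueError("image tokens alone exceed the context length")
    return remaining


# Hypothetical setup: two 448x448 views, 14-pixel patches, 4096-token context.
per_image = image_token_count(448, 448, 14)
left_for_text = text_budget(4096, 2, per_image)
print(per_image, left_for_text)
```

Under these assumed numbers, each view costs 1024 tokens, so two views use half of the context and leave 2048 tokens for text. Raising resolution or adding views shrinks the text budget quickly, which is the trade-off the question is about.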