Notes — Multi-modal LLMs for Medical Imaging
Key Ideas
- Vision-language alignment needs clinical context supervision.
- Reporting-style training targets may improve factual consistency.
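One common way to make the first idea concrete is a CLIP-style symmetric contrastive loss, where each image embedding is pulled toward the embedding of its paired clinical text (for example, a report sentence) and pushed away from the others in the batch. The sketch below is an illustration under that assumption, not the method of any particular paper. The function names, the toy embeddings, and the temperature of 0.07 are all hypothetical choices.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)


def contrastive_alignment_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-to-text and text-to-image contrastive loss.

    image_embs[i] and text_embs[i] are assumed to come from the same study.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    # Cosine similarity matrix, scaled by the temperature.
    logits = [[dot(i, t) / temperature for t in txts] for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy embeddings: two images and two report sentences in a 2-D space.
images = [[1.0, 0.0], [0.0, 1.0]]
matched = contrastive_alignment_loss(images, [[1.0, 0.0], [0.0, 1.0]])
mismatched = contrastive_alignment_loss(images, [[0.0, 1.0], [1.0, 0.0]])
```

With correctly paired embeddings the loss is close to zero, and swapping the pairs makes it large, which is the signal that drives alignment during training.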
Short description of portfolio item number 1
Short description of portfolio item number 2 
Published in Journal of Physics: Conference Series, 2023
A comparative analysis of LXMERT and UNITER models for visual question answering tasks.
Recommended citation: Zhicheng He, Yuanzhi Li, Dingming Zhang. (2023). "Transformer-based visual question answering model comparison." Journal of Physics: Conference Series, 2646(1), 012031.
Published in Biomedical Engineering and Computational Biology, 2024
Fine-tuning MedSAM for impacted tooth segmentation in X-ray images to aid dental diagnoses.
Recommended citation: Zhicheng He, Yipeng Wang, Xiao Li. (2024). "Deep Learning-Based Detection of Impacted Teeth on Panoramic Radiographs." Biomedical Engineering and Computational Biology, 15, 11795972241288319.
Published in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024
A unified illumination correction solution for wireless capsule endoscopy (WCE) using an end-to-end promptable diffusion transformer (DiT) model.
Recommended citation: Long Bai, Tong Chen, Qiaozhi Tan, Wan Jun Nah, Yanheng Li, Zhicheng He, Sishen Yuan, Zhen Chen, Jinlin Wu, Mobarakol Islam, Zhen Li, Hongbin Liu, Hongliang Ren. (2024). "EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy." International Conference on Medical Image Computing and Computer-Assisted Intervention, Pages 296-306.
Published in 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2024
A graph self-supervised learning framework utilizing graph matching for molecular property prediction.
Recommended citation: Hongxiang Lin, Yixiao Zhou, Huiying Hu, Zhicheng He, Runzhi Wu, Xiaoqing Lyu. (2024). "Graph Matching Based Graph Self-Supervised Learning for Molecular Property Prediction." 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Pages 7092-7094.
Published in International Conference on Document Analysis and Recognition (ICDAR), 2025
Paper2PPT framework for generating visual presentations of scientific papers with cross-modal alignment.
Recommended citation: Huiying Hu, Zhicheng He, Yixiao Zhou, Tongwei Zhang, Xiaoqing Lyu. (2025). "Multimodal Content Alignment with LLM for Visual Presentation of Papers." International Conference on Document Analysis and Recognition, Pages 238-256.
Published in International Conference on Document Analysis and Recognition (ICDAR), 2025
A self-prompted segmentation framework for scientific illustrations using SAM-based methods.
Recommended citation: Tuo Wang, Yixiao Zhou, Tongwei Zhang, Zhicheng He, Yumeng Zhao, Xiaoqing Lyu. (2025). "SSSI: Self-prompted Segmentation of Scientific Illustrations." International Conference on Document Analysis and Recognition, Pages 347-361.
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
This is a description of your conference proceedings talk. Note the different value in the type field. You can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.