Posts by Collection

notes

Notes — Multi-modal LLMs for Medical Imaging

Published: January 05, 2025

Key Ideas

Vision-language alignment needs clinical context supervision.
Reporting-style training targets may improve factual consistency.

1. Core Ideas

progress

Goals

Define Q1 research focus in medical image analysis.
Survey recent multi-modal LLM papers relevant to clinical imaging.

publications

Transformer-based visual question answering model comparison

Published in Journal of Physics: Conference Series, 2023

A comparative analysis of LXMERT and UNITER models for visual question answering tasks.

Recommended citation: Zhicheng He, Yuanzhi Li, Dingming Zhang. (2023). "Transformer-based visual question answering model comparison." Journal of Physics: Conference Series, 2646(1), 012031.

Deep Learning-Based Detection of Impacted Teeth on Panoramic Radiographs

Published in Biomedical Engineering and Computational Biology, 2024

Fine-tuning MedSAM for impacted tooth segmentation in X-ray images to aid dental diagnoses.

Recommended citation: He Zhicheng, Wang Yipeng, Li Xiao. (2024). "Deep Learning-Based Detection of Impacted Teeth on Panoramic Radiographs." Biomedical Engineering and Computational Biology, 15, 11795972241288319.

EndoUIC: Promptable diffusion transformer for unified illumination correction in capsule endoscopy

Published in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024

A unified illumination correction solution for wireless capsule endoscopy (WCE), built on an end-to-end promptable diffusion transformer (DiT) model.

Recommended citation: Long Bai, Tong Chen, Qiaozhi Tan, Wan Jun Nah, Yanheng Li, Zhicheng He, Sishen Yuan, Zhen Chen, Jinlin Wu, Mobarakol Islam, Zhen Li, Hongbin Liu, Hongliang Ren. (2024). "EndoUIC: Promptable diffusion transformer for unified illumination correction in capsule endoscopy." International Conference on Medical Image Computing and Computer-Assisted Intervention, Pages 296-306.

Graph Matching Based Graph Self-Supervised Learning for Molecular Property Prediction

Published in 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2024

A graph self-supervised learning framework utilizing graph matching for molecular property prediction.

Recommended citation: Hongxiang Lin, Yixiao Zhou, Huiying Hu, Zhicheng He, Runzhi Wu, Xiaoqing Lyu. (2024). "Graph Matching Based Graph Self-Supervised Learning for Molecular Property Prediction." 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Pages 7092-7094.

Multimodal Content Alignment with LLM for Visual Presentation of Papers

Published in International Conference on Document Analysis and Recognition (ICDAR), 2025

Paper2PPT framework for generating visual presentations of scientific papers with cross-modal alignment.

Recommended citation: Huiying Hu, Zhicheng He, Yixiao Zhou, Tongwei Zhang, Xiaoqing Lyu. (2025). "Multimodal Content Alignment with LLM for Visual Presentation of Papers." International Conference on Document Analysis and Recognition, Pages 238-256.

SSSI: Self-prompted Segmentation of Scientific Illustrations

Published in International Conference on Document Analysis and Recognition (ICDAR), 2025

A self-prompted segmentation framework for scientific illustrations using SAM-based methods.

Recommended citation: Tuo Wang, Yixiao Zhou, Tongwei Zhang, Zhicheng He, Yumeng Zhao, Xiaoqing Lyu. (2025). "SSSI: Self-prompted Segmentation of Scientific Illustrations." International Conference on Document Analysis and Recognition, Pages 347-361.

Incentivizing DINOv3 Adaptation for Medical Vision Tasks via Feature Disentanglement

Published in Medical Imaging with Deep Learning (MIDL), 2026

A task-oriented feature disentanglement framework (DINOv3-FD) for parameter-efficient adaptation of DINOv3 to medical vision tasks.

Recommended citation: Zhicheng He, Yibing Fu, Yueming Jin. (2026). "Incentivizing DINOv3 Adaptation for Medical Vision Tasks via Feature Disentanglement." Medical Imaging with Deep Learning.

MedVAR: Towards Scalable and Efficient Medical Image Generation via Next-scale Autoregressive Prediction

Published in arXiv preprint arXiv:2602.14512, 2026

An autoregressive foundation model for scalable medical image generation with next-scale prediction.

Recommended citation: Zhicheng He, Yunpeng Zhao, Junde Wu, Ziwei Niu, Zijun Li, Bohan Li, Lanfen Lin, Yueming Jin. (2026). "MedVAR: Towards Scalable and Efficient Medical Image Generation via Next-scale Autoregressive Prediction." arXiv preprint arXiv:2602.14512.
