Data Science and Analytics (Muhammad Imran)
Artificial Intelligence for Cancer Detection: Processing Medical Images and Generating Clinical Reports
-
Cancer remains one of the most pressing health challenges of our time, and early detection is often the key to saving lives. Pathologists traditionally diagnose cancer by carefully examining histopathological slides under a microscope, but this process is time-consuming, labor-intensive, and sometimes subjective. In today's era of Artificial Intelligence (AI), we have an extraordinary opportunity to use computers to assist doctors by analyzing medical images faster and more consistently.
This project invites first-year students to participate in an exciting and meaningful research experience where they will work with histopathological images, which are high-resolution pictures of tissue samples used to detect cancer. Students will learn how to process these images using modern AI tools, focusing on computer vision methods that allow machines to "see" and recognize cancerous patterns. They will also explore Natural Language Processing (NLP), which is the field of AI that enables computers to generate understandable written reports from complex data. The ultimate goal is to build systems that can not only detect cancerous regions in medical images but also generate clear, human-readable summaries to support doctors and patients.
To ground the research in real-world impact, students will study multiple cancer types, such as breast cancer, colorectal cancer, lung cancer, and prostate cancer. For example, the BreakHis dataset provides nearly 8,000 breast tumor images (benign and malignant) across different magnifications; the NCT-CRC-HE-100K dataset includes 100,000 image patches of colorectal cancer and normal tissue; and The Cancer Genome Atlas (TCGA) hosts whole-slide images for lung and prostate cancers along with clinical data. Students will work with these publicly available resources, gaining hands-on experience with authentic medical datasets used by researchers worldwide.
Resources will include high-performance computing systems, Python-based AI libraries
(PyTorch, TensorFlow, Scikit-learn), and specialized medical imaging tools (3D Slicer,
SimpleITK, etc.).
By the end of this project, students will not only contribute to an important area of healthcare research but also develop transferable skills in data science, programming, image analysis, and scientific communication. This research combines three of today's most impactful areas (AI, medical image processing, and NLP), giving students a truly interdisciplinary experience at the forefront of innovation.
|
-
Students participating in this project will gain valuable technical, analytical, and
professional skills. Specifically, they will:
- Programming & Data Science: Learn the fundamentals of Python programming and how to use AI libraries (e.g., PyTorch,
TensorFlow, Scikit-learn) for data analysis.
- Medical Image Processing: Understand how to handle high-resolution histopathological images, preprocess data,
and apply computer vision techniques for cancer detection.
- Artificial Intelligence & Machine Learning: Gain exposure to supervised learning, convolutional neural networks (CNNs), and methods
for model evaluation (accuracy, sensitivity, specificity).
- Natural Language Processing (NLP): Learn how AI systems generate clinical-style text reports, bridging raw computational
output with human-friendly language.
- Ethics & Interdisciplinary Thinking: Reflect on the ethical challenges of AI in medicine, including privacy, bias, and
the importance of human oversight.
- Research Communication: Develop the ability to explain findings clearly through written reports, oral presentations,
and visual posters.
These outcomes are directly transferable to careers in healthcare, data science, computer
science, and biomedical research. More importantly, students will leave the program
with the confidence that they can tackle complex, real-world problems using computational
methods, which is an empowering outcome for first-year scholars beginning their academic
journey.
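As an illustration of the model evaluation measures named above (accuracy, sensitivity, specificity), the following is a minimal sketch in plain Python. It assumes binary labels where 1 means malignant and 0 means benign. The function names and the small label lists are hypothetical examples, not project data or a prescribed implementation.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes for binary labels (assumed: 1 = malignant, 0 = benign).

    Returns (true positives, true negatives, false positives, false negatives).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity from binary labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    nan = float("nan")  # returned when a ratio is undefined (empty denominator)
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Fraction of malignant samples the model caught.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of benign samples the model correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Made-up labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Sensitivity and specificity are reported separately because a missed malignant sample and a false alarm on a benign sample carry different costs, which accuracy alone does not show.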
|
-
Students will engage in weekly, hands-on research activities under faculty guidance.
Typical duties will include:
- Learning and Training: Completing guided tutorials on Python, AI libraries, and image processing methods.
- Data Handling: Downloading, organizing, and preprocessing histopathological image datasets (e.g.,
resizing, normalization, and annotation review).
- AI Model Development: Training and testing machine learning models for cancer detection (e.g., distinguishing
benign vs. malignant breast tumors using BreakHis data).
- NLP Report Generation: Assisting in building simple systems that convert AI outputs into text-based summaries
of cancer findings.
- Critical Analysis: Reviewing model results, identifying errors, and discussing strategies for improvement.
- Team Meetings: Participating in weekly research group discussions to share progress, troubleshoot
challenges, and plan next steps.
- Communication: Preparing short reports or presentations summarizing weekly progress and results.
By the end of each semester, students will contribute to a collective research report and potentially present findings at KSU's Undergraduate Research Showcase or similar venues.
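To make the NLP report generation duty more concrete, here is a minimal sketch of one simple approach: filling a fixed text template from a classifier's output. This is a template-based stand-in, not a learned language model, and it is not necessarily the method the project will use. The function name, its parameters, the 0.5 threshold, and the example image identifier are all hypothetical.

```python
def summarize_prediction(image_id, probability, threshold=0.5, magnification=None):
    """Turn a (hypothetical) classifier output into a short text summary.

    probability: the model's estimated probability that the sample is malignant.
    threshold:   assumed decision cutoff; 0.5 is an illustrative default.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")

    # Map the numeric score to a label and a confidence for that label.
    label = "malignant" if probability >= threshold else "benign"
    confidence = probability if label == "malignant" else 1.0 - probability

    # Mention the magnification only when it is known.
    mag_text = f" at {magnification} magnification" if magnification else ""

    return (
        f"Image {image_id}{mag_text}: the model classified this sample as "
        f"{label} (estimated confidence {confidence:.0%}). "
        "This automated summary is intended to support review by a pathologist."
    )


# Made-up values for illustration only.
report = summarize_prediction("example-001", 0.91, magnification="40X")
print(report)
```

A fixed template keeps every sentence traceable to a specific model output, which makes errors easier to spot during the critical analysis step.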
|
-
Modality (Face to Face, Hybrid, Online)
-
|