Hello, and thank you for the opportunity to introduce my research experience.
1. Urban Spatial Structure Identification Based on Multi-source Data Fusion
Joint Training at the Institute of Geographic Sciences and Natural Resources Research, CAS (Sep 2024 – Jan 2025)
This project addressed the limitations of traditional remote sensing methods in detecting urban-rural transitional zones. Our goal was to build a nationwide map of urban spatial evolution from 2012 to 2023 to support China's new-type urbanization strategy.
In this work, I collected and processed over 200 GB of heterogeneous geospatial data covering China, including NPP-VIIRS nighttime light (NTL) data, POI distributions, NDVI, and impervious surface area (ISA), using Python, Google Earth Engine, RStudio, and ArcGIS Pro.
To integrate these datasets, I used Python with GDAL for large-scale batch processing, unified their spatial resolution and projection, and designed a hierarchical normalization strategy to handle scale discrepancies. For example, I applied layered threshold normalization to the nighttime light data and kernel density estimation to the POI data.
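To illustrate the threshold idea, here is a minimal sketch of one normalization layer, assuming it means clipping radiance to a lower and an upper threshold and rescaling the result to [0, 1]. It is not the project's code: the function name `clip_and_scale`, the sample values, and the thresholds are all hypothetical, and the actual layering scheme is not specified here.

```python
import numpy as np


def clip_and_scale(values, low, high):
    """Clip values to [low, high], then rescale linearly to [0, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    arr = np.asarray(values, dtype=float)
    clipped = np.clip(arr, low, high)
    return (clipped - low) / (high - low)


# Hypothetical radiance values and thresholds, for illustration only.
ntl = np.array([[0.2, 1.0, 5.0], [20.0, 60.0, 300.0]])
ntl_norm = clip_and_scale(ntl, low=1.0, high=60.0)
print(ntl_norm)
```

Clipping before scaling keeps a few very bright pixels from compressing the rest of the range, which is one common reason to use thresholds rather than a plain min-max rescale.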
I also reproduced and extended the method from the paper "An Unsupervised Urban Extent Extraction Method from NPP-VIIRS Nighttime Light Data". I implemented two key components: spatial context-constrained clustering and adaptive directional filtering. These improved the accuracy of urban extent extraction. We constructed a 12-year nationwide urban area dataset and validated it against the GAIA benchmark, achieving a Kappa coefficient between 0.76 and 0.82.
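For reference, the Kappa coefficient used in this kind of validation compares observed agreement with the agreement expected by chance. The sketch below computes it for two label sequences in plain Python; the function name `cohen_kappa` and the toy urban/non-urban labels are made up for illustration and are not project data.

```python
def cohen_kappa(predicted, reference):
    """Cohen's Kappa for two equal-length sequences of class labels."""
    predicted = list(predicted)
    reference = list(reference)
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(predicted)
    labels = set(predicted) | set(reference)

    # Fraction of samples on which the two label sets agree.
    observed = sum(p == r for p, r in zip(predicted, reference)) / n
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum(
        (predicted.count(c) / n) * (reference.count(c) / n) for c in labels
    )
    if expected == 1.0:
        raise ValueError("Kappa is undefined when both inputs hold a single class")
    return (observed - expected) / (1.0 - expected)


# Toy labels: 1 = urban, 0 = non-urban (illustrative only).
predicted = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reference = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]
kappa = cohen_kappa(predicted, reference)
print(round(kappa, 3))
```

In a raster setting, the two sequences would be the flattened pixel labels of the extracted map and the benchmark over the same grid.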
2. A BIKE Weak Key Identification Scheme Based on Ensemble Learning
Submitted to IEEE TIFS (CCF-A Level)
This project addresses the security risks posed by weak keys in the BIKE post-quantum cryptographic algorithm. Existing approaches lack systematic prevention mechanisms and underutilize the potential of AI in cryptanalysis. Our work proposes an ensemble learning-based scheme for weak key identification.
My main contributions include designing the feature engineering module. I independently implemented code to extract structural features such as inter-set distances, local similarity, and block distribution.
For feature selection, I built a hybrid pipeline combining SelectKBest and Random Forest RFE to reduce feature dimensionality and improve model performance.
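One way such a hybrid pipeline can be arranged is a cheap univariate filter followed by a model-based wrapper. The sketch below uses scikit-learn on synthetic data and assumes this filter-then-RFE ordering; the ordering, the `f_classif` score function, and every parameter value are illustrative assumptions rather than the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the extracted key features (not real BIKE data).
X, y = make_classification(
    n_samples=200, n_features=30, n_informative=6, random_state=0
)

selector = Pipeline(
    [
        # Stage 1: keep the 15 features with the highest ANOVA F-scores.
        ("filter", SelectKBest(score_func=f_classif, k=15)),
        # Stage 2: recursively drop the least important of those features,
        # ranked by random forest importance, until 8 remain.
        (
            "rfe",
            RFE(
                RandomForestClassifier(n_estimators=50, random_state=0),
                n_features_to_select=8,
            ),
        ),
    ]
)

X_reduced = selector.fit_transform(X, y)
print(X_reduced.shape)
```

Running the filter first shrinks the search space, so the more expensive recursive elimination has fewer candidate features to refit on.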
Finally, I developed the BIKE-StackNetIS framework, which integrates multiple base classifiers (including KNN, SVM, AdaBoost, and Decision Trees) and uses XGBoost as a meta-learner for second-layer integration.
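The two-layer structure described above can be sketched with scikit-learn's `StackingClassifier`. This is not the BIKE-StackNetIS implementation: the data is synthetic, the hyperparameters are placeholders, and `GradientBoostingClassifier` stands in for XGBoost so that the example needs only scikit-learn. With the xgboost package installed, `xgboost.XGBClassifier` could be passed as `final_estimator` instead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data standing in for weak-key / normal-key samples.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# First layer: the four base classifier families named in the text.
base_learners = [
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("svm", SVC()),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
]

# Second layer: a boosted-tree meta-learner trained on out-of-fold
# predictions of the base learners (5-fold cross-validation).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=GradientBoostingClassifier(random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base learners have already seen, limits leakage between the two layers.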
These two projects reflect my skills in geospatial data analysis, unsupervised learning, and applying machine learning to real-world problems. I'm passionate about combining domain knowledge with computational methods to solve complex scientific and security challenges.
1. Urban Spatial Structure Identification Based on Multi-Source Data Fusion
Joint Training at the Institute of Geographic Sciences and Natural Resources Research, CAS
Sep. 2024 – Jan. 2025
This project identifies urban spatial structures across China from 2012 to 2023 by fusing multiple types of geographic data, with the goal of supporting China's new-type urbanization strategy.
- I collected and organized over 200 GB of nationwide data using Python, GEE, ArcGIS Pro, and RStudio. The data includes nighttime light (NTL), POI, NDVI, and impervious surface area (ISA).
- I processed the data using Python and GDAL, unifying spatial resolution and coordinate systems. To reconcile the different data scales, I designed a layered normalization strategy that applies threshold-based normalization to NTL and kernel density analysis to POI.
- I reproduced and improved a method from the paper "An Unsupervised Urban Extent Extraction Method from NPP-VIIRS Nighttime Light Data." I implemented spatial context-constrained clustering and an adaptive directional filtering algorithm to extract urban areas. The final urban boundary dataset agrees well with the GAIA dataset, with a Kappa coefficient between 0.76 and 0.82.
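As an illustration of the kernel density step mentioned in the list above, the sketch below turns scattered POI coordinates into a continuous density surface on a regular grid using a Gaussian kernel. The function name `poi_kernel_density`, the coordinates, the grid, and the bandwidth are hypothetical; the project itself may have used a GIS tool or a different kernel.

```python
import numpy as np


def poi_kernel_density(points, grid_x, grid_y, bandwidth):
    """Gaussian kernel density of 2-D points evaluated on a regular grid.

    Returns an array of shape (len(grid_y), len(grid_x)).
    """
    pts = np.asarray(points, dtype=float)      # shape (n, 2): x, y
    gx, gy = np.meshgrid(grid_x, grid_y)       # grid cell coordinates
    dx = gx[..., None] - pts[:, 0]             # shape (ny, nx, n)
    dy = gy[..., None] - pts[:, 1]
    sq_dist = dx**2 + dy**2
    kernel = np.exp(-sq_dist / (2.0 * bandwidth**2))
    kernel /= 2.0 * np.pi * bandwidth**2       # normalize the 2-D Gaussian
    return kernel.sum(axis=-1) / len(pts)      # average over all points


# Hypothetical POI coordinates in projected units (illustrative only).
pois = [(2.0, 2.0), (2.2, 1.9), (1.8, 2.1), (8.0, 8.0)]
grid = np.arange(0.0, 11.0, 1.0)
density = poi_kernel_density(pois, grid, grid, bandwidth=1.0)
peak_row, peak_col = np.unravel_index(density.argmax(), density.shape)
print(grid[peak_col], grid[peak_row])
```

The resulting surface is a raster, so it can be resampled to the same resolution as the NTL, NDVI, and ISA layers before fusion.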
2. A BIKE Weak Key Identification Scheme Based on Ensemble Learning
Submitted to TIFS (CCF-A)
This project aims to improve the security of the post-quantum cryptographic algorithm BIKE by detecting weak keys, which can cause decryption failures.
- I helped design the feature engineering module. I implemented code to extract structural features such as inter-set distance, local similarity, and block distribution.
- I built a hybrid feature selection method combining SelectKBest and random forest-based RFE. This reduced the feature dimensionality and improved the model's accuracy.
- I implemented the BIKE-StackNetIS model. It combines base classifiers like KNN, SVM, AdaBoost, and decision trees. Their outputs are then fed into an XGBoost meta-classifier to complete the final prediction.