Title
Improving point of entry database through data science: Deep learning-based identification of unmapped roads using remote sensing images
Author
Saraiva, Matilde Soares
Abstract
pt (English translation)
Using satellite imagery and deep learning techniques, this Dissertation aims to
identify intersections between roads and land borders, with a view to automating
the process of updating the Points of Entry database, originally developed by the
International Organization for Migration's "COVID-19 Impact on Points of Entry"
program.
Using Angola as the study area, the Dissertation proposes an approach based on
image classification. To that end, satellite images of Angola were first extracted
from ArcGIS Pro and the corresponding labels were created manually. Training and
test datasets were then created, with the images divided into 64 × 64 pixel pieces.
The test set contains exclusively images from Angola's border area, while the
training set includes images from inside the country's boundaries.
Six architectures based on Convolutional Neural Networks (CNN) were created, and
models pre-trained on ImageNet data (MobileNetV1 and ResNet50) were used, in order
to investigate the best approach. Several experiments were carried out with each
architecture.
The best-performing model is based on a custom CNN composed of two blocks with two
convolutional layers and one pooling layer. It correctly identifies 47 Points of
Entry, improving the database from 7 to 47 points. The integration of data science
methods with satellite imagery aims to provide a more automated approach to
identifying new land Points of Entry, relevant for strengthening the capacity of
humanitarian and governmental organizations in monitoring, decision-making, and
crisis response.
en
This Dissertation proposes an end-to-end framework for map creation using a data science
approach to address the data gap identified in the International Organization for
Migration’s (IOM) COVID-19 Impact on Points of Entry program. Leveraging satellite
imagery and deep learning techniques, the study aims to identify relevant nodes within
complex networks of land roads and borders to augment the Points of Entry database,
focusing on Angola as a proof-of-concept area.
Initial tasks involve data collection, namely satellite imagery extraction from ArcGIS
Pro and the manual creation of corresponding ground truth data. Subsequently, training
and test datasets are prepared, with images divided into 64 × 64 pixel tiles. The test
dataset exclusively comprises data from Angola’s border area, while the training dataset
includes images from within the country’s boundaries.
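As a minimal sketch of the tiling step described above, the following Python computes the pixel boxes of non-overlapping 64 × 64 tiles for an image of a given size. The function name `tile_boxes`, the row-by-row ordering, and the choice to drop partial tiles at the right and bottom edges are assumptions made for illustration; the abstract does not say how edge pixels were handled.

```python
from typing import Iterator, Tuple

# (left, top, right, bottom) in pixels, right/bottom exclusive
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int = 64) -> Iterator[Box]:
    """Yield non-overlapping tile x tile boxes, row by row.

    Assumption (not stated in the source): partial tiles at the right
    and bottom edges are dropped rather than padded.
    """
    if tile <= 0:
        raise ValueError("tile must be positive")
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            yield (left, top, left + tile, top + tile)


# Hypothetical 200 x 130 pixel image: 3 full columns and 2 full rows fit.
boxes = list(tile_boxes(200, 130))
print(len(boxes))   # 6
print(boxes[0])     # (0, 0, 64, 64)
print(boxes[-1])    # (128, 64, 192, 128)
```

Each box can then be passed to whatever image library is in use to crop the corresponding tile.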
The Dissertation's training phase comprises two main parts: Custom-Built and
Pre-Trained models. Six custom-built Convolutional Neural Network (CNN) architectures
are designed, alongside experiments that fine-tune two models pre-trained on the
ImageNet dataset (MobileNetV1 and ResNet50).
The best-performing model is a custom-built CNN consisting of two blocks with two
convolutional layers and one pooling layer. It correctly identifies 47 Points of Entry,
expanding the database from 7 to 47 Points of Entry. By integrating data science methods with
satellite imagery, the study aims to provide automated mechanisms for identifying relevant
nodes in complex networks, empowering humanitarian and governmental stakeholders to
monitor, make informed decisions and respond efficiently to emerging challenges.
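To make the shape of the best-performing architecture concrete, the sketch below traces a 64 × 64 input through two blocks of two convolutional layers followed by one pooling layer, and counts the convolutional parameters. Only the block structure and the 64 × 64 input size come from the abstract. The 3 input channels, 3 × 3 kernels, "same" padding, 2 × 2 pooling, and filter counts of 32 and 64 are hypothetical choices made for illustration, and the reading that each block holds two convolutions and one pooling layer is also an assumption.

```python
from typing import Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def conv_params(in_ch: int, out_ch: int, k: int = 3) -> int:
    """Weights plus one bias per output channel for a k x k convolution."""
    return (k * k * in_ch + 1) * out_ch


def block(shape: Shape, filters: int, k: int = 3, pool: int = 2) -> Tuple[Shape, int]:
    """One block: conv -> conv -> pool.

    Assumes 'same' padding, so the convolutions keep height and width,
    and a pool x pool pooling layer that divides both by `pool`.
    """
    h, w, c = shape
    params = conv_params(c, filters, k) + conv_params(filters, filters, k)
    return (h // pool, w // pool, filters), params


shape: Shape = (64, 64, 3)   # 64 x 64 tile; 3 channels is an assumption
total = 0
for filters in (32, 64):     # hypothetical filter counts
    shape, p = block(shape, filters)
    total += p

print(shape)  # (16, 16, 64)
print(total)  # 65568
```

Under these assumptions the feature map entering the classification head is 16 × 16 × 64, and the convolutional part holds about 66 thousand parameters.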